@sientifiko
Last active September 28, 2024 00:34
Code for the Medium post on modeling in Data Engineering
library(tidyverse)
library(haven)
theme_set(theme_bw(base_size = 24))
dat <- read_dta("http://fmwww.bc.edu/ec-p/data/wooldridge/wage1.dta")
# Scatter plot of wage against years of education
dat %>%
  ggplot() +
  aes(educ, wage) +
  geom_jitter() +
  scale_x_continuous(expand = c(0, 0)) +
  labs(x = "Years of education",
       y = "Income")
# Local averages: mean wage at each level of education
dat2 <- dat %>%
  left_join(
    dat %>%
      group_by(educ) %>%
      summarise(avgwage = mean(wage)),
    by = "educ"
  )
dat2 %>%
  ggplot() +
  aes(x = educ, y = wage) +
  geom_jitter() +
  # `linewidth` replaces the deprecated `size` for lines (ggplot2 >= 3.4.0)
  geom_line(aes(y = avgwage), color = "blue", linewidth = 1.2) +
  scale_x_continuous(expand = c(0, 0)) +
  labs(x = "Years of education",
       y = "Income")
# Linear
lm(wage ~ educ, data = dat)
dat2 %>%
  ggplot() +
  aes(x = educ, y = wage) +
  geom_jitter() +
  scale_x_continuous(expand = c(0, 0)) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(x = "Years of education",
       y = "Income")
# Nonlinearities: quadratic polynomial in education
lm(wage ~ poly(educ, 2), data = dat)
dat2 %>%
  ggplot() +
  aes(x = educ, y = wage) +
  geom_jitter() +
  scale_x_continuous(expand = c(0, 0)) +
  # The trailing `+` is required; without it labs() is never added to the plot
  geom_smooth(method = "lm", formula = y ~ poly(x, 2), se = FALSE) +
  labs(x = "Years of education",
       y = "Income")
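# Sketch, not part of the original gist: one possible way to compare the
# linear and quadratic fits above. It assumes `dat` has been loaded as at the
# top of this script. `fit_lin` and `fit_quad` are hypothetical names
# introduced here for illustration only.
fit_lin  <- lm(wage ~ educ, data = dat)
fit_quad <- lm(wage ~ poly(educ, 2), data = dat)
# F-test for nested models: does adding the squared term improve the fit?
anova(fit_lin, fit_quad)
# Adjusted R-squared of each model, side by side
c(linear    = summary(fit_lin)$adj.r.squared,
  quadratic = summary(fit_quad)$adj.r.squared)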