Indicator Variables

Dr. Mine Dogucu

1 / 9

Data: babies in the openintro package (output of glimpse(babies))

## Rows: 1,236
## Columns: 8
## $ case <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, …
## $ bwt <int> 120, 113, 128, 123, 108, 136, 138, 132, 120, 143, 140, 144,…
## $ gestation <int> 284, 282, 279, NA, 282, 286, 244, 245, 289, 299, 351, 282, …
## $ parity <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ age <int> 27, 33, 28, 36, 23, 25, 33, 23, 25, 30, 27, 32, 23, 36, 30,…
## $ height <int> 62, 64, 64, 69, 67, 62, 62, 65, 62, 66, 68, 64, 63, 61, 63,…
## $ weight <int> 100, 135, 115, 190, 125, 93, 178, 140, 125, 136, 120, 124, …
## $ smoke <int> 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1,…
2 / 9
y    Response       Birth weight    Numeric
x    Explanatory    Smoke           Categorical
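
A categorical explanatory variable enters a regression as an indicator (0/1) variable. In babies, smoke is already stored this way. As a minimal sketch of the coding itself, assuming a hypothetical character vector smoke_status that is not part of the babies data:

```r
# Hypothetical labels, made up for illustration (not from the babies data)
smoke_status <- c("non-smoker", "smoker", "non-smoker", "smoker", "smoker")

# Indicator variable: 1 if the mother smokes, 0 otherwise
smoke <- as.integer(smoke_status == "smoker")

# Cross-tabulate labels against the 0/1 coding to check the mapping
smoke
table(smoke_status, smoke)
```

The category coded 0 ("non-smoker" here) acts as the reference group in the model.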
3 / 9

Notation

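$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i$$

$\beta_0$ is the y-intercept
$\beta_1$ is the slope
$\epsilon_i$ is the error term (estimated by the residual)
$i = 1, 2, \ldots, n$ indexes the observations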

4 / 9
library(openintro)  # babies data
library(broom)      # tidy()
model_s <- lm(bwt ~ smoke, data = babies)  # smoke is a 0/1 indicator
tidy(model_s)
## # A tibble: 2 x 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 123. 0.649 190. 0.
## 2 smoke -8.94 1.03 -8.65 1.55e-17
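
For readers without broom, base R can produce a similar coefficient table. The sketch below uses a small made-up data frame (toy, not the babies data) and also shows that the test statistic is the estimate divided by its standard error; in the rounded output above the same ratio holds only approximately.

```r
# Toy data, made up for illustration (not the babies data)
toy <- data.frame(
  y = c(10, 12, 11, 7, 8, 6),
  x = c(0, 0, 0, 1, 1, 1)   # 0/1 indicator
)
fit <- lm(y ~ x, data = toy)

# Base R coefficient table: estimate, standard error, t statistic, p-value
coef_table <- coef(summary(fit))
coef_table

# The t statistic is the estimate divided by its standard error
t_by_hand <- coef_table[, "Estimate"] / coef_table[, "Std. Error"]
t_by_hand
```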
5 / 9

$$\hat{y}_i = b_0 + b_1 x_i$$

$$\widehat{\text{bwt}}_i = b_0 + b_1 \, \text{smoke}_i$$

$$\widehat{\text{bwt}}_i = 123 + (-8.94 \, \text{smoke}_i)$$

6 / 9

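Expected bwt for a baby whose mother is a non-smoker

$$\widehat{\text{bwt}}_i = 123 + (-8.94 \, \text{smoke}_i)$$

$$\widehat{\text{bwt}}_i = 123 + (-8.94 \times 0)$$

$$\widehat{\text{bwt}}_i = 123$$

$$E[\text{bwt}_i \mid \text{smoke}_i = 0] = b_0$$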

7 / 9

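Expected bwt for a baby whose mother is a smoker

$$\widehat{\text{bwt}}_i = 123 + (-8.94 \, \text{smoke}_i)$$

$$\widehat{\text{bwt}}_i = 123 + (-8.94 \times 1)$$

$$\widehat{\text{bwt}}_i = 114.06$$

$$E[\text{bwt}_i \mid \text{smoke}_i = 1] = b_0 + b_1$$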
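
With a single 0/1 predictor, the fitted intercept and slope correspond to sample group means. The sketch below checks this on a small made-up data frame (toy, not the babies data): the intercept matches the mean of the group coded 0, and intercept plus slope matches the mean of the group coded 1.

```r
# Toy data, made up for illustration (not the babies data)
toy <- data.frame(
  y = c(10, 12, 11, 7, 8, 6),
  x = c(0, 0, 0, 1, 1, 1)   # 0/1 indicator
)
fit <- lm(y ~ x, data = toy)
b <- coef(fit)

# Sample means within each group
mean_x0 <- mean(toy$y[toy$x == 0])
mean_x1 <- mean(toy$y[toy$x == 1])

# b0 equals the mean of y when x = 0
all.equal(unname(b["(Intercept)"]), mean_x0)
# b0 + b1 equals the mean of y when x = 1
all.equal(unname(b["(Intercept)"] + b["x"]), mean_x1)
```

So the slope on an indicator variable can be read as the difference between the two group means.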

8 / 9
confint(model_s)
## 2.5 % 97.5 %
## (Intercept) 121.77391 124.320430
## smoke -10.96413 -6.911199

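Note that the 95% confidence interval for the "slope" (about -10.96 to -6.91) does not contain 0, and every value in the interval is negative, so the data are consistent with a lower expected birth weight for babies whose mothers smoke.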
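
As a rough illustration of where such an interval comes from, the sketch below rebuilds the 95% interval for the slope by hand as estimate ± t* × standard error and compares it with confint(). It uses a small made-up data frame (toy, not the babies data), so the numbers differ from those above.

```r
# Toy data, made up for illustration (not the babies data)
toy <- data.frame(
  y = c(10, 12, 11, 7, 8, 6),
  x = c(0, 0, 0, 1, 1, 1)   # 0/1 indicator
)
fit <- lm(y ~ x, data = toy)

# Estimate and standard error of the slope
est <- coef(summary(fit))["x", "Estimate"]
se  <- coef(summary(fit))["x", "Std. Error"]

# Critical value from the t distribution with residual degrees of freedom
t_star <- qt(0.975, df = df.residual(fit))

# Interval by hand versus confint()
manual_ci <- c(est - t_star * se, est + t_star * se)
manual_ci
confint(fit)["x", ]
```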

9 / 9
