Midsemester survey results

Time spent outside of class

[Figure: survey responses for time spent outside of class]

How challenging?

[Figure: survey responses for "How challenging?"]

One thing you like

  • Integration of computing (Learning R)
  • Group activities
  • Slides online

One thing you'd like to change

  • Have homework due Sunday night.
  • Have homework due Sunday night.
  • Have homework due Sunday night.
  • Have homework due Sunday night.

All homeworks are now due Sunday 11:55 pm

  • Post slides earlier.

Activity 7 Part II

  • Revisit the twins data set from the quiz.
  • Is there evidence that the relationship between IQs differs between the social status groups (intercepts or slopes)?

Model 1: slr

[Figure: plot for Model 1]

Model 1: slr

m1 <- lm(Foster ~ Biological, data = twins)
summary(m1)
## 
## Call:
## lm(formula = Foster ~ Biological, data = twins)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -11.351  -5.731   0.057   4.324  16.353 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   9.2076     9.2999    0.99     0.33    
## Biological    0.9014     0.0963    9.36  1.2e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.73 on 25 degrees of freedom
## Multiple R-squared:  0.778,  Adjusted R-squared:  0.769 
## F-statistic: 87.6 on 1 and 25 DF,  p-value: 1.2e-09
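
As a rough worked reading of this output, plugging the estimates into the fitted line gives the equation below. The input value of 100 is only an illustrative choice and does not come from the activity.

\[
\begin{aligned}
\widehat{Foster} &= 9.2076 + 0.9014 \times Biological \\
\widehat{Foster} \text{ at } Biological = 100 &= 9.2076 + 90.14 \approx 99.3
\end{aligned}
\]

Read this way, each additional IQ point for the biologically raised twin is associated with about 0.90 more points in the predicted IQ of the foster-raised twin.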

Model 2: parallel lines

[Figure: plot for Model 2]

Model 2: parallel lines

m2 <- lm(Foster ~ Biological + Social, data = twins)
summary(m2)
## 
## Call:
## lm(formula = Foster ~ Biological + Social, data = twins)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -14.823  -5.237  -0.111   4.476  13.698 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    -0.608     11.855   -0.05     0.96    
## Biological      0.966      0.107    9.03    5e-09 ***
## Sociallow       6.226      3.917    1.59     0.13    
## Socialmiddle    2.035      4.591    0.44     0.66    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.57 on 23 degrees of freedom
## Multiple R-squared:  0.804,  Adjusted R-squared:  0.778 
## F-statistic: 31.4 on 3 and 23 DF,  p-value: 2.6e-08
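
One way to read the Model 2 output is to write out the line for each social group. This is a sketch that assumes "high" is the reference level, which is what the coefficient names Sociallow and Socialmiddle suggest.

\[
\begin{aligned}
\text{high:} \quad \widehat{Foster} &= -0.608 + 0.966 \times Biological \\
\text{low:} \quad \widehat{Foster} &= (-0.608 + 6.226) + 0.966 \times Biological = 5.618 + 0.966 \times Biological \\
\text{middle:} \quad \widehat{Foster} &= (-0.608 + 2.035) + 0.966 \times Biological = 1.427 + 0.966 \times Biological
\end{aligned}
\]

The three lines share one slope and differ only in their intercepts, which is why the model is called parallel lines.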

Model 3: separate intercepts, separate slopes

[Figure: plot for Model 3]

Model 3: separate intercepts, separate slopes

m3 <- lm(Foster ~ Biological * Social, data = twins)
summary(m3)
## 
## Call:
## lm(formula = Foster ~ Biological * Social, data = twins)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -14.480  -5.248  -0.155   4.582  13.798 
## 
## Coefficients:
##                         Estimate Std. Error t value Pr(>|t|)    
## (Intercept)              -1.8720    17.8083   -0.11     0.92    
## Biological                0.9776     0.1632    5.99    6e-06 ***
## Sociallow                 9.0767    24.4487    0.37     0.71    
## Socialmiddle              2.6881    31.6042    0.09     0.93    
## Biological:Sociallow     -0.0291     0.2446   -0.12     0.91    
## Biological:Socialmiddle  -0.0050     0.3295   -0.02     0.99    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.92 on 21 degrees of freedom
## Multiple R-squared:  0.804,  Adjusted R-squared:  0.757 
## F-statistic: 17.2 on 5 and 21 DF,  p-value: 8.31e-07
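
The same group-by-group reading can be sketched for Model 3, where each group gets its own intercept and its own slope. As before, this assumes "high" is the reference level.

\[
\begin{aligned}
\text{high:} \quad \widehat{Foster} &= -1.8720 + 0.9776 \times Biological \\
\text{low:} \quad \widehat{Foster} &= (-1.8720 + 9.0767) + (0.9776 - 0.0291) \times Biological = 7.2047 + 0.9485 \times Biological \\
\text{middle:} \quad \widehat{Foster} &= (-1.8720 + 2.6881) + (0.9776 - 0.0050) \times Biological = 0.8161 + 0.9726 \times Biological
\end{aligned}
\]

The three estimated slopes (about 0.98, 0.95 and 0.97) are nearly identical, consistent with the large p-values on the interaction terms.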

Conclusions

  • Although the estimated intercepts in this data set decrease with increasing social class, the differences could be due to sampling variability alone rather than an actual difference. I.e., \(\hat{\beta}_2\) and \(\hat{\beta}_3\) (the coefficients on Sociallow and Socialmiddle in Model 2) are not statistically significant.

  • This data set provides no strong indication that the slopes differ between the social groups: the interaction terms in Model 3 have p-values of 0.91 and 0.99.

  • These data are best described by the simple linear regression model (Model 1).

\[ \widehat{Foster} = \hat{\beta}_0 + \hat{\beta}_1 Biological \]
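
The individual t-tests above look at one coefficient at a time. A nested-model F-test is one common way to ask whether the extra Social terms help as a group. The sketch below rebuilds that test from the rounded residual standard errors and degrees of freedom reported in the three summaries, so the results are approximate. The helper name `nested_f` is made up for this illustration.

```r
# Residual standard errors and residual df, copied from the summaries above
# (rounded values, so everything computed from them is approximate).
rse <- c(m1 = 7.73, m2 = 7.57, m3 = 7.92)
rdf <- c(m1 = 25,   m2 = 23,   m3 = 21)

# Residual sum of squares: RSS = RSE^2 * residual df
rss <- rse^2 * rdf

# Hypothetical helper: F-test comparing a smaller model nested in a bigger one.
nested_f <- function(small, big) {
  extra_terms <- rdf[[small]] - rdf[[big]]
  f <- ((rss[[small]] - rss[[big]]) / extra_terms) / (rss[[big]] / rdf[[big]])
  p <- pf(f, extra_terms, rdf[[big]], lower.tail = FALSE)
  c(Fstat = f, p = p)
}

nested_f("m1", "m2")  # F is roughly 1.5: different intercepts not supported
nested_f("m1", "m3")  # F is roughly 0.7: different intercepts and slopes not supported
```

With the fitted models in hand, `anova(m1, m2, m3)` performs this kind of comparison directly from the unrounded fits.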

Some notes

  • Interaction terms can be added whole-hog using Foster ~ Biological * Social or individually with Foster ~ Biological + Social + Biological:Social.
  • A categorical predictor (like Social) with \(n\) levels gets split by R into \(n - 1\) dummy variables, each of which takes the value 1 if that condition is met and 0 if not.
  • The coefficient on each dummy variable is the estimated difference between the intercept for that level and the intercept for the reference level.
  • The coefficient on an interaction term between a continuous predictor and a dummy variable (such as Biological:Sociallow) is the estimated difference between the slope of that continuous variable for that level and its slope for the reference level.
  • Different reference levels can lead to differing p-values because you're making different comparisons.
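
The notes above can be checked on a small example. The sketch below uses simulated data with the same column names as the twins data; the values are made up, and `toy` and `Social2` are names invented for this illustration.

```r
# Simulated stand-in for the twins data: same column names, made-up values.
set.seed(1)
toy <- data.frame(
  Biological = round(rnorm(27, mean = 100, sd = 15)),
  Social     = factor(rep(c("high", "low", "middle"), each = 9))
)
toy$Foster <- round(10 + 0.9 * toy$Biological + rnorm(27, sd = 7))

# Three levels become two 0/1 dummy columns; "high" (first alphabetically)
# is the reference level and gets no column of its own.
X <- model.matrix(~ Biological + Social, data = toy)
colnames(X)
# "(Intercept)" "Biological" "Sociallow" "Socialmiddle"

# The two ways of writing the interaction model give the same coefficients.
full_a <- lm(Foster ~ Biological * Social, data = toy)
full_b <- lm(Foster ~ Biological + Social + Biological:Social, data = toy)
all.equal(coef(full_a), coef(full_b))
# TRUE

# Changing the reference level changes which comparisons the coefficients make.
toy$Social2 <- relevel(toy$Social, ref = "low")
names(coef(lm(Foster ~ Biological + Social2, data = toy)))
```

After `relevel()`, the dummy coefficients compare "high" and "middle" against "low" instead of comparing "low" and "middle" against "high", so their estimates and p-values answer different questions.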