# Regression Models

##### Executive Summary

Using the 1974 Motor Trend data, we examine whether cars with manual transmissions actually get better mpg, as is commonly believed. Comparing the average mpg for each transmission type, we find that manuals get 7.24 more miles per gallon. However, once we account for other variables that contribute to mpg, the advantage of manual transmissions shrinks to an average of 2.94 mpg. The effect is still statistically significant at the 5% level (p = 0.0467), so we conclude that manual transmissions are associated with better mpg.

##### Exploratory analysis

By making a box plot of MPG by transmission type (Fig 1), we can see that at first glance manuals tend to have better MPG. We can then make a matrix of plots to see how the variables we are interested in interact (Fig 2).
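Both figures can be reproduced with base R graphics. This is a minimal sketch, not necessarily the code behind Fig 1 and Fig 2: the axis labels, the `trans` helper factor, and the choice to include every column in the plot matrix are assumptions.

```r
# mtcars ships with base R; am is coded 0 = automatic, 1 = manual
trans <- factor(mtcars$am, levels = c(0, 1), labels = c("Automatic", "Manual"))

# Fig 1 style: distribution of mpg for each transmission type
bp <- boxplot(mtcars$mpg ~ trans,
              xlab = "Transmission", ylab = "Miles per gallon",
              main = "MPG by transmission type")

# Fig 2 style: scatterplot matrix of all variables (assumed variable set)
pairs(mtcars, main = "mtcars scatterplot matrix")
```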

```
## Source: local data frame [2 x 2]
##
##   am mean(mpg)
## 1  0  17.14737
## 2  1  24.39231
```

By taking the average mpg for automatic (am = 0) and manual (am = 1) transmissions, we can see that manuals average 7.24 more miles per gallon than automatics.
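The group averages can be recomputed in a few lines. The output above appears to come from a dplyr summary; the sketch below uses base R's `tapply` instead, and the names `group_means` and `mpg_gap` are illustrative.

```r
# Mean mpg per transmission type (am: 0 = automatic, 1 = manual)
group_means <- tapply(mtcars$mpg, mtcars$am, mean)

# Difference in average mpg, manual minus automatic
mpg_gap <- unname(group_means["1"] - group_means["0"])

print(group_means)
print(round(mpg_gap, 2))
```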

`## [1] "p-value: 0.00137363833307103"`

We then run a t-test comparing the two groups and reject the null hypothesis of equal means (p = .00137), showing that the difference between transmission types is statistically significant.
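A sketch of the test, assuming it was R's default Welch two-sample t-test (which does not assume equal variances); the variable name `tt` is illustrative.

```r
# Welch two-sample t-test of mpg between automatic (0) and manual (1) cars
tt <- t.test(mpg ~ am, data = mtcars)

print(paste("p-value:", tt$p.value))
```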

##### Fitting Models

```
##              Estimate Std. Error   t value     Pr(>|t|)
## (Intercept) 17.147368   1.124603 15.247492 1.133983e-15
## factor(am)1  7.244939   1.764422  4.106127 2.850207e-04
```

`## [1] "r squared: 0.359798943425465"`

By fitting a model that predicts mpg from transmission type alone, we see that it offers a statistically significant improvement over simply using the average mpg. However, this model explains only 36% of the variance, so we can't confidently draw any conclusions from it.
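The single-predictor fit can be sketched as follows. The formula matches the `factor(am)1` row in the coefficient table; the names `fit_am` and `r2_am` are illustrative.

```r
# Regress mpg on transmission type only; the intercept is the automatic mean
# and the factor(am)1 coefficient is the manual-minus-automatic difference
fit_am <- lm(mpg ~ factor(am), data = mtcars)

print(summary(fit_am)$coefficients)

r2_am <- summary(fit_am)$r.squared
print(paste("r squared:", r2_am))
```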

```
##       mpg        wt       cyl      disp        hp      drat        vs
## 1.0000000 0.8676594 0.8521620 0.8475514 0.7761684 0.6811719 0.6640389
##        am      carb      gear      qsec
## 0.5998324 0.5509251 0.4802848 0.4186840
```

By looking at the absolute values of the correlations with mpg, we can see that wt, cyl, disp, and hp all have strong correlations with mpg. While we can walk through these variables and build a pretty good model (Fig 4), we lose the significance of transmission, which is what we are interested in.
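The ranking of correlations can be reproduced from the full correlation matrix. A minimal sketch, with `mpg_cor` as an illustrative name:

```r
# Absolute correlation of every variable with mpg, strongest first
# (mpg itself appears first with a correlation of 1)
mpg_cor <- sort(abs(cor(mtcars)[, "mpg"]), decreasing = TRUE)

print(mpg_cor)
```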

Luckily, we can run a function that quickly searches through the candidate predictors and picks the best model, which happens to show that transmission is significant:

`## lm(formula = mpg ~ wt + qsec + am, data = mtcars)`

`## [1] "r squared: 0.849663556361707"`

By looking at the full correlation grid (Fig 4, visualized: Fig 2), we can see that many of the variables are strongly correlated with one another, so it makes sense that qsec is able to stand in for several of the variables in our data. This model keeps transmission as a predictor and explains 85% of the variation in mpg, which is a strong fit.
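The report does not name the selection function. One way to arrive at the same formula is stepwise selection by AIC with `step()`, starting from the model with every predictor. This is an assumption about the method, and `full_model` and `best_model` are illustrative names.

```r
# Start from the model containing every predictor
full_model <- lm(mpg ~ ., data = mtcars)

# Stepwise search by AIC; trace = 0 suppresses the step-by-step log
best_model <- step(full_model, direction = "both", trace = 0)

print(formula(best_model))
print(paste("r squared:", summary(best_model)$r.squared))
```

Stepwise selection optimizes AIC and not the significance of any one term, so the fact that am survives the search is a result of the data and not something the procedure guarantees.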

##### Residual Plot and Diagnostics

`sort(hatvalues(slml),decreasing = TRUE)[1:5]`

```
##            Merc 230 Lincoln Continental   Chrysler Imperial
##           0.2970422           0.2642151           0.2296338
##  Cadillac Fleetwood       Maserati Bora
##           0.2270069           0.1909815
```

We check the leverage of our data by looking at the highest hat values and see that a couple of cars have more than twice the mean hat value of .125. However, when we look at Fig 5, they don't appear to be anything to worry about. We can also see that there are no obvious trends in the residuals vs fitted or leverage plots, and the Normal Q-Q plot shows that the residuals are fairly normally distributed.
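The leverage check can be sketched as below, assuming `slml` is the selected model `mpg ~ wt + qsec + am`. The mean hat value equals the number of coefficients divided by the number of observations, here 4 / 32 = .125. The names `hv`, `mean_hat`, and `high_leverage` are illustrative.

```r
# Refit the selected model (assumed to be what slml refers to)
slml <- lm(mpg ~ wt + qsec + am, data = mtcars)

hv <- hatvalues(slml)
mean_hat <- mean(hv)  # number of coefficients / n = 4 / 32

# Cars whose leverage exceeds twice the mean hat value
high_leverage <- hv[hv > 2 * mean_hat]
print(sort(high_leverage, decreasing = TRUE))

# Standard diagnostic panel: residuals vs fitted, Normal Q-Q,
# scale-location, and residuals vs leverage
par(mfrow = c(2, 2))
plot(slml)
```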

##### Conclusions

```
##              Estimate Std. Error   t value     Pr(>|t|)
## (Intercept)  9.617781  6.9595930  1.381946 1.779152e-01
## wt          -3.916504  0.7112016 -5.506882 6.952711e-06
## qsec         1.225886  0.2886696  4.246676 2.161737e-04
## am           2.935837  1.4109045  2.080819 4.671551e-02
```

While some uncertainty remains, since the data do not include identical car models with each transmission type, we can feel fairly confident in our conclusions because our model explains 85% of the variability in mpg. With a p-value of .0467 < .05, we reject the null hypothesis that transmission has no effect and conclude that, holding weight and quarter-mile time constant, manual transmissions average 2.94 more mpg than automatic transmissions.
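The same conclusion can be expressed as an interval. The sketch below computes a 95% confidence interval for the am coefficient; because the p-value is below .05, the interval excludes zero. Refitting under the name `slml` and the name `am_ci` are assumptions for illustration.

```r
# Refit the selected model and extract a 95% confidence interval
# for the manual-vs-automatic coefficient
slml <- lm(mpg ~ wt + qsec + am, data = mtcars)
am_ci <- confint(slml, "am", level = 0.95)

print(am_ci)
```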

# Appendix

Fig 1:

Fig 2: