7 Multiple linear regression
This chapter is a work in progress.
7.1 Learning objectives
- Understand how multiple predictor variables can be used to explain variability in a response variable
- Interpret the model coefficients
- Calculate predicted values from the multiple linear regression model
- Use different types of predictors in the regression model
- Fit and interpret models with a transformed response and/or predictor variable
7.2 Software and packages
library(tidyverse)   # (Wickham et al. 2019)
library(patchwork)   # (Pedersen 2022)
library(skimr)       # (Waring et al. 2022)
library(broom)       # (Robinson, Hayes, and Couch 2023)
7.3 Introduction: College alumni pay
Most university graduates get a job within the first few years of graduating college, so a natural question is "what is the expected pay for recent graduates from a university?" In this chapter, we will explore that question by looking at characteristics of colleges and universities that may be associated with variability in the expected pay for recent graduates. The data were originally collected from the PayScale College Salary Report (2024) and were featured as part of the TidyTuesday "College tuition, diversity, and pay" weekly data visualization challenge.
The data used in this chapter also include the United States regions obtained from the state.region dataset in R.
This analysis will focus on the following variables:
Response

early_career_pay
: Median salary for alumni with 0 - 5 years of work experience (in thousands of US Dollars)

Predictors

region
: Geographic regions in the United States (North Central, Northeast, South, West)

type
: Type of institution (Public, Private)

stem_percent
: Percent of the institution's student body pursuing degrees in STEM fields (Science, Technology, Engineering, Mathematics)

make_world_better_percent
: Percent of current alumni who say they are making the world a better place.
7.3.1 Exploratory data analysis
The distribution and summary statistics for the response variable early_career_pay
are shown in Figure 7.1 and Table 7.1, respectively. The center of the distribution is around the median of $49,700, and the middle 50% of the distribution has a spread of about $9,500 (IQR). The distribution has a slight right skew, with some institutions having median early career pay over $80,000.
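A sketch of how such a summary could be produced with the packages loaded above, assuming the data are in a data frame called alumni (the binwidth is an arbitrary choice):

# Distribution of the response variable
ggplot(alumni, aes(x = early_career_pay)) +
  geom_histogram(binwidth = 2.5) +
  labs(x = "Early career pay (thousands of US dollars)", y = "Count")

# Summary statistics (mean, median, sd, quartiles, etc.)
alumni |>
  skim(early_career_pay)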
Now let’s look at the distributions of the primary predictors for this analysis. The visualizations for each variable distribution and summary statistics for the quantitative predictors are in Figure 7.2 and Table 7.2.
Now, let’s conduct bivariate exploratory data analysis in Figure 7.3 and explore the relationship between the response and each of the predictor variables.
From Figure 7.3, we see that there is generally a negative relationship between an institution’s percentage of alumni who say they are making the world a better place and the median early career pay. In contrast, there is generally a positive relationship between the percentage of an institution’s student body pursuing STEM degrees and the median early career pay.
Graduates from institutions in the Northeast region have higher median early career pay compared to students from the other regions. The overlapping boxplots indicate there is generally not much difference in early career pay across the other regions. The median early career pay for students from private institutions is slightly higher than those from public institutions; however, the overlapping boxplots again indicate that this effect may not be strong.
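Panels like those in Figure 7.3 could be constructed with ggplot2, using patchwork to combine the plots into a single figure; a sketch assuming the alumni data frame:

# Quantitative predictors: scatterplots against the response
p1 <- ggplot(alumni, aes(x = stem_percent, y = early_career_pay)) +
  geom_point(alpha = 0.5)
p2 <- ggplot(alumni, aes(x = make_world_better_percent, y = early_career_pay)) +
  geom_point(alpha = 0.5)

# Categorical predictors: side-by-side boxplots
p3 <- ggplot(alumni, aes(x = region, y = early_career_pay)) +
  geom_boxplot()
p4 <- ggplot(alumni, aes(x = type, y = early_career_pay)) +
  geom_boxplot()

# patchwork arranges the four plots in a 2 x 2 grid
(p1 + p2) / (p3 + p4)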
Now that we have a better understanding of the variables in this analysis, we can fit a multiple linear regression model to understand how region, type, percentage of alumni who say they are making the world a better place, and the percentage of students pursuing STEM degrees explain variability in the expected early career pay.
7.4 Fitting the regression line
Thus far, we have examined the individual relationships between the response variable and each of the predictors. Now we will use multiple linear regression to fit a single model that helps us understand these relationships after adjusting for the other predictors.
7.4.1 Statistical model
Recall from Section 4.3 that the general form of a model for a response variable $Y$ as a function of a predictor $X$ is

$$Y = f(X) + \epsilon$$

In the case of simple linear regression, $f(X) = \beta_0 + \beta_1 X$, so the model is

$$Y = \beta_0 + \beta_1 X + \epsilon$$

where $\epsilon$ is the random error. In multiple linear regression, we have $p$ predictors $X_1, X_2, \ldots, X_p$, and the model becomes

$$Y = f(X_1, X_2, \ldots, X_p) + \epsilon \tag{7.2}$$

Equation 7.2 is the general form of the model. As with simple linear regression, we assume we know the form of the model describing the relationship between the response and predictor variables (shown in Equation 4.1). If there is a linear relationship between $Y$ and the predictors $X_1, \ldots, X_p$, the model is

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + \epsilon \tag{7.3}$$

where $\beta_0, \beta_1, \ldots, \beta_p$ are the model coefficients and the error term is assumed to follow

$$\epsilon \sim N(0, \sigma^2_\epsilon) \tag{7.4}$$

The model for the mean response is therefore $E(Y \mid X_1, \ldots, X_p) = \beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p$.

As with simple linear regression, we will use what we know from Equation 7.3 and Equation 7.4 to estimate and conduct inference about the coefficients for the model.
7.4.2 Is the MLR appropriate for the data?
Before we move forward with fitting the regression line and obtaining estimates for the coefficients, we should first ask whether the multiple linear regression model is appropriate or useful for the data. We can ask the same questions we did for simple linear regression; however, now we are considering the relationship between the response variable and multiple predictors:
- Will a multiple linear regression model be practically useful? Does quantifying and interpreting the relationship between the variables make sense in this scenario?
- Do the observations in the data represent the population of interest, or are there biases that should be addressed when drawing conclusions from the analysis?
- Is the shape of the relationship between the response and the predictors approximately linear? Does a line reasonably describe the relationship?
The last question can be more challenging to answer for multiple linear regression than it was in the simple linear case, because it is difficult to visualize the relationship between three or more variables. As we saw in Section 7.3.1, we can visualize the relationship between the response and each individual predictor. This can give us some indication of linearity and whether there are non-linear shapes that need to be accounted for (more on this in later sections). We will primarily rely on the assessment of the model conditions to ultimately determine whether the model is a useful fit for the data.
7.5 Estimating the coefficients
Recall that when we do linear regression, we are estimating the mean value of the response for given values of the predictors,

$$E(Y \mid X_1, \ldots, X_p) = \beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p \tag{7.5}$$

The error terms $\epsilon_i$ are the deviations of the observed responses from the mean response. As in simple linear regression, we obtain the least-squares estimates $\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_p$ by finding the values that minimize the sum of squared residuals,

$$\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} \left[y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_{i1} + \cdots + \hat{\beta}_p x_{ip})\right]^2 \tag{7.6}$$
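To see what "minimize" means concretely, we can compare the sum of squared residuals at the least-squares estimates to the sum after nudging one coefficient away from its estimate; a small sketch, assuming the alumni_model object fit in Section 7.5.2:

# Sum of squared residuals at the least-squares estimates
ssr_hat <- sum(residuals(alumni_model)^2)

# Nudge one coefficient away from its estimate and recompute the sum
beta <- coef(alumni_model)
beta["stem_percent"] <- beta["stem_percent"] + 0.05
X <- model.matrix(alumni_model)
y <- model.response(model.frame(alumni_model))
ssr_nudged <- sum((y - X %*% beta)^2)

# TRUE: any other coefficient values give a larger sum of squared residuals
ssr_hat < ssr_nudged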
7.5.1 (Optional) Matrix notation for multiple linear regression
As you may have imagined, the differentiation required to find the values that minimize Equation 7.6 can become complicated, particularly if $p$ is large. Instead, we can rewrite the model using matrix notation. Let

$$\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad \mathbf{X} = \begin{bmatrix} 1 & x_{11} & \cdots & x_{1p} \\ 1 & x_{21} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n1} & \cdots & x_{np} \end{bmatrix}, \quad \boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_p \end{bmatrix}, \quad \boldsymbol{\epsilon} = \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{bmatrix}$$

where $\mathbf{y}$ is the $n \times 1$ vector of responses, $\mathbf{X}$ is the $n \times (p + 1)$ design matrix, $\boldsymbol{\beta}$ is the $(p + 1) \times 1$ vector of coefficients, and $\boldsymbol{\epsilon}$ is the $n \times 1$ vector of errors, such that the model in Equation 7.3 can be written as

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}$$

Similarly, we can rewrite Equation 7.5 as

$$E(\mathbf{y} \mid \mathbf{X}) = \mathbf{X}\boldsymbol{\beta}$$

Applying the rules of matrix algebra and matrix calculus, we find that the estimated coefficients that minimize Equation 7.6 are

$$\hat{\boldsymbol{\beta}} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$$

We can use this formulation of the model to estimate the coefficients directly.
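As a check, the closed-form solution can be computed directly from the design matrix of the model fit in the next section; a minimal sketch:

# Design matrix and response from the fitted model (rows with missing data dropped)
X <- model.matrix(alumni_model)
y <- model.response(model.frame(alumni_model))

# Closed-form least-squares estimates: (X'X)^{-1} X'y
beta_hat <- solve(t(X) %*% X) %*% t(X) %*% y

# Same values as the coefficients reported by lm
cbind(beta_hat, coef(alumni_model))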
7.5.2 Estimating coefficients using R
Similar to simple linear regression, we use the lm function to find the coefficient estimates, with + separating the multiple predictors in the model formula.
alumni_model <- lm(early_career_pay ~ stem_percent + type + region +
                     make_world_better_percent,
                   data = alumni)
Similarly, we use the tidy
function to restructure the model output into a data frame and kable
to display the results in a neatly formatted table.
tidy(alumni_model) |>
kable(digits = 3)
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | 50.591 | 1.424 | 35.517 | 0.000 |
stem_percent | 0.297 | 0.014 | 21.251 | 0.000 |
typePublic | -1.832 | 0.427 | -4.293 | 0.000 |
regionNortheast | 5.248 | 0.614 | 8.540 | 0.000 |
regionSouth | -0.661 | 0.523 | -1.263 | 0.207 |
regionWest | 3.885 | 0.680 | 5.710 | 0.000 |
make_world_better_percent | -0.094 | 0.025 | -3.830 | 0.000 |
7.6 Interpreting the coefficients
Now that we have our model, let’s interpret the model coefficients to describe how we expect early career pay to change as these factors change.
7.6.1 Interpreting quantitative predictors
The approach for interpreting quantitative predictors is similar to the approach in Section 4.5. The coefficient of a quantitative predictor is the expected change in the response for each one-unit increase in that predictor, holding all other predictors constant. More specifically, consider the coefficient for stem_percent in Table 7.3.
For each additional percentage point of students graduating with a STEM degree, the early career pay for graduates from an institution is expected to be greater by $297.22, holding all other predictors constant.
Interpret the coefficient for make_world_better_percent in the context of the data.1
Recall from Section 4.5 that "in the context of the data" means
- using the variable names or meaningful descriptions of the variables,
- including the units in the interpretation, and
- indicating the population for which this model applies.
7.6.2 Interpreting categorical predictors
When we want to include categorical predictors in our model, we need a way to map the qualitative information into quantitative information that can be understood by the model and our regression techniques. To do so, we create an indicator variable for each level of the categorical predictor. We then use these indicator variables in the regression model.
Suppose we have a categorical variable with multiple levels, such as region in our data. The variable region has four levels: North Central, Northeast, South, West. We can create an indicator variable for each of these levels. From Table 7.4, we see the number of observations at each level of the original variable region and the number of observations that take the value 1 for each indicator variable.
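The indicator variables can also be created explicitly; a sketch using mutate and if_else (lm builds equivalent indicators automatically when given a categorical predictor):

# One indicator (0/1) variable per level of region
alumni |>
  mutate(north_central = if_else(region == "North Central", 1, 0),
         northeast     = if_else(region == "Northeast", 1, 0),
         south         = if_else(region == "South", 1, 0),
         west          = if_else(region == "West", 1, 0)) |>
  count(region, north_central, northeast, south, west)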
Once we have quantified the information from the categorical predictor, we are ready to use the indicator variables in the model. You may notice in Table 7.3, however, that there are only 3 indicator variables (not 4) for region in the model. When we have a categorical predictor with $k$ levels, we include only $k - 1$ indicator variables in the model; the level without an indicator variable is called the baseline level, and the coefficients of the other indicators are interpreted relative to it.
Let’s take a look at the indicator variables for region
in Table 7.3. The baseline level is "North Central", the region that does not have an indicator in the model. Therefore, the coefficients for the other regions describe how the median early career salary for alumni from those regions compares, on average, to that for alumni from institutions in the North Central region.
The coefficient for the Northeast region is $\hat{\beta}_{Northeast} = 5.248$. This means that the expected early career salary is about $5,248 higher, on average, for alumni from institutions in the Northeast region compared to alumni from institutions in the North Central region, holding all else constant. The coefficient for the South region is $\hat{\beta}_{South} = -0.661$. This means that the expected early career salary is about $661 lower, on average, for alumni from institutions in the South region compared to alumni from institutions in the North Central region, holding all else constant. The coefficient for the West region is $\hat{\beta}_{West} = 3.885$. This means that the expected early career salary is about $3,885 higher, on average, for alumni from institutions in the West region compared to alumni from institutions in the North Central region, holding all else constant.
When we have indicator variables in the model, we are shifting the intercept of the model based on the levels of the categorical variable. The overall intercept of the model is the intercept for the baseline level for the categorical predictor. This means the intercepts for region
are the following:
North Central: 50.591
Northeast: 55.839 (50.591 + 5.248)
South: 49.93 (50.591 - 0.661)
West: 54.476 (50.591 + 3.885)
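These shifted intercepts can be computed directly from the fitted model coefficients; a quick sketch:

b <- coef(alumni_model)
b["(Intercept)"]                         # North Central (baseline)
b["(Intercept)"] + b["regionNortheast"]  # Northeast
b["(Intercept)"] + b["regionSouth"]      # South
b["(Intercept)"] + b["regionWest"]       # West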
The other categorical predictor in the model in Table 7.3 is type.

- What is the baseline level for type?
- Interpret the effect of type in the context of the data.
- What is the intercept for private institutions? For public institutions?
7.6.3 Interpreting the intercept
As with simple linear regression, the intercept is the expected value of the response when all predictors in the model equal 0. Let's take a look at what this means for our model in Table 7.3. The intercept is at the point such that

- stem_percent = 0: Zero percent of the students at the institution are studying STEM fields.
- typePublic = 0: The institution is private.
- regionNortheast = 0 and regionSouth = 0 and regionWest = 0: The institution is in the North Central region.
- make_world_better_percent = 0: Zero percent of the students reported wanting a job that makes the world a better place.
Based on the model, the early career pay for students graduating from such an institution is about $50,591.
Putting this together in narrative form, we have
The expected early career pay for students graduating from a private institution in the North Central region that has no STEM graduates and no students who want a job to make the world a better place is about $50,591.13.
From Section 7.5, we see that we need the intercept to obtain the least-squares line. The intercept, however, does not always have a meaningful interpretation in practice. As with simple linear regression, the intercept has a meaningful interpretation when the following are true:
The scenario described by all predictors in the model being at or near zero simultaneously is plausible in practice.
The data used to fit the model includes observations in which all the predictors are at or near zero simultaneously.
Based on these criteria, do you think the interpretation of the intercept is meaningful for the alumni data? Why or why not?
7.7 Other types of predictors
7.7.1 Centering quantitative variables
As we saw in Section 7.6.3, though we need an intercept to fit the regression model, it often does not have a meaningful interpretation. This is usually because it is not meaningful or realistic for a quantitative predictor to take values at or near zero. One way to remedy this is to transform a quantitative predictor by shifting it, such that it is feasible for the transformed variable to take values near zero. This process is called centering.
Centering a variable means shifting every observation of that variable by a single constant, $C$, so that the centered variable is the original variable minus $C$.
Suppose we create a new variable stem_percent_cent
that is stem_percent
shifted by 20. The distributions of the original and centered variables are shown in Figure 7.4 and Table 7.5.
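A sketch of how this shifted variable could be created, assuming the alumni data frame:

alumni <- alumni |>
  mutate(stem_percent_cent = stem_percent - 20)

# Location statistics shift down by 20; spread statistics are unchanged
alumni |>
  summarise(mean_orig = mean(stem_percent),
            mean_cent = mean(stem_percent_cent),
            sd_orig   = sd(stem_percent),
            sd_cent   = sd(stem_percent_cent))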
From Figure 7.4 and Table 7.5, we see that the only statistics that have changed are those describing location (e.g., the minimum, median, and maximum, which all shift down by 20); statistics describing spread, such as the standard deviation and IQR, are unchanged.
Similarly, in Figure 7.5, both plots have the same slope, illustrating that the slope of the relationship between the predictor and response has not changed by centering the predictor. All we have changed is what it means to be "zero", and therefore which observations have an expected response at the intercept. This is illustrated by the different intercepts for the models shown in the two plots.
A common process of centering is mean-centering, in which all the quantitative predictors in the model are shifted by their respective mean. For the model in Table 7.3, this means transforming stem_percent
and make_world_better_percent
by shifting them by their respective means.
Note that we do not mean-center the categorical predictors type
and region
, because there is not a realistic notion of the “average region”. Additionally, there is already a meaningful interpretation for all predictors associated with region
being equal to zero.
The output for the model with the mean-centered variables is in Table 7.6.

alumni <- alumni |>
  mutate(stem_cent = stem_percent - mean(stem_percent),
         make_world_better_cent = make_world_better_percent -
           mean(make_world_better_percent, na.rm = TRUE))

alumni_model_cent <- lm(early_career_pay ~ stem_cent + type + region +
                          make_world_better_cent,
                        data = alumni)

tidy(alumni_model_cent) |>
  kable(digits = 3)

alumni_cent_intercept <- tidy(alumni_model_cent) |>
  select(estimate) |>
  slice(1) |>
  pull()
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | 50.465 | 0.415 | 121.460 | 0.000 |
stem_cent | 0.297 | 0.014 | 21.251 | 0.000 |
typePublic | -1.832 | 0.427 | -4.293 | 0.000 |
regionNortheast | 5.248 | 0.614 | 8.540 | 0.000 |
regionSouth | -0.661 | 0.523 | -1.263 | 0.207 |
regionWest | 3.885 | 0.680 | 5.710 | 0.000 |
make_world_better_cent | -0.094 | 0.025 | -3.830 | 0.000 |
A comparison of the model coefficients from the original model and the model with the mean-centered values for the quantitative predictors is shown in Table 7.7. In what ways are the two models the same? In what ways are they different?
Term | Original | Mean-Centered |
---|---|---|
(Intercept) | 50.591 | 50.465 |
stem_percent | 0.297 | 0.297 |
typePublic | -1.832 | -1.832 |
regionNortheast | 5.248 | 5.248 |
regionSouth | -0.661 | -0.661 |
regionWest | 3.885 | 3.885 |
make_world_better_percent | -0.094 | -0.094 |
Note that the only term that changed between the two models is the intercept; the coefficients for all other predictors remained the same. This shows that centering a predictor, whether by shifting by the mean or some other constant, does not change its relationship with the response variable. With this in mind, let's interpret the intercept for the model with mean-centered variables.
As before, the intercept is the expected response, early career pay, when all the predictors are equal to 0. Thus, the intercept is the expected early career pay when

- stem_cent = 0: The percent of the students at the institution studying STEM fields is equal to the mean, 16.599%.
- typePublic = 0: The institution is private.
- regionNortheast = 0 and regionSouth = 0 and regionWest = 0: The institution is in the North Central region.
- make_world_better_cent = 0: The percent of students who reported wanting a job that makes the world a better place is equal to the mean, 53.63%.
Before we write the full interpretation, let's see why stem_percent and make_world_better_percent are at their mean values when we interpret the intercept. The centered variable stem_cent equals 0 exactly when stem_percent - mean(stem_percent) = 0, i.e., when stem_percent equals its mean of 16.599. The same holds true for make_world_better_percent. Thus, the interpretation of the intercept in narrative form when the quantitative predictors are mean-centered is
. Thus, the interpretation of the intercept in narrative form when the quantitative predictors are mean-centered is
The expected early career pay for students graduating from a private institution in the North Central region where 16.599% of students are STEM graduates and 53.63% of students want a job to make the world a better place is about $50,465.48.
Because the coefficients for the other predictors are the same, the interpretations for these predictors are the same as in Section 7.6.1 and Section 7.6.2.
7.7.2 Standardizing quantitative variables
Thus far, we have discussed interpretation of the coefficients to understand the relationship between the response and each predictor variable. Sometimes, however, our primary objective of the analysis may be less focused on interpretation and more focused on which predictors have the strongest relationship with the response variable. When that is the case, we can fit a model using standardized values of the quantitative predictors.
We cannot make this comparison using the original variables, because each coefficient is tied to its predictor's units and scale: a one-unit change in one predictor is not comparable to a one-unit change in a predictor measured on a different scale, so the magnitudes of the raw coefficients cannot be compared directly.
Standardizing a variable means shifting every observation of that variable by the mean of the variable (as in mean-centering) and dividing by the standard deviation of that variable. Thus, given a quantitative predictor $x$, the standardized version is

$$x^{std} = \frac{x - \bar{x}}{s_x}$$

where $\bar{x}$ is the mean and $s_x$ is the standard deviation of $x$.
Unlike centering, standardizing a variable changes its units, since it is rescaled by the standard deviation. Therefore, the units are no longer the original units but rather the number of standard deviations an observation is away from the mean value. Because the units have changed, the value (but not the sign!) of the coefficient for standardized variables will change as well. Table 7.8 shows the output of the model using standardized values of stem_percent and make_world_better_percent. Similar to centering, we only standardize quantitative predictors, as there is no meaningful notion of "average" or "standard deviation" for categorical predictors like region or type.
alumni <- alumni |>
  mutate(stem_std = (stem_percent - mean(stem_percent)) / sd(stem_percent),
         make_world_better_std = (make_world_better_percent -
             mean(make_world_better_percent, na.rm = TRUE)) /
           sd(make_world_better_percent, na.rm = TRUE))

alumni_model_std <- lm(early_career_pay ~ stem_std + type + region +
                         make_world_better_std,
                       data = alumni)

tidy(alumni_model_std) |>
  kable(digits = 3)
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | 50.465 | 0.415 | 121.460 | 0.000 |
stem_std | 4.586 | 0.216 | 21.251 | 0.000 |
typePublic | -1.832 | 0.427 | -4.293 | 0.000 |
regionNortheast | 5.248 | 0.614 | 8.540 | 0.000 |
regionSouth | -0.661 | 0.523 | -1.263 | 0.207 |
regionWest | 3.885 | 0.680 | 5.710 | 0.000 |
make_world_better_std | -0.830 | 0.217 | -3.830 | 0.000 |
Now let’s compare the model with standardized quantitative predictors to the original model and the model using mean-centered quantitative predictors. Table 7.9 shows the comparison of these models.
Term | Original | Mean-Centered | Standardized |
---|---|---|---|
(Intercept) | 50.591 | 50.465 | 50.465 |
stem_percent | 0.297 | 0.297 | 4.586 |
typePublic | -1.832 | -1.832 | -1.832 |
regionNortheast | 5.248 | 5.248 | 5.248 |
regionSouth | -0.661 | -0.661 | -0.661 |
regionWest | 3.885 | 3.885 | 3.885 |
make_world_better_percent | -0.094 | -0.094 | -0.830 |
The intercept for the model using mean-centered quantitative predictors is the same as the intercept for the model using standardized quantitative predictors. Why do these models have the same intercept?2
The interpretation for the standardized predictors is in terms of a one standard deviation (one "unit") increase in the predictor. Below is the interpretation for stem_std.

For every one standard deviation increase in the percentage of STEM degrees, the early career pay for alumni is expected to increase by $4,586, holding all else constant.

Though this interpretation is technically correct, it is not meaningful to the reader to merely say a "one standard deviation increase". Therefore, we can use the exploratory data analysis in Table 7.2 to be more specific about how large a one standard deviation increase in the percentage of STEM degrees is and make this interpretation more meaningful.

For every 15.428 percentage point increase in the percentage of students pursuing STEM degrees, the early career pay for alumni is expected to increase by $4,586, holding all else constant.
We often standardize predictors when we want to look at the variables that have the strongest relationship with the response variable. Because all the quantitative predictors are now on the same scale, we can look at the magnitude of the coefficients to rank the predictors in terms of importance. From the coefficients in Table 7.8, we see that regionNortheast has the strongest relationship with the response, early career pay, because it has the coefficient with the highest magnitude, i.e., the highest absolute value. Which variable has the weakest relationship with the response?3
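A brief sketch of this ranking in code, sorting the tidied output of the standardized model by absolute coefficient value:

# Rank predictors by the magnitude of their coefficients
tidy(alumni_model_std) |>
  filter(term != "(Intercept)") |>
  arrange(desc(abs(estimate)))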
7.8 Higher-order polynomial terms
Sometimes the relationship between the response and one or more predictors cannot be adequately described using a model of the form

$$y = \beta_0 + \beta_1 x + \epsilon \tag{7.9}$$
Therefore, we may consider including polynomial terms or non-linear terms in the model to better account for the relationship between the response and predictors. When doing a regression analysis, we can use the exploratory data analysis to gain initial intuition about the relationship between the response and each predictor variable. While the exploratory data analysis is useful for intuition, it is often when examining the residual plots that we observe the need for such terms in the model.
When we fit a model of the form Equation 7.9, we assume that the relationship between the response variable $y$ and the predictor $x$ is linear.
As we see here, we can find the equation of the least-squares line even if such a line is not a good fit for the data. As with all linear regression analysis, we want to use the residuals to assess the LINE model assumptions discussed in [INSERT REF]. For now, let’s look at a plot of residuals versus fitted values to assess the linearity condition.
From Figure 7.7, we see a clear violation of the linearity assumption. Recall the question from [INSERT REF]:
Based on the plot, can you generally determine with great certainty whether the residual would be positive or negative for a given fitted value or range of fitted values?
The answer to this question is yes. We can be fairly certain the residual will be positive for fitted values that are less than 0 or greater than 40. We can also be fairly certain that the residual will be negative for fitted values between 20 and 30 (though this is not always the case). Thus the linearity condition is not satisfied.
Even more than determining that the linearity condition is violated, we can use the residual plot to determine the shape of the relationship between the response and predictor that isn't being captured by the linear model. Recall that the residuals should merely capture random error, so ideally there would be no pattern in the plot of residuals versus fitted values. In Figure 7.7, we see a quadratic ("U-shaped") relationship, indicating there is still some systematic relationship between the response and predictor that must be accounted for. Therefore, we need to add a quadratic term, $x^2$, to the model.
Equation 7.10 shows the form of the linear regression model that includes a quadratic term,

$$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \epsilon \tag{7.10}$$

From Figure 7.8, we see that a model of this form is a much better fit for the relationship between $x$ and $y$.
The residual plot in Figure 7.9 shows that the linearity condition is now satisfied, and the residuals now represent the random error and no remaining systematic relationship between the response and predictor.
Now that we have established how quadratic effects can help with model fit, let's discuss how to interpret the model with the quadratic terms. The fitted model shown in Figure 7.8 has the form $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x + \hat{\beta}_2 x^2$.
Now that we have the quadratic term in the model, we will interpret the full effect of the predictor $x$, which enters the model through both $\hat{\beta}_1 x$ and $\hat{\beta}_2 x^2$, rather than interpreting each coefficient separately.

A one-unit-increase interpretation doesn't quite give the full picture about the change in $\hat{y}$, because the change depends on where the increase occurs. Instead, we interpret the change in $\hat{y}$ over a specified range of $x$:

As $x$ increases from $a$ to $b$, $\hat{y}$ is expected to change by $\hat{\beta}_1(b - a) + \hat{\beta}_2(b^2 - a^2)$, on average.
Based on Figure 7.8, we will interpret the change in $\hat{y}$ over two ranges of $x$.

When $x$ increases from -7 to -5, $\hat{y}$ is expected to decrease by 27.344, on average. When $x$ increases from -5 to 5, $\hat{y}$ is expected to decrease by 4.36, on average.
As $x$ increases from 5 to 10, describe the change in $\hat{y}$.4

If you are unsure about the ranges to use for interpretation, one strategy is to interpret the change in $\hat{y}$ over ranges suggested by the data, such as from the minimum to the median of $x$ and from the median to the maximum.
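The expected change can be computed with a small helper function. In the sketch below, the values $\hat{\beta}_1 = -0.436$ and $\hat{\beta}_2 = 1.103$ are assumed coefficients, chosen to be consistent with the changes reported above:

# Expected change in y-hat as x increases from a to b in a quadratic model
quad_change <- function(a, b, b1, b2) {
  b1 * (b - a) + b2 * (b^2 - a^2)
}

quad_change(-7, -5, b1 = -0.436, b2 = 1.103)  # -27.344
quad_change(-5,  5, b1 = -0.436, b2 = 1.103)  # -4.36
quad_change( 5, 10, b1 = -0.436, b2 = 1.103)  # 80.545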
In practice, we often want to avoid polynomial terms in the model that are higher than second or third degree, because they are difficult to interpret and can result in a model that is overly sensitive to small changes in the data.
If we do add higher-order polynomial terms to the model, then we interpret them using the same approach as with quadratic terms. For example, suppose we fit the model

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x + \hat{\beta}_2 x^2 + \hat{\beta}_3 x^3$$

The interpretation when $x$ increases from $a$ to $b$ is then

As $x$ increases from $a$ to $b$, $\hat{y}$ is expected to change by $\hat{\beta}_1(b - a) + \hat{\beta}_2(b^2 - a^2) + \hat{\beta}_3(b^3 - a^3)$, on average.
7.9 Log-transformations
7.10 Interaction effects
- Use interaction effects when one predictor's effect on the response differs based on the values of another predictor variable. There are three possible types of interaction terms that we will discuss in detail (fitting code for each type is sketched at the end of this section):
Categorical X quantitative
This is the scenario in which the effect of the quantitative predictor differs for different levels of a categorical predictor variable
Show a graph of the line - the non-parallel slopes indicate a potential interaction effect
Interpretation of the actual interaction term
In practice we may just talk about the effect of a particular variable for different levels of the categorical predictor
Categorical X categorical
This is similar to the previous scenario, but here the effect of a categorical predictor differs based on the level of a different categorical predictor.
Because we have two categorical predictors, you can interchange how you interpret them.
- One way to decide how to interpret them is to figure out which variable is the primary one of interest. Then you can interpret the relationship between that variable and the response for varying levels of the other predictor.
You can visualize this on a segmented barplot or a line plot?
Quantitative X quantitative
This is a less common type of interaction but can be used in certain contexts, such as in physics.
Similar to categorical X categorical in that you can interchange the interpretation
- You can determine which variable is the primary one of interest and interpret the interaction effect based on varying values of the other predictor
When you interpret this, it is best to identify particular points - maybe a high, medium, and low value of one predictor and interpret the other predictor’s effect on the response.
Show what this looks like visually - and interpret the effects
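A sketch of how each type of interaction could be fit with lm, assuming the alumni data frame (these particular model specifications are illustrative, not models fit in this chapter):

# Categorical x quantitative: the effect of stem_percent differs by type
int_cat_quant <- lm(early_career_pay ~ stem_percent * type, data = alumni)

# Categorical x categorical: the effect of type differs by region
int_cat_cat <- lm(early_career_pay ~ type * region, data = alumni)

# Quantitative x quantitative: the effect of stem_percent differs with
# make_world_better_percent
int_quant_quant <- lm(early_career_pay ~ stem_percent * make_world_better_percent,
                      data = alumni)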
7.11 Prediction
- Prediction is similar to simple linear regression: you plug in the values of the predictors and get the expected value of the response (see the sketch below).
- Show an example.
- Your turn: find the predicted value for [insert combination of the predictors].
- We must be careful about extrapolation. Given that we are working in many different dimensions, there are many different ways to extrapolate, so be especially mindful not to extrapolate in terms of any single predictor or combination of predictors.
- There is variability in the prediction, which we will discuss in the chapter about inference.
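A sketch of prediction with the fitted model, using a hypothetical institution (the predictor values below are made up for illustration):

# Hypothetical new institution: private, in the South, 20% STEM,
# 55% of alumni say they are making the world a better place
new_inst <- tibble(stem_percent = 20,
                   type = "Private",
                   region = "South",
                   make_world_better_percent = 55)

# Expected early career pay (in thousands of US dollars)
predict(alumni_model, newdata = new_inst)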
# Scratch work for upcoming sections: centered, standardized, interaction,
# quadratic, and log models using the TidyTuesday lemur data
library(tidyverse)
library(tidymodels)

lemurs <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-08-24/lemur_data.csv')

# filter for just three species (maybe even just one species) adults,
# and take one weight measurement for each lemur
set.seed(51325)
lemurs_samp <- lemurs |>
  filter(taxon %in% c("PCOQ", "ERUF", "VRUB"), sex != "ND", age_at_wt_mo > 0) |> # only 1 obs in final sample with sex = ND
  group_by(dlc_id) |>
  sample_n(1) |>
  ungroup()

# main-effects model
m <- lm(weight_g ~ taxon + sex + age_at_wt_mo + litter_size, data = lemurs_samp)

# mean-centered quantitative predictors
m2 <- lm(weight_g ~ taxon + sex + scale(age_at_wt_mo, scale = FALSE) +
           scale(litter_size, scale = FALSE), data = lemurs_samp)

# standardized quantitative predictors
m3 <- lm(weight_g ~ taxon + sex + scale(age_at_wt_mo) + scale(litter_size),
         data = lemurs_samp)

# categorical x quantitative interaction
m4 <- lm(weight_g ~ taxon + sex + age_at_wt_mo + litter_size +
           taxon * age_at_wt_mo, data = lemurs_samp)

# categorical x categorical interaction
m5 <- lm(weight_g ~ taxon + sex + age_at_wt_mo + litter_size +
           taxon * sex, data = lemurs_samp)

# quantitative x quantitative interaction
m6 <- lm(weight_g ~ taxon + sex + age_at_wt_mo + litter_size +
           age_at_wt_mo * litter_size, data = lemurs_samp)

# quadratic term for age
m7 <- lm(weight_g ~ taxon + sex + age_at_wt_mo + litter_size +
           I(age_at_wt_mo^2), data = lemurs_samp)

# log-transformed age
m8 <- lm(weight_g ~ taxon + sex + age_at_wt_mo + litter_size +
           log(age_at_wt_mo), data = lemurs_samp)
tidy(m, conf.int = TRUE)
# A tibble: 6 × 7
term estimate std.error statistic p.value conf.low conf.high
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) 1215. 223. 5.46 1.05e- 7 777. 1654.
2 taxonPCOQ 640. 180. 3.56 4.41e- 4 286. 995.
3 taxonVRUB 1394. 265. 5.26 2.87e- 7 872. 1916.
4 sexM -217. 139. -1.56 1.20e- 1 -491. 56.6
5 age_at_wt_mo 11.5 1.11 10.3 1.81e-21 9.28 13.7
6 litter_size -108. 138. -0.782 4.35e- 1 -381. 164.
tidy(m2, conf.int = TRUE)
# A tibble: 6 × 7
term estimate std.error statistic p.value conf.low conf.high
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) 1727. 161. 10.8 6.96e-23 1411. 2043.
2 taxonPCOQ 640. 180. 3.56 4.41e- 4 286. 995.
3 taxonVRUB 1394. 265. 5.26 2.87e- 7 872. 1916.
4 sexM -217. 139. -1.56 1.20e- 1 -491. 56.6
5 scale(age_at_wt_mo, … 11.5 1.11 10.3 1.81e-21 9.28 13.7
6 scale(litter_size, s… -108. 138. -0.782 4.35e- 1 -381. 164.
tidy(m3, conf.int = TRUE)
# A tibble: 6 × 7
term estimate std.error statistic p.value conf.low conf.high
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) 1727. 161. 10.8 6.96e-23 1411. 2043.
2 taxonPCOQ 640. 180. 3.56 4.41e- 4 286. 995.
3 taxonVRUB 1394. 265. 5.26 2.87e- 7 872. 1916.
4 sexM -217. 139. -1.56 1.20e- 1 -491. 56.6
5 scale(age_at_wt_mo) 899. 87.0 10.3 1.81e-21 727. 1070.
6 scale(litter_size) -97.0 124. -0.782 4.35e- 1 -341. 147.
tidy(m4, conf.int = TRUE)
# A tibble: 8 × 7
term estimate std.error statistic p.value conf.low conf.high
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) 1427. 229. 6.23 1.66e-9 976. 1878.
2 taxonPCOQ 162. 212. 0.767 4.44e-1 -254. 579.
3 taxonVRUB 1401. 289. 4.85 2.05e-6 832. 1969.
4 sexM -277. 134. -2.06 3.99e-2 -542. -12.9
5 age_at_wt_mo 8.29 1.84 4.49 1.02e-5 4.66 11.9
6 litter_size -117. 133. -0.877 3.81e-1 -379. 145.
7 taxonPCOQ:age_at_wt_mo 12.0 2.74 4.36 1.78e-5 6.56 17.3
8 taxonVRUB:age_at_wt_mo -0.242 2.52 -0.0961 9.24e-1 -5.19 4.71
tidy(m5, conf.int = TRUE)
# A tibble: 8 × 7
term estimate std.error statistic p.value conf.low conf.high
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) 1251. 242. 5.18 4.26e- 7 775. 1726.
2 taxonPCOQ 575. 241. 2.38 1.78e- 2 100. 1050.
3 taxonVRUB 1358. 312. 4.35 1.88e- 5 744. 1972.
4 sexM -301. 270. -1.11 2.66e- 1 -832. 230.
5 age_at_wt_mo 11.5 1.12 10.2 3.92e-21 9.26 13.7
6 litter_size -108. 139. -0.779 4.37e- 1 -382. 165.
7 taxonPCOQ:sexM 147. 359. 0.410 6.82e- 1 -560. 854.
8 taxonVRUB:sexM 85.6 350. 0.244 8.07e- 1 -604. 775.
tidy(m6, conf.int = TRUE)
# A tibble: 7 × 7
term estimate std.error statistic p.value conf.low conf.high
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) 1021. 241. 4.25 2.96e- 5 548. 1495.
2 taxonPCOQ 673. 180. 3.74 2.21e- 4 319. 1027.
3 taxonVRUB 1429. 264. 5.41 1.35e- 7 909. 1948.
4 sexM -230. 138. -1.66 9.72e- 2 -503. 42.1
5 age_at_wt_mo 15.8 2.36 6.68 1.24e-10 11.1 20.4
6 litter_size 0.431 147. 0.00292 9.98e- 1 -290. 291.
7 age_at_wt_mo:litter_… -2.60 1.26 -2.07 3.97e- 2 -5.08 -0.123
tidy(m7, conf.int = TRUE)
# A tibble: 7 × 7
term estimate std.error statistic p.value conf.low conf.high
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) 633. 188. 3.36 8.86e- 4 262. 1003.
2 taxonPCOQ 780. 148. 5.28 2.52e- 7 489. 1070.
3 taxonVRUB 1329. 217. 6.14 2.83e- 9 903. 1755.
4 sexM -230. 114. -2.03 4.34e- 2 -454. -6.88
5 age_at_wt_mo 41.1 2.64 15.6 4.93e-40 35.9 46.3
6 litter_size -79.4 113. -0.702 4.83e- 1 -302. 143.
7 I(age_at_wt_mo^2) -0.139 0.0116 -12.0 5.03e-27 -0.162 -0.116
tidy(m8, conf.int = TRUE)
# A tibble: 7 × 7
term estimate std.error statistic p.value conf.low conf.high
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) 390. 134. 2.90 3.97e- 3 126. 654.
2 taxonPCOQ 780. 105. 7.44 1.24e-12 574. 987.
3 taxonVRUB 1103. 155. 7.13 8.21e-12 799. 1408.
4 sexM -94.9 81.0 -1.17 2.43e- 1 -254. 64.6
5 age_at_wt_mo -3.81 0.914 -4.17 4.01e- 5 -5.61 -2.01
6 litter_size -84.1 80.6 -1.04 2.97e- 1 -243. 74.5
7 log(age_at_wt_mo) 616. 26.1 23.6 5.36e-69 565. 668.
ggplot(data = lemurs_samp, aes(x = age_at_wt_mo, y = weight_g, color = taxon)) +
geom_point()
For each additional percentage point of students who want a job that makes the world a better place, the early career pay for graduates from an institution is expected to be lower by $94.33, holding all other predictors constant.↩︎
The intercept is the expected value of the response when all predictors are equal to 0. Given a quantitative predictor $x$, the mean-centered value of $x$ is $x - \bar{x}$ and the standardized value of $x$ is $(x - \bar{x})/s_x$. Both of those are equal to 0 when $x = \bar{x}$, i.e., when $x$ is equal to the mean.↩︎

The variable make_world_better_percent, because it has the coefficient with the smallest magnitude, |-0.094| = 0.094.↩︎

When $x$ increases from 5 to 10, $\hat{y}$ is expected to increase by 80.545, on average.↩︎