Statistics 621
First Quarter, 2001
Class lecture notes
This outline indicates the topics and data sets
that will be covered in class.
- Fitting equations to data
- Assumptions in regression
(9 Sep 2001)
- Prediction and confidence intervals
(11 Sep 2001, small edits)
We will use this JMP-IN script to
explore how outliers affect a regression model. For
additional discussion of logs in regression, I have an extra handout with more
examples.
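As a rough illustration of the outlier effect (separate from the JMP-IN script, and using made-up numbers rather than any class data set), the following Python sketch fits a least squares line before and after adding a single high-leverage point. The function name `fit_line` and all data values are assumptions chosen for the example.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least squares line through the points."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Made-up data lying exactly on the line y = 1 + 2x
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]
clean_fit = fit_line(xs, ys)

# Same data plus one hypothetical outlier, far to the right and far below the line
outlier_fit = fit_line(xs + [10.0], ys + [0.0])

print("slope without outlier:", clean_fit[1])
print("slope with outlier:   ", outlier_fit[1])
```

In this small example the one added point is enough to turn the positive slope negative, because it sits far from the other x values and so has high leverage.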
- Multiple regression
(16 Sep 2001)
- More multiple regression
(17 Sep 2001)
For some extra examples on interpreting multiple regression,
we will use these handouts (I will distribute a copy in class):
The data sets for these examples are
We will then continue with the car data.
- Collinearity in regression
(20 Sep 2001)
This class completes our discussion of collinearity in
multiple regression, focusing on diagnostics and possible
remedies. Alas, a scheduled fire drill will interrupt our
coverage of this material and hold things back. For the
assignment, be sure to read the discussion of the partial F
test in the parcel handling example in the casebook (pages
151-152). The illustration of partial F in
the regression for fuel consumption (car89) is complicated by
the presence of missing data for some predictors.
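For reference, the partial F statistic is usually written as shown here. The symbols are assumed notation and may differ from the casebook: SSE_R and SSE_F are the residual sums of squares of the reduced and full models, q is the number of predictors added in the full model, p is the number of predictors in the full model, and n is the number of cases.

```latex
F = \frac{(\mathrm{SSE}_R - \mathrm{SSE}_F)/q}{\mathrm{SSE}_F/(n - p - 1)}
```

The comparison presumes that both models are fit to the same n cases. When some predictors have missing values, the full model may drop rows that the reduced model keeps, and the two sums of squares are then comparable only if both fits are restricted to the complete cases.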
- Categorical predictors in regression
(revised, 24 Sep 2001)
- Categorical predictors
(26 Sep 2001)
- Categorical terms with many levels
(revised, 2 Oct 2001)
We will also use this FedEx example to review material from
the previous class. It illustrates a two-group example with an
interaction. Time permitting, we will begin our discussion of
the model-building process.
- Building regression models
(3 Oct 2001)
We will conclude our analysis of the use of categorical factors
in multiple regression with a quick look at a topic discussed back
in Stat 603, namely the issue of multiple comparisons.
Most of our time will be spent on the modeling process.
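To see why multiple comparisons matter for a categorical factor with several levels, here is a small Python sketch of the Bonferroni adjustment. Bonferroni is only one simple approach and is an assumption here; it is not necessarily the method used in class or in the casebook. The function name and the choice of five groups are hypothetical.

```python
from math import comb


def bonferroni_alpha(num_groups, family_alpha=0.05):
    """Return (number of pairwise comparisons, per-comparison level)."""
    num_pairs = comb(num_groups, 2)  # every pair of groups is compared once
    return num_pairs, family_alpha / num_pairs


# Hypothetical categorical factor with five levels
num_pairs, alpha_each = bonferroni_alpha(5)
print("pairwise comparisons:", num_pairs)
print("per-comparison level:", alpha_each)
```

The number of pairwise comparisons grows quickly with the number of levels, so each individual comparison has to meet a much stricter standard to keep the overall error rate near the nominal level.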
- Diagnostics for regression models
(9 Oct 2001)
This lecture concludes our discussion of the project data and multiple
regression models. We will discuss various types of residual diagnostics,
such as those that indicate that you have left out a factor from the
model. We will also take a look at some sample executive summaries
from last year's project.
- Analysis of variance
(preliminary, 10 Oct 2001)
This lecture concludes the course with a look at a different approach
to regression, one based on data gathered in highly structured
experiments. All of the predictors are categorical, and interaction
becomes yet more interesting and revealing. The methods are closely
related to conjoint analysis as used in marketing. The two examples from
the casebook illustrate the analysis both with and without an
interaction.
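One way to picture interaction in a two-factor experiment is as a difference of differences among the cell means. The Python sketch below uses made-up cell means for a hypothetical 2 x 2 layout (these are not the casebook examples) and compares the effect of one factor at each level of the other.

```python
# Hypothetical cell means for a 2 x 2 experiment (made-up numbers).
# Keys are (level of first factor, level of second factor).
cell_means = {
    ("low", "A"): 10.0,
    ("low", "B"): 12.0,
    ("high", "A"): 15.0,
    ("high", "B"): 23.0,
}


def interaction_contrast(means):
    """Effect of B versus A at 'high' minus the same effect at 'low'."""
    effect_low = means[("low", "B")] - means[("low", "A")]
    effect_high = means[("high", "B")] - means[("high", "A")]
    return effect_high - effect_low


# With no interaction, the B - A effect is the same at both levels
additive_means = dict(cell_means)
additive_means[("high", "B")] = 17.0

print("with interaction:   ", interaction_contrast(cell_means))
print("without interaction:", interaction_contrast(additive_means))
```

A contrast of zero means the effect of the second factor does not depend on the level of the first, which is the additive (no interaction) case; a nonzero contrast means it does.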
Data sets from the casebooks
You can get the data for the casebook examples either as a single zip file
or as individual JMP files from this link.
If you download the zip file, note that the assignments it contains
are those from the casebook, not the ones that we are using this term.
