Class 9. Comparing group means.
What you need to have learnt from Class 8: categorical variables
in regression.
- Tests: the null hypothesis is always that the differences are
  zero, that is, no difference between the groups. Three types of test:
  (a) Are any slope or intercept differences non-zero? (b) Are any
  slope differences non-zero? (c) Are any intercept differences non-zero?
  - Are any of the slope or intercept differences non-zero? (That is,
    does adding the categorical variable and its interaction buy us
    any explanatory power?) Use the partial F-test. You have to calculate
    this one yourself; see p. 232 of the Bulk Pack.
  - Are any of the slope differences non-zero? That is, do we need
    separate slopes (an interaction term)? Use the partial F-test as
    given on the interaction term in the "Effect Test" table.
  - Are any of the intercept differences non-zero? Given that we don't
    need the interaction, do we need separate intercepts? Use the
    partial F-test as given on the categorical variable term in the
    "Effect Test" table from a model excluding the interaction.
- When you use a partial F-test to compare a BIG model against a LITTLE
  model, the BIG model must include all the variables in the LITTLE
  model for the comparison to be valid. (Technical term: the little
  model is nested in the big model.)
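The partial F-test described above can be sketched numerically. This is a minimal sketch in Python; the residual sums of squares and degrees of freedom below are hypothetical, and in practice they come from your regression output (e.g. JMP's ANOVA tables for the two models).

```python
def partial_f(rss_little, rss_big, df_little, df_big):
    """Partial F statistic for a LITTLE model nested in a BIG model.

    rss_*: residual sums of squares; df_*: residual degrees of freedom.
    The BIG model must contain every variable in the LITTLE model.
    """
    extra_ss = rss_little - rss_big   # improvement from the extra terms
    extra_df = df_little - df_big     # number of extra parameters
    return (extra_ss / extra_df) / (rss_big / df_big)

# Hypothetical numbers: LITTLE model RSS = 120 on 97 df,
# BIG model RSS = 100 on 95 df.
f_stat = partial_f(120.0, 100.0, 97, 95)
print(f_stat)  # about 9.5 for these made-up numbers
```

A large partial F says the extra terms in the BIG model explain enough additional variation to be worth keeping.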
New material for today: ANOVA.
- Objective: compare means (of a Y variable) across different groups.
  Example: Is CEO compensation different between sectors?
- A single continuous Y variable and one categorical X variable.
- Recognize: X (the group variable) is categorical.
- Conceptually different from regression:
  - Regression usually has a model-building and prediction objective.
  - ANOVA has a group-comparison objective; no model building.
- Two basic questions:
  - Are the group means all the same, or are some significantly
    different? Look in the overall ANOVA table to answer this. The
    analysis is done from the "Fit Y by X" button.
  - If some are different (the first test does not tell you which),
    follow up and refocus the question: compare the groups to one
    another. Which ones are significantly different? Various
    comparison procedures:
    - Compare each pair, one at a time. BAD.
    - Compare all pairs at once. GOOD. Tukey's procedure.
    - Compare each group with the best. GOOD. Hsu's procedure.
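The first basic question above boils down to one overall F statistic: the variation between the group means measured against the variation within the groups. A hand-rolled sketch with made-up data for three groups:

```python
def anova_f(groups):
    """Overall F statistic for a one-way ANOVA on a list of groups."""
    n = sum(len(g) for g in groups)                 # total observations
    k = len(groups)                                 # number of groups
    grand_mean = sum(sum(g) for g in groups) / n
    # Between-group sum of squares: how far each group mean
    # sits from the grand mean, weighted by group size.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups)
    # Within-group sum of squares: spread of the observations
    # around their own group mean.
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g)
                    for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Made-up data: the middle group clearly sits higher than the others.
groups = [[10.0, 12.0, 11.0], [14.0, 15.0, 16.0], [10.0, 11.0, 12.0]]
print(anova_f(groups))  # a large F suggests some group means differ
```

In practice JMP computes this F and its p-value for you in the overall ANOVA table; the sketch just shows where the number comes from.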
- Critical issue to understand: why is comparing each pair, one pair
  at a time, BAD? Read pp. 252-254 in the Bulk Pack.
- The procedure that compares each pair, one pair at a time (a
  two-sample t-test), fails to take into account the number of
  comparisons we are making. If we make many comparisons, then by
  chance alone we tend to see something significant. (If we buy many
  lottery tickets we tend to win the lottery, even though any single
  ticket is unlikely to win.) No fishing.
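The lottery-ticket effect can be put in rough numbers. Assuming, purely for illustration, that the pairwise tests were independent (pairwise t-tests on shared data are not truly independent, so this is only an approximation), the chance of at least one false positive grows quickly with the number of groups:

```python
from math import comb

def chance_of_false_positive(k_groups, alpha=0.05):
    """Approximate chance of >= 1 false positive among all
    pairwise two-sample t-tests, each run at level alpha,
    treating the tests as independent (an approximation)."""
    m = comb(k_groups, 2)          # number of pairwise comparisons
    return 1 - (1 - alpha) ** m

for k in (3, 5, 10):
    print(k, comb(k, 2), round(chance_of_false_positive(k), 2))
```

With 10 groups there are 45 pairs, and the chance of declaring at least one spurious difference is roughly 90%, even when no real differences exist.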
- We want a procedure that adjusts for the number of comparisons made
  and also recognizes that the comparisons may be data driven. Tukey's
  and Hsu's procedures do just this: they are multiple comparison
  procedures with honest Type I error rates. (Recall: a Type I error is
  saying there's a difference when really there is not.) Honest means
  that when they declare a 5% error rate, there is a 5% chance of one
  or more errors in the entire set of comparisons, NOT a 5% chance of
  any particular comparison being wrong.
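Tukey's and Hsu's procedures rely on special distributions (such as the studentized range), so as a simpler stand-in, this sketch uses the Bonferroni correction to illustrate what an honest familywise rate means: spread the 5% error budget across all comparisons so the whole family of tests has at most a 5% chance of one or more Type I errors.

```python
def bonferroni_alpha(family_alpha, n_comparisons):
    """Per-comparison threshold keeping the familywise Type I
    error rate at or below family_alpha (Bonferroni correction)."""
    return family_alpha / n_comparisons

# With 10 pairwise comparisons, each test must clear a stricter 0.5% bar.
per_test = bonferroni_alpha(0.05, 10)

# Hypothetical p-values from three of the comparisons:
p_values = [0.003, 0.02, 0.04]
print([p <= per_test for p in p_values])  # only the first survives
```

Note how 0.02 and 0.04 would look "significant" one test at a time but do not survive the honest threshold.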
- Multiple comparison procedures achieve honesty by making it
  harder to declare a difference significant.
- Assumptions: the p-values are credible only if the assumptions
  hold. Check by graphing the residuals:
  - Independent errors.
  - Same variance in each group.
  - Approximately normal.
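In class the assumption checks are graphical; as a rough numeric complement, a common rule of thumb for the equal-variance assumption is that the largest group standard deviation be no more than about twice the smallest. A sketch with made-up data:

```python
from math import sqrt

def group_sd(g):
    """Sample standard deviation of one group."""
    m = sum(g) / len(g)
    return sqrt(sum((x - m) ** 2 for x in g) / (len(g) - 1))

# Made-up residual-like data for three groups.
groups = [[10.0, 12.0, 11.0], [14.0, 15.0, 16.0], [10.0, 11.0, 13.0]]
sds = [group_sd(g) for g in groups]
ratio = max(sds) / min(sds)
print(ratio < 2)  # True here: the group spreads look similar enough
```

If the ratio is much larger than 2, the equal-variance assumption, and hence the ANOVA p-values, become suspect; the residual plots remain the primary check.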
- Dealing with JMP output for multiple comparisons. Two choices,
  which lead to exactly the same conclusions:
  - Use the graphical output (circle clicking).
  - Use the table output (reading numbers).
Richard Waterman
Wed Oct 2 21:48:17 EDT 1996