Stat 601, Fall 2000, Class 9




What you need to have learnt from Class 8

*
Two types of model.
*
Parallel lines model: different intercepts - same slopes.
*
Non-parallel lines: different intercepts and different slopes.
*
Two key facts in understanding the JMP output.
*
JMP always makes comparisons to the "average" of the groups.
*
JMP always leaves one group out - you figure out the missing difference (easy).
*
Non-parallel slopes, an interaction model.
*
Interaction. A three-variable concept (Y, X1, X2). Generic description: the impact of X1 on Y depends on the value of X2.

New material for today: more than two groups.

Example: consider three groups (G1, G2, G3).

*
Parallel lines regression - three lines, one for each group.
*
Key fact: 3 groups, JMP gives 2 comparisons.
*
G1 to average.
*
G2 to average.
*
You work out G3: if G1 is 4 above average and G2 is 3 above average then G3 must be 7 below average.
*
Rule: what number, added to the others, makes them all sum to zero?
*
A negative coefficient on a categorical variable says BELOW par.
*
A positive coefficient on a categorical variable says ABOVE par.
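The sum-to-zero rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not something JMP produces; the function name `missing_group_effect` is made up for this sketch.

```python
def missing_group_effect(reported_effects):
    """Return the effect of the left-out group.

    Assumes the group effects are measured relative to the average,
    so that all of them together must sum to zero.
    """
    return -sum(reported_effects)


# Example from the notes: G1 is 4 above average, G2 is 3 above average.
g3 = missing_group_effect([4, 3])
print(g3)  # -7, i.e. G3 is 7 below average
```

A negative result reads as BELOW par and a positive result as ABOVE par, exactly as for the coefficients JMP does report.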
*
Non-parallel lines - 3 different intercepts and three different slopes.
*
Presenting a categorical variable regression: one equation for each group. Follows p.194 in the bulk pack.
*
         Baseline: RunTime =  179.59            +  0.23 RunSize
         G1      : RunTime = (179.59 +  22.94)  + (0.23 + 0.07) RunSize
         G2      : RunTime = (179.59 +   6.90)  + (0.23 - 0.10) RunSize
         G3      : RunTime = (179.59 -  29.84)  + (0.23 - 0.03) RunSize
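As a rough sketch of how the per-group equations are assembled, the Python below adds each group's intercept and slope adjustments to the baseline line. The numbers are the ones shown above, using the baseline intercept 179.59 for every group; the names `OFFSETS`, `group_line` and `predict_run_time` are hypothetical helpers for this illustration only.

```python
# Baseline ("average") line from the notes.
BASE_INTERCEPT = 179.59
BASE_SLOPE = 0.23

# (intercept adjustment, slope adjustment) for each group, as listed above.
OFFSETS = {
    "G1": (22.94, 0.07),
    "G2": (6.90, -0.10),
    "G3": (-29.84, -0.03),
}


def group_line(group):
    """Return (intercept, slope) of the fitted line for one group."""
    d_intercept, d_slope = OFFSETS[group]
    return BASE_INTERCEPT + d_intercept, BASE_SLOPE + d_slope


def predict_run_time(group, run_size):
    """Predicted RunTime for a given group and RunSize."""
    intercept, slope = group_line(group)
    return intercept + slope * run_size


for g in OFFSETS:
    intercept, slope = group_line(g)
    print(f"{g}: RunTime = {intercept:.2f} + {slope:.2f} RunSize")
```

Collapsing each group to a single intercept and a single slope is usually the easiest form to present in a report, since the reader never has to add the adjustments by hand.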
*
Is a difference significant? Look at the t-stat.
*
Are the differences significant? Look at a partial-F ("effect test" in JMP).
*
The partial-F
\[
F = \frac{\left(R^2_{\mathrm{BIG}} - R^2_{\mathrm{SMALL}}\right) / q}{\left(1 - R^2_{\mathrm{BIG}}\right) / (n - p)}
\]
where q is the number of variables in the subset, n is the number of observations, and p is the number of parameters in the big model (including the intercept).
*
Strategy for when some groups are significantly different and others are not: collapse the non-significant groups together.
*
More than one categorical variable - fine. (e.g. gender and race). What does a parallel lines regression mean here? Take Y as income and explain it in English.
*
Interaction with more than one variable - fine (see p. 186).

Project discussion

*
What should the report look like?
*
Model building strategies
*
Start with marginal analyses
*
Start small, build up
*
Start large, cut away
*
Force in terms you want to talk about
*
Start with a "map" - interesting questions
*
What makes a good model?
*
Do you have a good model?
*
Prediction vs. interpretation
*
Interpreting the regression coefficients
*
The 1/units term
*
Economic considerations - labor, capital and materials
*
Answer questions and provide insight
*
How is the New Plant doing?
*
Who do we learn from for best practices?
*
Is this a labor intensive process?
*
Where are the opportunities to control costs?
*
Should this model even be used for giving quotes?
*
Are there key pieces of information missing?
*
Plant or managers?




2000-12-02