Stat 601, Fall 2000, Class 9

What you need to have learnt from Class 8

Two types of model.
Parallel lines model: different intercepts - same slopes.
Non-parallel lines: different intercepts and different slopes.
Two key facts in understanding the JMP output.
JMP always makes comparisons to the "average" of the groups.
JMP always leaves one group out - you figure out the missing difference (easy).
Non-parallel slopes, an interaction model.
Interaction: a three-variable concept (Y, X1, X2). Generic description: the impact of X1 on Y depends on the value of X2.
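
A minimal sketch of the interaction idea in Python. The coefficients B0 to B3 are made up for illustration and do not come from the class data. With a product term X1*X2 in the model, the change in Y per unit of X1 works out to B1 + B3*X2, so it depends on the value of X2.

```python
# Hypothetical coefficients, chosen only for illustration.
B0, B1, B2, B3 = 10.0, 2.0, 5.0, 0.5

def predict(x1, x2):
    """Fitted Y from a model that includes an X1*X2 interaction term."""
    return B0 + B1 * x1 + B2 * x2 + B3 * x1 * x2

def slope_in_x1(x2):
    """Change in fitted Y for a one-unit increase in X1, at a given X2."""
    return predict(1.0, x2) - predict(0.0, x2)

for x2 in (0.0, 1.0, 4.0):
    print(f"X2 = {x2}: slope of Y on X1 = {slope_in_x1(x2)}")
```

Setting B3 to zero makes the slope the same at every X2, which is the parallel lines case.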

New material for today: more than two groups.

Example: consider three groups (G1, G2, G3).

Parallel lines regression - three lines, one for each group.
Key fact: 3 groups, JMP gives 2 comparisons.
G1 to average.
G2 to average.
You work out G3: if G1 is 4 above average and G2 is 3 above average then G3 must be 7 below average.
Rule: what number, added to the others, makes them all sum to zero?
A negative coefficient on a categorical variable says BELOW par.
A positive coefficient on a categorical variable says ABOVE par.
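
The sum-to-zero rule can be sketched in a few lines of Python. This uses the G1 and G2 numbers from the example above. The function names are illustrative and have nothing to do with JMP itself.

```python
def missing_effect(reported):
    """Effect for the left-out group: the value that makes all effects sum to zero."""
    return -sum(reported)

def describe(effect):
    """Read the sign of a categorical coefficient relative to the average."""
    if effect > 0:
        return "above average"
    if effect < 0:
        return "below average"
    return "at average"

# The two comparisons reported in the example: G1 is 4 above, G2 is 3 above.
reported = {"G1": 4, "G2": 3}
effects = dict(reported, G3=missing_effect(reported.values()))

for group, effect in effects.items():
    print(group, effect, describe(effect))
```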
Non-parallel lines - three different intercepts and three different slopes.
Presenting a categorical variable regression: write out an equation for each group. Follows p.194 in the bulk pack.
         Baseline: RunTime =  179.59            +  0.23 RunSize
         G1      : RunTime = (179.59 +  22.94)  + (0.23 + 0.07) RunSize
         G2      : RunTime = (179.59 +   6.90)  + (0.23 - 0.10) RunSize
         G3      : RunTime = (179.59 -  29.84)  + (0.23 - 0.03) RunSize
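
As a rough sketch, the per-group equations can be collapsed to a single intercept and slope for each group by adding the group's adjustments to the baseline. The numbers are copied from the equations above. The function names and the run size in the usage note are illustrative assumptions.

```python
BASE_INTERCEPT = 179.59  # baseline intercept from the equations above
BASE_SLOPE = 0.23        # baseline slope on RunSize

# (intercept adjustment, slope adjustment) for each group, as listed above.
OFFSETS = {
    "G1": (22.94, 0.07),
    "G2": (6.90, -0.10),
    "G3": (-29.84, -0.03),
}

def group_line(group):
    """Return (intercept, slope) of the fitted line for one group."""
    d_intercept, d_slope = OFFSETS[group]
    return BASE_INTERCEPT + d_intercept, BASE_SLOPE + d_slope

def predict_run_time(group, run_size):
    """Fitted RunTime for a given group and run size."""
    intercept, slope = group_line(group)
    return intercept + slope * run_size

for g in OFFSETS:
    intercept, slope = group_line(g)
    print(f"{g}: RunTime = {intercept:.2f} + {slope:.2f} RunSize")
```

For example, `predict_run_time("G2", 100)` gives the fitted run time for a hypothetical run of 100 units in group G2.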
Is a difference significant? Look at the t-stat.
Are the differences significant? Look at a partial-F ("effect test" in JMP).
The partial-F
\[
F = \frac{\left( R^2_{\mathrm{BIG}} - R^2_{\mathrm{SMALL}} \right) / (\text{number of variables in the subset})}
         {\left( 1 - R^2_{\mathrm{BIG}} \right) / (\text{number of observations} - \text{number of parameters in the big model, including the intercept})}
\]
Strategy for when some groups are significantly different and others are not: collapse the non-significant groups together.
More than one categorical variable - fine (e.g. gender and race). What does a parallel lines regression mean here? Take Y as income and explain it in English.
Interaction with more than one variable - fine (see p. 186).

Project discussion

What should the report look like?
Model building strategies
Start with marginal analyses
Start small, build up
Start large, cut away
Force in terms you want to talk about
Start with a "map" - interesting questions
What makes a good model?
Do you have a good model?
Prediction vs. interpretation
Interpreting the regression coefficients
The 1/units term
Economic considerations - labor, capital and materials
Answer questions and provide insight
How is the New Plant doing?
Who do we learn from for best practices?
Is this a labor intensive process?
Where are the opportunities to control costs?
Should this model even be used for giving quotes?
Are there key pieces of information missing?
Plant or managers?
