Class 12. Time series models.

What you need to have learnt from Class 11: Logistic regression.

*
Logistic regression: modeling transformed probabilities. Which transform? The logit.
*
Can you parse the output? Bulk Pack p.320.
*
1. The overall test in logistic regression: is anything going on? Is any combination of the predictors useful in predicting Y (the logit of the probability)? Here the small p-value indicates that this is the case.
*
2. Is a specific coefficient significant (useful) after controlling for the other variables in the model? The small p-value says it is.
*
3. What does the 2.82 tell you?
*
For every 1 unit change in price diff, the logit of the probability of buying CH changes by 2.82 (controlling for Loyal CH and store 7).
*
BETTER: for every one unit (i.e., a dollar) change in price diff, the odds of buying CH change by a multiplicative factor of exp(2.82) = 16.8.

*
Key calculation: at Loyal CH of 0.8, a price diff of 20 cents, and the product sold in store 7, what is the predicted probability of buying CH?
 1. Find the logit. logit = -3.06 + 6.32 * 0.8 + 2.82 * 0.2 + 0.35
                          = 2.91.

 2. Probability = exp(logit)/(1 + exp(logit)) = exp(2.91)/(1 + exp(2.91))
                = 0.948.
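*
To check this by machine, here is a minimal Python sketch of the same calculation. The coefficients are the ones read off the output quoted above; the variable names are mine, chosen for illustration.

    from math import exp

    # Coefficients read off the fitted logistic regression (Bulk Pack p.320)
    intercept  = -3.06
    b_loyal    =  6.32    # Loyal CH
    b_pricedif =  2.82    # price diff, in dollars
    b_store7   =  0.35    # store 7 indicator

    # Scenario: Loyal CH = 0.8, price diff = 20 cents, sold in store 7
    logit = intercept + b_loyal * 0.8 + b_pricedif * 0.2 + b_store7 * 1
    prob  = exp(logit) / (1 + exp(logit))    # inverse logit

    print(round(logit, 2))                   # 2.91
    print(round(prob, 3))                    # 0.948
    print(round(exp(b_pricedif), 1))         # 16.8, the odds multiplier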


New material for today: Regression for time series.

*
Objective: model a time series.
*
Example: Model default rates on mortgages as a function of interest rates.
*
Problem: time series often have autocorrelated error terms, which violates the standard regression assumption of independent errors.
*
Definition: Autocorrelation - successive error terms are dependent (see p.46 of the Bulk Pack).
*
Diagnostics.
*
Key graphic: residuals plotted against time. Look for tracking (runs of successive residuals on the same side of zero) in the residual plot.
*
Look at the Durbin-Watson statistic. With no autocorrelation it is near 2; a value under 1.5 or over 2.5 suggests a problem.
*
The correlation of successive residuals is roughly 1 - DW/2.
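*
A minimal Python sketch of the diagnostic, assuming numpy and statsmodels are available. The data are simulated AR(1) errors, not real mortgage data, and the 0.7 autocorrelation is an arbitrary choice for illustration.

    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.stattools import durbin_watson

    rng = np.random.default_rng(0)
    n = 100
    x = rng.normal(size=n)

    # AR(1) errors: e[t] = 0.7 * e[t-1] + noise (positive autocorrelation)
    e = np.zeros(n)
    for t in range(1, n):
        e[t] = 0.7 * e[t - 1] + rng.normal()

    y = 1 + 2 * x + e
    fit = sm.OLS(y, sm.add_constant(x)).fit()

    dw = durbin_watson(fit.resid)
    print(round(dw, 2))           # well below 1.5 for these errors
    print(round(1 - dw / 2, 2))   # implied correlation of successive residuals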

*
Consequences of positive autocorrelation:
*
Over-optimistic about the information content in the data.
*
Standard errors for slopes too small, confidence intervals too narrow.
*
Think variables are significant when really they are not.
*
False sense of precision.

*
Fix-ups.
*
Use differences of both Y and X, not raw data (pp.349-350); see the sketch after this list.
*
Include lagged residuals in the model (pp. 334-336).
*
Include lagged Y in the model (as an X-variable, p.358).
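*
Here is a sketch of the first fix-up on simulated data, again assuming numpy and statsmodels. The data-generating choices are illustrative, not from the Bulk Pack.

    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.stattools import durbin_watson

    rng = np.random.default_rng(0)
    n = 100
    x = np.cumsum(rng.normal(size=n))    # a trending (random-walk) predictor

    e = np.zeros(n)                      # AR(1) errors, as before
    for t in range(1, n):
        e[t] = 0.7 * e[t - 1] + rng.normal()
    y = 1 + 2 * x + e

    # Fit in levels versus in first differences
    fit_levels = sm.OLS(y, sm.add_constant(x)).fit()
    fit_diffs  = sm.OLS(np.diff(y), sm.add_constant(np.diff(x))).fit()

    print(round(durbin_watson(fit_levels.resid), 2))   # far below 2
    print(round(durbin_watson(fit_diffs.resid), 2))    # much closer to 2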

*
Benefits of differencing.
*
Often reduces autocorrelation.
*
Can reduce collinearity between X-variables (illustrated below).
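*
A small simulated illustration of the collinearity point: two predictors that share a time trend are nearly collinear in levels but not in differences. The trend slopes and noise levels are arbitrary choices for illustration.

    import numpy as np

    rng = np.random.default_rng(1)
    t = np.arange(100.0)

    # Two predictors sharing a time trend: nearly collinear in levels
    x1 = 0.5 * t + rng.normal(0, 2, size=100)
    x2 = 0.3 * t + rng.normal(0, 2, size=100)

    print(round(np.corrcoef(x1, x2)[0, 1], 2))                    # near 1
    print(round(np.corrcoef(np.diff(x1), np.diff(x2))[0, 1], 2))  # near 0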



Richard Waterman
Fri Oct 18 11:27:42 EDT 1996