Class 12. Time series models.
What you need to have learnt from Class 11: Logistic regression.
- Logistic regression: modeling transformed probabilities. Which
transform? The logit.
- Can you parse the output? Bulk Pack p.320.
- 1. The overall test in logistic regression: is anything going
on, i.e. are any of the predictors (any combination of them) useful
in predicting Y (the logits of the probabilities)? In this case the
small p-value indicates that this is the case.
- 2. Is a specific coefficient significant (useful) after having
controlled for the other variables in the model? The small p-value
says this is indeed the case.
- 3. What does the 2.82 tell you?
- For every one unit change in price diff, the logit of the probability
of buying CH changes by 2.82 (controlling for Loyal CH and store 7).
- BETTER: for every one unit (i.e. a dollar) change in price diff,
the odds of buying CH change by a multiplicative factor of
exp(2.82) = 16.8.
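The odds-ratio reading of the coefficient can be checked in a couple of lines (the value 2.82 is the slope quoted above; everything else is just arithmetic):

```python
import math

# Slope on price diff from the fitted model quoted above.
b_price_diff = 2.82

# A one-unit (one dollar) increase in price diff multiplies the odds
# of buying CH by exp(b); it does not add to them.
odds_factor = math.exp(b_price_diff)
print(round(odds_factor, 1))  # roughly 16.8
```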
- Key calculation. At Loyal CH of 0.8, a price diff of 20 cents,
and the product sold in store 7, what is the predicted probability
of buying CH?
1. Find the logit: logit = -3.06 + 6.32 * 0.8 + 2.82 * 0.2 + 0.35
= 2.91.
2. Probability = exp(logit)/(1 + exp(logit)) = exp(2.91)/(1 + exp(2.91))
= 0.948.
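The two-step calculation above translates directly into code; the coefficients are the ones quoted from the fitted model:

```python
import math

# Coefficients from the fitted model quoted above.
intercept, b_loyal, b_price, b_store7 = -3.06, 6.32, 2.82, 0.35

# Step 1: the logit at Loyal CH = 0.8, price diff = $0.20, store 7 = 1.
logit = intercept + b_loyal * 0.8 + b_price * 0.2 + b_store7

# Step 2: invert the logit to recover a probability.
prob = math.exp(logit) / (1 + math.exp(logit))
print(round(logit, 2), round(prob, 3))  # 2.91 0.948
```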
New material for today: Regression for time series.
- Objective: model a time series.
- Example: Model default rates on mortgages as a function of
interest rates.
- Problem: time series often have autocorrelated error
terms, which violates the standard assumption of independent errors.
- Definition: Autocorrelation - successive error terms are
dependent (see p.46 of the Bulk Pack).
- Diagnostics.
- Key graphic - residuals plotted against time. Tracking in the
residual plots.
- Look at the Durbin-Watson statistic. Less than 1.5 or over 2.5
suggests a problem.
- Correlation of the residuals is roughly 1 - DW/2.
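Both diagnostics can be computed by hand. Below, a made-up residual series with obvious tracking (each residual stays close to the previous one) gives a Durbin-Watson statistic well under 1.5, and the rough identity r ≈ 1 - DW/2 recovers the strong positive correlation:

```python
# Hypothetical residuals exhibiting tracking (successive values close).
resid = [1.0, 0.8, 0.9, 0.7, 0.5, -0.2, -0.4, -0.5, -0.3, -0.6]

# Durbin-Watson: sum of squared successive differences over sum of squares.
dw = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid))) / \
     sum(e ** 2 for e in resid)

# Rough correlation of successive residuals, per the note above.
approx_r = 1 - dw / 2
print(round(dw, 2), round(approx_r, 2))
```

A DW near 2 would mean little autocorrelation; here it is far below 1.5, flagging a problem.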
- Consequences of positive autocorrelation:
- Over-optimistic about the information content in the data.
- Standard errors for slopes too small, confidence intervals too
narrow.
- Think variables are significant when really they are not.
- False sense of precision.
- Fix ups.
- Use differences of both Y and X, not raw data (pp.349-350).
- Include lagged residuals in the model (pp. 334-336).
- Include lag Y in the model (as an X-variable p.358).
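The "include lag Y" fix-up can be sketched with a one-predictor least-squares fit of y[t] on y[t-1] (the series below is invented for illustration; a real model would also keep the original X-variables):

```python
# Simple least squares: slope and intercept of y on x, pure Python.
def ols_slope_intercept(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    return b, my - b * mx

# Hypothetical trending series.
y = [2.0, 2.5, 2.4, 3.0, 3.2, 3.1, 3.8, 4.0, 4.1, 4.6]

# Pair y[t-1] (an X-variable) with y[t] (the response).
y_lag, y_now = y[:-1], y[1:]
slope, intercept = ols_slope_intercept(y_lag, y_now)
```

The positive slope on lag Y soaks up the dependence that would otherwise land in the error terms.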
- Benefits of differencing.
- Often reduces autocorrelation.
- Can reduce collinearity between X-variables.
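The first benefit can be seen on a small invented example: a trending level series has a large positive lag-1 correlation, while its first differences do not track nearly as much:

```python
# Lag-1 autocorrelation of a series, pure Python.
def lag1_corr(x):
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

# Hypothetical trending level series and its first differences.
y = [1, 2, 2, 3, 4, 4, 5, 6, 7, 7, 8, 9]
dy = [y[t] - y[t - 1] for t in range(1, len(y))]

print(round(lag1_corr(y), 2), round(lag1_corr(dy), 2))
```

The level series shows strong positive lag-1 correlation; the differenced series is much closer to uncorrelated.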
Richard Waterman
Fri Oct 18 11:27:42 EDT 1996