Class 12. Time series models.

What you need to have learnt from Class 11: Logistic regression.

*
Logistic regression: modeling transformed probabilities. Which transform? The logit.
*
Can you parse the output? Bulk Pack p.320.
*
1. The overall test in logistic regression: is anything going on? Is any combination of the predictors useful in predicting Y (the logit of the probability)? Here the small p-value indicates that this is the case.
*
2. Is a specific coefficient significant (useful) after controlling for the other variables in the model? The small p-value says it is.
*
3. What does the 2.82 tell you?
*
For every 1 unit change in price diff, the logit of the probability of buying CH changes by 2.82 (controlling for Loyal CH and store 7).
*
BETTER: for every one unit (i.e., a dollar) change in price diff, the odds of buying CH change by a multiplicative factor of exp(2.82) = 16.8.

*
Key calculation: at Loyal CH of 0.8, a price diff of 20 cents, and the product sold in store 7, what is the predicted probability of buying CH?
 1. Find the logit. logit = -3.06 + 6.32 * 0.8 + 2.82 * 0.2 + 0.35
                          = 2.91.

 2. Probability = exp(logit)/(1 + exp(logit)) = exp(2.91)/(1 + exp(2.91))
                = 0.948.
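*
To check this by machine, here is a minimal Python sketch of the same calculation. The coefficients are the ones read off the output quoted above; the variable names are mine, chosen for illustration.

    from math import exp

    # Coefficients read off the fitted logistic regression (Bulk Pack p.320)
    intercept  = -3.06
    b_loyal    =  6.32    # Loyal CH
    b_pricedif =  2.82    # price diff, in dollars
    b_store7   =  0.35    # store 7 indicator

    # Scenario: Loyal CH = 0.8, price diff = 20 cents, sold in store 7
    logit = intercept + b_loyal * 0.8 + b_pricedif * 0.2 + b_store7 * 1
    prob  = exp(logit) / (1 + exp(logit))    # inverse logit

    print(round(logit, 2))                   # 2.91
    print(round(prob, 3))                    # 0.948
    print(round(exp(b_pricedif), 1))         # 16.8, the odds multiplier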


New material for today: Regression for time series.

*
Objective: model a time series.
*
Example: Model default rates on mortgages as a function of interest rates.
*
Problem: time series often have autocorrelated error terms, which violates the standard regression assumption of independent errors.
*
Definition: Autocorrelation - successive error terms are dependent (see p.46 of the Bulk Pack).
*
Diagnostics.
*
Key graphic: residuals plotted against time. Look for tracking (runs of successive residuals on the same side of zero) in the residual plot.
*
Look at the Durbin-Watson statistic. With no autocorrelation it is near 2; a value under 1.5 or over 2.5 suggests a problem.
*
The correlation of successive residuals is roughly 1 - DW/2.
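*
A minimal Python sketch of the diagnostic, assuming numpy and statsmodels are available. The data are simulated AR(1) errors, not real mortgage data, and the 0.7 autocorrelation is an arbitrary choice for illustration.

    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.stattools import durbin_watson

    rng = np.random.default_rng(0)
    n = 100
    x = rng.normal(size=n)

    # AR(1) errors: e[t] = 0.7 * e[t-1] + noise (positive autocorrelation)
    e = np.zeros(n)
    for t in range(1, n):
        e[t] = 0.7 * e[t - 1] + rng.normal()

    y = 1 + 2 * x + e
    fit = sm.OLS(y, sm.add_constant(x)).fit()

    dw = durbin_watson(fit.resid)
    print(round(dw, 2))           # well below 1.5 for these errors
    print(round(1 - dw / 2, 2))   # implied correlation of successive residuals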

*
Consequences of positive autocorrelation:
*
Over-optimistic about the information content in the data.
*
Standard errors for slopes too small, confidence intervals too narrow.
*
Think variables are significant when really they are not.
*
False sense of precision.

*
Fix-ups.
*
Use differences of both Y and X, not raw data (pp.349-350); see the sketch after this list.
*
Include lagged residuals in the model (pp. 334-336).
*
Include lagged Y in the model (as an X-variable, p.358).
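*
Here is a sketch of the first fix-up on simulated data, again assuming numpy and statsmodels. The data-generating choices are illustrative, not from the Bulk Pack.

    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.stattools import durbin_watson

    rng = np.random.default_rng(0)
    n = 100
    x = np.cumsum(rng.normal(size=n))    # a trending (random-walk) predictor

    e = np.zeros(n)                      # AR(1) errors, as before
    for t in range(1, n):
        e[t] = 0.7 * e[t - 1] + rng.normal()
    y = 1 + 2 * x + e

    # Fit in levels versus in first differences
    fit_levels = sm.OLS(y, sm.add_constant(x)).fit()
    fit_diffs  = sm.OLS(np.diff(y), sm.add_constant(np.diff(x))).fit()

    print(round(durbin_watson(fit_levels.resid), 2))   # far below 2
    print(round(durbin_watson(fit_diffs.resid), 2))    # much closer to 2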

*
Benefits of differencing.
*
Often reduces autocorrelation.
*
Can reduce collinearity between X-variables (illustrated below).
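*
A small simulated illustration of the collinearity point: two predictors that share a time trend are nearly collinear in levels but not in differences. The trend slopes and noise levels are arbitrary choices for illustration.

    import numpy as np

    rng = np.random.default_rng(1)
    t = np.arange(100.0)

    # Two predictors sharing a time trend: nearly collinear in levels
    x1 = 0.5 * t + rng.normal(0, 2, size=100)
    x2 = 0.3 * t + rng.normal(0, 2, size=100)

    print(round(np.corrcoef(x1, x2)[0, 1], 2))                    # near 1
    print(round(np.corrcoef(np.diff(x1), np.diff(x2))[0, 1], 2))  # near 0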



Richard Waterman
Fri Oct 18 11:27:42 EDT 1996