Introduction to Time Series Analysis

Part 2 of Insurance 260

Announcements
• The planned syllabus includes course requirements, expectations, and details about the Bowerman textbook.

• The TA for the course, Najah, is available on Wednesday afternoons from 3-4:30 in JMHH F96 (which also houses Stat-Lab). You can also get help from others at Stat-Lab during drop-in hours, though they may not be familiar with the specifics of this course. You can check the hours that Stat-Lab operates by following this link to Stat-Lab.

• If you're rusty on the diagnostics of a regression model, skim through Chapter 5 of Bowerman. We're not covering all of these details (you have seen them before), but the book provides a very good review. In particular, take a look at
• Section 5.1: Review of collinearity (VIF, adjusted R²)
• Sections 5.2-5.3: Checking for autocorrelation (albeit without Durbin-Watson, which comes in Chapter 6), equal variance, and normality (normal quantile/probability plots)
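If you prefer to check these diagnostics outside of JMP, here is a minimal sketch (in Python, not part of the course software) of the Durbin-Watson statistic computed directly from regression residuals. The residual series below are made up for illustration.

```python
# Durbin-Watson statistic from a list of regression residuals.
# Values near 2 suggest no first-order autocorrelation; values well
# below 2 suggest positive autocorrelation, well above 2 negative.

def durbin_watson(residuals):
    """DW = sum of squared successive differences / sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Residuals that alternate in sign (negative autocorrelation) push DW
# above 2; a smooth trend in the residuals pushes DW toward 0.
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
trending = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5]
print(durbin_watson(alternating))  # well above 2
print(durbin_watson(trending))     # close to 0
```

The same quantity is what JMP reports when you request the Durbin-Watson test after Fit Model.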

• Assignment summary statistics
1. Mean 8.6, with SD = 1.1
2. Mean 9.4, with SD = 0.8
3. Mean 9.2, with SD = 1.8
4. Mean 9.1, with SD = 2.4
The mean on the final (as indicated in class) was about 70 (67.5) with SD 14. The high score was 88 and the low was below 50. You can see the solutions for the exam here.

• JMP Tips
• In order to obtain levels of confidence other than 0.95 when saving prediction intervals and confidence intervals (from Fit Model), press the *shift key* when you select the command to save the interval. A dialog will appear allowing you to change the alpha level from 0.05 to a different value. From the window that shows the results from Fit Model:
*shift* > red triangle > Save Columns > Mean Confidence Interval
or
*shift* > red triangle > Save Columns > Individual Confidence Interval
Press *shift* before you click on the triangle pop-up menu.

Lecture notes
Lecture notes that are posted here will outline the material for each class, but are not comprehensive in their coverage of the materials. For that, you will need to read the textbook, attend lectures, and complete homework assignments. (The versions of the slides that do not have backgrounds are much better for printing.)
1. Simple regression model ( without background )

2. Multiple regression model ( without background )

3. Multiple regression with categorical explanatory variables ( without background )

4. Regression models for time series ( without background )

5. Regression models for time series, second example ( without background )

6. Exponential smoothing ( without background )

7. Autoregressive, moving average models ( without background )

8. Identifying and estimating ARMA models ( without background )

9. Forecasting ARIMA models ( without background )

10. Review of topics ( without background ) [with typos fixed]

Assignments
In general, you ought to read over all of the exercises at the end of chapters. I will pick out a few that seem most relevant, but that does not mean that you should ignore the others.

You have a week to submit what you've done. I don't expect everyone to complete every exercise, but you should show evidence that you've tried them. You need not use JMP for the computing, but you will need access to some sort of statistical software because some questions call for a bit of computing.

Solutions will be posted on the day that assignments are returned, and then removed from the web page. As discussed in class, these exercises contribute 40% of your grade in the course.

1. Due ... after Spring Break
Remind/teach yourself how to use JMP! Most of you will have seen this in Stat 101/102.

2. Due ... Thursday, March 26 (at class) From the Bowerman textbook (Data is in Table 3-8 and Table 3-12):
3.9 (reproduce the scatterplot and fitted line using JMP)
3.10, 3.11,
3.18 (show JMP output)
3.22 a-d (let JMP do messy calculations)
3.30 (confirm results),
3.35, 3.36
Solutions

3. Due ... Thursday, April 2 (at class) From the Bowerman textbook (Data is in Table 4-11, Table 4-16, and Table 4-18):
4.2 (include the scatterplot matrix showing 'y' in the top row)
4.4 (parts a,b only; show the JMP summary of your multiple regression)
4.6 (parts c,g), 4.8 (part b only), 4.10
4.19
4.20 (parts a,b: Fit model with JMP and interpret coefficient of dummy variable)
4.20 (c,d)
4.22 (Test the null hypothesis that claims that the coefficient of both dummy variables is zero. Hint: don't use a t-statistic. You do not need to prepare answers for a-c in the text.)
Solutions

4. Due ... Thursday, April 9 (at class) Do all parts of the exercises unless indicated otherwise. From the Bowerman textbook (Data is from Table 3.2 QHIC , Table 5.5 Hospital , Table 6.6 Lumber , and Table 6.9 Energy ):
[5.13] Fit the indicated model using JMP; show a summary of your fit. Then for part b, find the 95% prediction interval for a $250,000 home (not $220,000 as in the text).
[5.16] (parts a,b only) Show the fitted JMP regression summary, and answer questions posed in parts a,b of the text.
[6.1] For b, show the calculation of the prediction interval. Do you think that this interval "should" be the same for all forecast periods? Explain why or why not.
For c, construct a scatterplot that shows the presence or absence of autocorrelation. You do not need to find the DW statistic.
[6.4] For part c.2, report the appropriate test statistics. For part c.4, use JMP to obtain the prediction intervals and compare these to the "naive" intervals formed as prediction ± 2 RMSE. For part d, only do 1-3 and use the shown output.
 This exercise does not come from the book. These data give the number of international airline passengers (in thousands), monthly from around the end of WWII through a period of rapid expansion (1949-1960). Notice that the last 12 rows (1960) are excluded and hidden.
1. Plot the data over time. What type of model appears appropriate?
2. Fit a regression model to be used to predict airline passengers in 1960. You may, and probably should, try several models; only report the one you decide to use. (No fair peeking at the held out data to pick the model.) Show a summary of your model and indicate whether (accepting the multiple regression model)
1. the overall model is statistically significant, and
2. components of the model are statistically significant.
3. Show that your model reasonably satisfies the conditions of the multiple regression model by checking for
1. Autocorrelation (include the Durbin-Watson statistic),
2. Equal error variance, and
3. Normality.
Show the appropriate plot with each!
4. Predict monthly passenger traffic in 1960. Compare the 12 95% prediction intervals to the actual data. Summarize how well the predictions of your model perform. Your answer should include a table with 5 columns: date, actual value, prediction, lower limit, upper limit.
5. What does your model predict for the *total* passenger traffic for all of 1960? Give a prediction along with an *approximate* 95% interval. Compare your prediction and interval to the actual total. If you don't think you can get an approximate interval, then explain why not. (You ought to be able to get a prediction of the total, however.)
Solutions
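One way to think about the interval for a yearly total: if the 12 monthly forecast errors were roughly independent, the variance of the total would be the sum of the individual variances. The sketch below illustrates this arithmetic in Python; the forecasts and standard errors are made-up numbers, not values from the airline data.

```python
import math

# Hypothetical monthly point forecasts and their standard errors of
# prediction (illustrative numbers only, not from the airline data).
forecasts = [420.0, 400.0, 455.0, 460.0, 475.0, 535.0,
             600.0, 610.0, 510.0, 460.0, 395.0, 435.0]
std_errors = [20.0] * 12

total = sum(forecasts)

# Assuming (approximately) independent forecast errors, the variance of
# the total is the sum of the variances, so an approximate 95% interval
# is total +/- 2 * sqrt(sum of SE^2). Positive autocorrelation in the
# errors would make the true interval wider than this.
half_width = 2 * math.sqrt(sum(se ** 2 for se in std_errors))

print(f"total forecast: {total:.0f}")
print(f"approx 95% interval: ({total - half_width:.0f}, {total + half_width:.0f})")
```

Note the caveat in the comments: if the monthly errors are positively autocorrelated, this interval is too narrow, which is part of what the exercise asks you to think about.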

5. No assignment for this week. Enjoy the holiday.

6. Due ... Thursday, April 23 (at class) From the Bowerman textbook (Data is in Table 9-7, Table 9-9, Table 9-10, and Table 8-1 ):
9.6 Answer text questions, but refer to JMP output that you generate.
10.1-10.6 Answer text questions, but use JMP output that you generate. (JMP may not give the same estimates as shown in the text!)
10.11, 10.12 Use the text figures or JMP to prepare answers.
10.13 Part (a) is more important. For (b), think about the differences between ARIMA models and exponential smoothing. Nothing too deep here, but give it some thought.
Solutions

Data sets
These data sets are in JMP format. If you want to use something other than JMP for the computing, simply open the file in JMP, then use the menu command File > Export ... to save the data file in, for example, a CSV file.
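Once exported, the CSV file can be read by almost any package. As a minimal sketch, here is how you might read such an export with Python's standard csv module. The data are inlined with io.StringIO so the example is self-contained; with a real export you would open the file you saved instead (the column names here are hypothetical).

```python
import csv
import io

# Stand-in for an exported file; with a real export, replace io.StringIO
# with open("your_export.csv") and the columns JMP actually wrote.
exported = io.StringIO(
    "Date,Passengers\n"
    "Jan-1949,112\n"
    "Feb-1949,118\n"
)

reader = csv.DictReader(exported)
rows = [(row["Date"], int(row["Passengers"])) for row in reader]
print(rows)  # [('Jan-1949', 112), ('Feb-1949', 118)]
```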