Class 1 Stat 604 Fall 1997

Introduction
The three pre-term stat courses
Stat 603 - 11 classes
Stat 604 - 6 classes. An accelerated 603
Stat 608 - 6 intense classes. Waiver exam preparation.
Equivalent to Stat 621
My assumptions about this class (you!)

Objectives of Stat 604
Review intro stat ideas
Change emphasis toward interpretation and practical application
Learn the software
Get ready for Stat 621
Enjoy it

Quick review of the syllabus
Course material
Grading/assessment
TA's and office hours
Evaluations
Computing
The role of questions in class
Good questions
Insights
Clarifications
Tie backs/big picture
Bad questions
Missed last class
Flex muscles

Metaphor: the spoken language, not the grammar.
Course overviews

603/4
Understanding/measuring variability. Why it is important. Factor variability/uncertainty into the decision-making process.
Who cares? The Basel Accord
Risky investments need higher reserves
Need to measure risk, e.g. J.P. Morgan
Risk == volatility of returns
Volatility == variability

Stat 621
Regression/statistical modeling/forecasting/explaining variability
Models
Stock market
Market share
Real estate prices
What's different? Our models explicitly incorporate variability; we don't just get to model the process, we get to say how good the model is. Value added - the idea of precision.

What to get out of the course
Using software in internship - the pain is worth it
Perform statistical analysis - hands on
No stats background - not math based
Big project, THE learning experience
PRACTICAL APPLIED MODERN STATISTICS

Success in the course
Critical evaluation of an analysis
Mastery of stat package
Confidence to perform analysis/use tools
Presentation and communication of results
Today's material.
- Key concept: Summarizing data
- Key tool: the Empirical rule
- Key graphics: Histogram, boxplot and normal quantile plot
Basic statistical graphics and summaries.
Graphics
| Box plot | Identification of outliers. |
| Histogram | Shape of data. Skewness. Outliers. |
| Normal quantile plot | Diagnostic for normality. |
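A minimal sketch of what two of these graphics compute, for readers who want to see the mechanics. The 1.5 x IQR cutoff for flagging boxplot outliers and the (i - 0.5)/n plotting positions are common conventions assumed here; the course software may use slightly different ones. The function names and the sample values are made up for illustration.

```python
import statistics

def boxplot_outliers(data):
    """Points beyond 1.5 * IQR from the quartiles (an assumed, common boxplot convention)."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low or x > high]

def normal_quantile_points(data):
    """(theoretical normal quantile, observed value) pairs for a normal quantile plot."""
    n = len(data)
    z = statistics.NormalDist()  # standard normal distribution
    # Plotting positions (i - 0.5) / n are one common choice, assumed here.
    return [(z.inv_cdf((i - 0.5) / n), x)
            for i, x in enumerate(sorted(data), start=1)]

sample = [10, 12, 13, 14, 15, 16, 17, 18, 200]  # made-up values with one extreme point

print("flagged as outliers:", boxplot_outliers(sample))
for q, x in normal_quantile_points(sample):
    print(f"{q:6.2f}  {x}")
```

If the data were approximately normal, the printed pairs would fall close to a straight line; a point far off the line, like the extreme value here, is what the plot is meant to expose.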
Summary measures
| | CENTER | SPREAD |
| Sensitive to outliers | Mean | Variance/SD |
| Robust | Median | IQR |
Definitions and notation.
Mean = average. True $\mu$, estimated $\bar{x}$.
Median = order the data, the one in the middle. No standard notation.
Variance = average squared distance from the mean. True $\sigma^2$, estimated $s^2$.
S.D. = $\sqrt{\text{Variance}}$. True $\sigma$, estimated $s$.
IQR = 75th percentile - 25th percentile. No standard notation.
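A small sketch of these definitions using Python's standard `statistics` module, which also shows why the mean and SD are called sensitive and the median and IQR robust. The data values are made up, and two details are assumptions rather than anything stated in the notes: the estimated variance divides by n - 1, and the quartiles use the module's default interpolation method.

```python
import statistics

def summarize(data):
    """Return the summary measures defined above for a list of numbers."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # 25th, 50th, 75th percentiles
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "variance": statistics.variance(data),  # sample variance s^2 (divides by n - 1)
        "sd": statistics.stdev(data),            # s = square root of s^2
        "iqr": q3 - q1,
    }

clean = [10, 12, 13, 14, 15, 16, 17, 18, 20]  # made-up values
with_outlier = clean[:-1] + [200]              # same data, largest value replaced

for label, data in (("clean", clean), ("with outlier", with_outlier)):
    print(label, summarize(data))
```

Replacing a single value moves the mean and inflates the SD, while the median and IQR are unchanged for this data.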
Shapes of distributions/histograms.
Symmetric bell shaped; mean $\approx$ median
Right skew; mean greater than median
Left skew; mean smaller than median
- Symmetric bell shaped - good news.
- Skewness - watch out!
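The link between shape and the mean/median comparison can be checked on small made-up data sets. This is only an illustrative sketch: the helper name `mean_minus_median` is hypothetical, and comparing the two numbers is a rough indicator of skewness, not a substitute for looking at the histogram.

```python
import statistics

def mean_minus_median(data):
    """Positive suggests right skew, negative suggests left skew, near zero suggests symmetry."""
    return statistics.mean(data) - statistics.median(data)

right_skewed = [1, 2, 2, 3, 3, 3, 4, 4, 5, 30]   # long tail to the right
left_skewed = [31 - x for x in right_skewed]      # mirror image: long tail to the left
symmetric = [1, 2, 3, 4, 5, 6, 7]

for label, data in (("right skew", right_skewed),
                    ("left skew", left_skewed),
                    ("symmetric", symmetric)):
    print(f"{label:10s} mean - median = {mean_minus_median(data):+.2f}")
```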
If the data are bell shaped and symmetric then say they are approximately normal.
Key: the mean and standard deviation summarize the data efficiently in
these circumstances.
The EMPIRICAL RULE applies when data is approximately normal.
Rule of thumb for normal data - it ties together the mean and standard
deviation ($\bar{x}$ and $s$) into a rule that establishes where most
of the data should lie. If an observation is outside this range then it's an
"atypical" observation; in J.P. Morgan's terminology, an adverse market move.
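For reference, the usual statement of the empirical rule is that roughly 68%, 95% and 99.7% of approximately normal data lie within 1, 2 and 3 standard deviations of the mean. The sketch below compares those theoretical figures with simulated data. The simulated "returns", the seed, and the choice of 2 standard deviations as the default cutoff for an atypical observation are all assumptions made for illustration, not values from the course.

```python
import random
import statistics

def normal_coverage(k):
    """Theoretical share of a normal distribution within k SDs of its mean."""
    z = statistics.NormalDist()
    return z.cdf(k) - z.cdf(-k)

def fraction_within(data, k):
    """Observed share of the data within k sample SDs of the sample mean."""
    xbar = statistics.mean(data)
    s = statistics.stdev(data)
    return sum(abs(x - xbar) <= k * s for x in data) / len(data)

def is_atypical(x, xbar, s, k=2):
    """Flag an observation more than k SDs from the mean (k = 2 is an assumed cutoff)."""
    return abs(x - xbar) > k * s

random.seed(604)  # fixed seed so the simulation is repeatable
returns = [random.gauss(0.0, 1.5) for _ in range(10_000)]  # simulated, not real returns

for k in (1, 2, 3):
    print(f"within {k} SD: theory {normal_coverage(k):.4f}, "
          f"simulated {fraction_within(returns, k):.4f}")
```

On skewed data the observed shares can differ noticeably from the theoretical ones, which is why the rule is tied to approximate normality.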
Review
Summary measures
Robust vs. Sensitive
Empirical rule for mound shaped and symmetric data
Ties together mean and s.d. to help define an "unusual event"
Disparate data may be approx normal, e.g. GMAT and GM
But not ALL data is normal, e.g. Eisner's compensation.
Richard Waterman
Mon Aug 4 21:18:00 EDT 1997