Class 1 Stat 604 Fall 1997
Introduction
The three pre-term stat courses
Stat 603 - 11 classes
Stat 604 - 6 classes. An accelerated 603
Stat 608 - 6 intense classes. Waiver exam preparation.
Equivalent to Stat 621
My assumptions about this class (you!)
Objectives of Stat604
Review intro stat ideas
Change emphasis toward interpretation and practical application
Learn the software
Get ready for Stat 621
Enjoy it
Quick review of the syllabus
Course material
Grading/assessment
TAs and office hours
Evaluations
Computing
The role of questions in class
Good questions
Insights
Clarifications
Tie backs/big picture
Bad questions
Missed last class
Flex muscles
Metaphor
the spoken language, not the grammar.
Course overviews
603/4
Understanding/measuring variability. Why it is important. Factor variability/uncertainty into the decision-making process
Who cares? The Basel Accord
Risky investments need higher reserves
Need to measure risk, e.g. J.P. Morgan
Risk == volatility of returns
Volatility == variability
Stat 621
Regression/statistical modeling/forecasting/explaining variability
Models
Stock market
Market share
Real estate prices
What's different? Our models explicitly incorporate variability; we don't just get to model the process, we get to say how good the model is. Value added - the idea of precision.
What to get out of the course
Using software in internship - the pain is worth it
Perform statistical analysis - hands on
No stats background - not math based
Big project, THE learning experience
PRACTICAL APPLIED MODERN STATISTICS
Success in the course
Critical evaluation of an analysis
Mastery of stat package
Confidence to perform analysis/use tools
Presentation and communication of results
Today's material.
- Key concept: Summarizing data
- Key tool: the Empirical rule
- Key graphics: Histogram, boxplot and normal quantile
Basic statistical graphics and summaries.
Graphics
| Graphic | Use |
| --- | --- |
| Box plot | Identification of outliers. |
| Histogram | Shape of data. Skewness. Outliers. |
| Normal quantile plot | Diagnostic for normality. |
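
A box plot's outlier identification can be sketched numerically. The notes do not say which rule the software uses; the sketch below assumes the common convention of flagging points more than 1.5 IQRs beyond the quartiles. The function names and the data are hypothetical, chosen only for illustration.

```python
import statistics

def boxplot_fences(data, k=1.5):
    """Return (lower_fence, upper_fence) under the assumed k * IQR convention."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def flag_outliers(data, k=1.5):
    """Return the observations lying outside the fences."""
    lower, upper = boxplot_fences(data, k)
    return [x for x in data if x < lower or x > upper]

# Hypothetical data: ten ordinary values and one extreme value.
values = [12, 14, 15, 15, 16, 17, 18, 18, 19, 21, 60]

print(boxplot_fences(values))
print(flag_outliers(values))
```

Different packages compute quartiles slightly differently, so the fences may not match a given package's box plot exactly.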
Summary measures
| | CENTER | SPREAD |
| --- | --- | --- |
| Sensitive to outliers | Mean | Variance/SD |
| Robust | Median | IQR |
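
To see the difference between sensitive and robust measures, here is a minimal sketch that computes all four summaries on a small data set, then again after one value is replaced by a gross error. The data and the `summarize` helper are hypothetical, and the quartile convention (`method="inclusive"`) is an assumption.

```python
import statistics

def summarize(data):
    """Center and spread, in both sensitive and robust versions."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "mean": statistics.mean(data),
        "sd": statistics.stdev(data),
        "median": statistics.median(data),
        "iqr": q3 - q1,
    }

# Hypothetical data; the second set replaces the last value (56) with 560.
clean = [48, 50, 51, 52, 53, 54, 56]
contaminated = clean[:-1] + [560]

for label, data in (("clean", clean), ("contaminated", contaminated)):
    print(label, summarize(data))
```

In this example a single bad value moves the mean and inflates the SD, while the median and IQR do not change at all.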
Definitions and notation.
Mean = average. True $\mu$, estimated $\bar{x}$.
Median = order the data, the one in the middle. No standard notation.
Variance = average squared distance from the mean. True $\sigma^2$, estimated $s^2$.
S.D. = $\sqrt{\text{variance}}$. True $\sigma$, estimated $s$.
IQR = 75th percentile - 25th percentile. No standard notation.
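
The definitions above can be written out by hand as a short sketch. One assumption to flag: the estimated variance below divides by n - 1, the usual convention for a sample, rather than by n as "average" literally suggests. The quartile rule used for the IQR is also an assumption, since conventions differ between packages. The data are hypothetical.

```python
import math
import statistics

def mean(xs):
    """Average of the observations."""
    return sum(xs) / len(xs)

def median(xs):
    """Order the data and take the middle value (average the two middle values if n is even)."""
    s = sorted(xs)
    n = len(s)
    mid = n // 2
    return s[mid] if n % 2 else (s[mid - 1] + s[mid]) / 2

def sample_variance(xs):
    """Squared distances from the mean, summed and divided by n - 1 (assumed convention)."""
    xbar = mean(xs)
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)

def sample_sd(xs):
    """Square root of the variance."""
    return math.sqrt(sample_variance(xs))

def iqr(xs):
    """75th percentile minus 25th percentile (one of several quartile conventions)."""
    q1, _, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return q3 - q1

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations
print(mean(data), median(data), sample_variance(data), sample_sd(data), iqr(data))
```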
Shapes of distributions/histograms.
Symmetric bell shaped; mean approximately equal to median
Right skew; mean greater than median
Left skew; mean smaller than median
- Symmetric bell shaped - good news.
- Skewness - watch out!
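
A quick way to see the link between shape and the mean/median comparison is to compute both on small data sets of each shape. This is only a sketch: the three data sets and the `mean_vs_median` helper are hypothetical, and the comparison is a rough hint about skewness, not a formal test.

```python
import statistics

def mean_vs_median(data):
    """Return (mean, median, hint) where hint describes which one is larger."""
    m = statistics.mean(data)
    med = statistics.median(data)
    if m > med:
        hint = "mean > median: suggests right skew"
    elif m < med:
        hint = "mean < median: suggests left skew"
    else:
        hint = "mean = median: consistent with symmetry"
    return m, med, hint

# Hypothetical data sets, one per shape.
symmetric = [1, 2, 3, 4, 5, 6, 7]
right_skew = [1, 2, 2, 3, 3, 4, 20]
left_skew = [-20, -4, -3, -3, -2, -2, -1]

for name, data in (("symmetric", symmetric), ("right skew", right_skew), ("left skew", left_skew)):
    print(name, mean_vs_median(data))
```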
If the data is bell shaped and symmetric, then say it is approximately normal.
Key: the mean and standard deviation summarize the data efficiently in these circumstances.
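
Whether data is "approximately normal" is what the normal quantile plot checks. As a rough sketch of what goes into such a plot, the code below pairs each sorted observation with the standard normal quantile for its rank; points close to a straight line suggest approximate normality. The plotting position (i - 0.5)/n is one common convention and is an assumption here, as are the function name and the data.

```python
from statistics import NormalDist

def normal_quantile_points(data):
    """Pair each sorted observation with a standard normal quantile for its rank."""
    n = len(data)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Assumed plotting position: (i - 0.5) / n for the i-th smallest value.
    return [
        (std_normal.inv_cdf((i - 0.5) / n), x)
        for i, x in enumerate(sorted(data), start=1)
    ]

# Hypothetical observations.
observations = [5.1, 4.8, 5.6, 4.9, 5.0, 5.3, 4.6, 5.2, 5.4]

for z, x in normal_quantile_points(observations):
    print(f"{z:6.2f}  {x}")
```

Plotting the second column against the first with any graphics package gives the normal quantile plot.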
The EMPIRICAL RULE applies when data is approximately normal.
Rule of thumb for normal data - it ties together the mean and standard deviation ($\bar{x}$ and $s$) into a rule that establishes where most of the data should lie. If an observation falls outside this range then it's an "atypical" observation; in J.P. Morgan's terminology, an adverse market move.
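
The notes do not spell out the percentages here; the usual statement of the empirical rule is that roughly 68%, 95% and 99.7% of normal data lie within 1, 2 and 3 standard deviations of the mean. The sketch below checks those figures against the normal distribution and against simulated data. The simulated sample is not course data, and the function names are hypothetical.

```python
import random
from statistics import NormalDist, mean, stdev

def normal_coverage(k):
    """Theoretical share of a normal distribution within k SDs of its mean."""
    z = NormalDist()
    return z.cdf(k) - z.cdf(-k)

def empirical_coverage(data, k):
    """Observed share of the data within k sample SDs of the sample mean."""
    xbar, s = mean(data), stdev(data)
    inside = sum(1 for x in data if abs(x - xbar) <= k * s)
    return inside / len(data)

# Simulated approximately normal data (seed fixed so the run is repeatable).
random.seed(604)
sample = [random.gauss(0, 1) for _ in range(10_000)]

for k in (1, 2, 3):
    print(k, round(normal_coverage(k), 4), round(empirical_coverage(sample, k), 4))
```

An observation more than two or three standard deviations from the mean is then a natural candidate for an atypical observation.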
Review
Summary measures
Robust vs. Sensitive
Empirical rule for mound shaped and symmetric data
Ties together mean and s.d. to help define an "unusual event"
Disparate data may be approx normal, e.g. GMAT and GM
But not ALL data is normal, e.g. Eisner's compensation.
Richard Waterman
Mon Aug 4 21:18:00 EDT 1997