*The three pre-term stat courses
*Stat 603 - 11 classes
*Stat 604 - 6 classes. An accelerated 603
*Stat 608 - 6 intense classes. Waiver exam preparation. Equivalent to Stat 621

*My assumptions about this class (you!)

Objectives of Stat 604

*Review intro stat ideas
*Change emphasis toward interpretation and practical application
*Learn the software
*Get ready for Stat 621
*Enjoy it


Quick review of the syllabus

*Course material
*TAs and office hours
*The role of questions in class
*Good questions
  *Tie backs/big picture

*Bad questions
  *Missed last class
  *Flex muscles



the spoken language, not the grammar.

Course overviews



Understanding/measuring variability. Why it is important. Factor variability/uncertainty into the decision-making process
*Who cares? The Basel Accord
*Risky investments need higher reserves
*Need to measure risk, e.g. J.P. Morgan
*Risk == volatility of returns
*Volatility == variability


Stat 621

Regression/statistical modeling/forecasting/explaining variability
*Stock market
*Market share
*Real estate prices

*What's different? Our models explicitly incorporate variability; we don't just get to model the process, we get to say how good the model is. Value added: the idea of precision.

What to get out of the course

*Using software in internship - the pain is worth it
*Perform statistical analysis - hands on
*No stats background - not math based
*Big project, THE learning experience


Success in the course

*Critical evaluation of an analysis
*Mastery of stat package
*Confidence to perform analysis/use tools
*Presentation and communication of results

Today's material.

Basic statistical graphics and summaries.


Box plot: identification of outliers.
Histogram: shape of data, skewness, outliers.
Normal quantile plot: diagnostic for normality.
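
How a box plot flags outliers can be sketched in a few lines. This is a minimal sketch, assuming the common convention of fences at 1.5 IQRs beyond the quartiles; the course software may use a different quartile method or cutoff. The function name `boxplot_outliers` and the data values are made up for illustration.

```python
import statistics

def boxplot_outliers(data, k=1.5):
    """Return the points outside the fences [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is the usual box plot convention (an assumption here).
    """
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < lower or x > upper]

# Hypothetical data: nine ordinary values and one extreme one.
returns = [-1.2, -0.5, -0.3, 0.0, 0.1, 0.4, 0.6, 0.9, 1.1, 9.5]
print(boxplot_outliers(returns))
```

Only the extreme value falls outside the fences here; the other nine points sit inside the whiskers.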

Summary measures

Sensitive to outliers: mean, variance/SD
Robust: median, IQR
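
The sensitive/robust distinction can be seen by changing a single data point. The sketch below uses made-up numbers and a hypothetical helper `summarize`; quartiles follow Python's default method, which may differ slightly from the course software.

```python
import statistics

def summarize(data):
    """Return the four summary measures for a list of numbers."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    return {
        "mean": statistics.mean(data),
        "sd": statistics.stdev(data),
        "median": statistics.median(data),
        "iqr": q3 - q1,
    }

clean = [10, 11, 12, 13, 14, 15, 16]
with_outlier = clean[:-1] + [160]  # replace the largest value with an outlier

before = summarize(clean)
after = summarize(with_outlier)
for name in before:
    print(name, before[name], after[name])
```

In this example the mean and SD move a long way, while the median and IQR do not change at all.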

Definitions and notation.

*Mean = average. True $\mu$, estimated $\bar{x}$
*Median = order the data, the one in the middle. No standard notation.
*Variance = average squared distance from the mean. True $\sigma^2$, estimated $s^2$
*S.D. = $\sqrt{\mathrm{Variance}}$. True $\sigma$, estimated $s$
*IQR = 75 pctile - 25 pctile. No standard notation.
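
The definitions above can be worked through by hand on a small data set. This is a sketch with made-up numbers. It assumes the usual sample estimate of the variance, which divides by n - 1 rather than n, so "average" squared distance is only approximate for small samples.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values
n = len(data)

xbar = sum(data) / n                             # estimated mean
squared_dist = [(x - xbar) ** 2 for x in data]   # squared distances from the mean
s2 = sum(squared_dist) / (n - 1)                 # estimated variance (n - 1 divisor)
s = math.sqrt(s2)                                # estimated S.D.
median = statistics.median(data)                 # middle of the ordered data

print(xbar, median, s2, s)
```

The hand computation agrees with the library routines `statistics.variance` and `statistics.stdev`, which use the same n - 1 divisor.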

Shapes of distributions/histograms.

*Symmetric bell shaped; mean $\approx$ median
*Right skew; mean greater than median
*Left skew; mean smaller than median
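
The link between skewness and the mean/median comparison can be checked on small examples. The three data sets below are made up for illustration, and `mean_vs_median` is a hypothetical helper.

```python
import statistics

def mean_vs_median(data):
    """Return (mean, median) so the two can be compared."""
    return statistics.mean(data), statistics.median(data)

symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
right_skewed = [1, 2, 2, 3, 3, 3, 4, 4, 5, 30]   # long tail to the right
left_skewed = [31 - x for x in right_skewed]     # mirror image: long tail to the left

for name, d in [("symmetric", symmetric),
                ("right skew", right_skewed),
                ("left skew", left_skewed)]:
    m, med = mean_vs_median(d)
    print(name, "mean:", m, "median:", med)
```

The long tail drags the mean toward it, while the median stays near the bulk of the data.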

If data bell shaped and symmetric then say approximately normal.

Key: the mean and standard deviation summarize the data efficiently in these circumstances.

The EMPIRICAL RULE applies when data is approximately normal.

Rule of thumb for normal data - it ties together the mean and standard deviation ($\mu$ and $\sigma$) into a rule that establishes where most of the data should lie. If the data is outside this range then it's an ``atypical'' observation; in J.P. Morgan's terminology an adverse market move.
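
The usual statement of the empirical rule is that about 68%, 95% and 99.7% of normal data lie within 1, 2 and 3 standard deviations of the mean. The sketch below recovers those figures from the normal distribution and then flags observations more than 2 SDs from the mean. The 2 SD cutoff, the function names and the data are assumptions for illustration, not taken from the notes.

```python
from statistics import NormalDist, mean, stdev

def coverage(k):
    """Probability that a normal observation lies within k SDs of its mean."""
    z = NormalDist()  # standard normal: mean 0, SD 1
    return z.cdf(k) - z.cdf(-k)

def atypical(data, k=2):
    """Return observations more than k estimated SDs from the estimated mean."""
    m, s = mean(data), stdev(data)
    return [x for x in data if abs(x - m) > k * s]

for k in (1, 2, 3):
    print(k, round(coverage(k), 4))

# Hypothetical daily market moves (in percent), with one large drop.
daily_moves = [0.2, -0.1, 0.4, -0.3, 0.1, 0.0, -0.2, 0.3, -0.4, 0.1, -3.0]
print(atypical(daily_moves))
```

Only the large drop lies outside the 2 SD range here, which is the kind of observation the notes call atypical.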


*Summary measures
*Robust vs. Sensitive
*Empirical rule for mound shaped and symmetric data
*Ties together mean and s.d. to help define an ``unusual event''
*Disparate data may be approx normal, e.g. GMAT and GM
*But not ALL data is normal, e.g. Eisner's compensation.

