Take note: This course has only one goal -- to prepare first-year statistics Ph.D. students for research.
This is not an applied statistics course.
General honor code: You may discuss the problems with each
other in general terms, but you must write your own solution. All
sources, including friends and colleagues, must be cited. It is
important to get used to a stringent code of conduct in scientific
writing. On the other hand, use common sense and attribute where
honesty requires it. Two points worth special mention:
*** If you received an extension for a homework, do not consult
posted solutions.
*** A grave offense would be consulting solutions of homeworks from
previous years.
Unless instructed otherwise, homeworks should be e-mailed
as attachments to stat541.at.wharton[at-sign]gmail.com.
The format should be .txt or .pdf or .doc depending on
the assignment.
Your checked and graded solutions are returned as e-mail attachments.
Search for '#PW' (Pengyuan Wang, TA) to find comments.
A score such as 8/10 at the end means '8 out of 10 points'.
A deduction of 2 points does not mean you got two questions wrong; it is
only a relative measure of how much below optimal your solutions are.
IMPORTANT: If a function in the class notes does not work or is not found in your R session, check whether the function is in one of the R code files below. If so, download and read the file into R one more time, even if you thought you had done so earlier. I allow myself to update the code all the time.
TA: Pengyuan Wang
Email: stat541.at.wharton[at-sign]gmail.com (urgent: buja.at.wharton[at-sign]gmail.com)
Office hours: Monday after class 4:30pm, and by appointment.
Office: JMHH 471
Class Room: JMHH G86
Email: pengyuan@wharton.upenn.edu
Office hours: Tuesday 9-11am, and by appointment
Accordingly, the demands on conceptual thinking and quick uptake
will be considerable as the course progresses.
What this course is not: not an applied statistics course,
not an R course, and not a service course to other departments.
For a graduate-level applied statistics course, see Paul
Rosenbaum's 500-level statistical methods.
The weekly homeworks will be the heart of what you retain from
this course.
There will be no midterm or final exams.
Note: This is not an R class. R will not even
be taught, in light of the computational literacy of this year's
statistics Ph.D. students.
basis changes and associated coordinate changes, linear maps,
inner product spaces, orthogonal projections,
eigendecompositions
thinking in random variables, limit theorems
statistical tests, confidence intervals, linear models,
estimation, sufficiency
Examples of high-level languages: R, Splus, Matlab, Perl,
Python, Visual Basic
Examples of low-level languages: C, C++, Fortran, Java
Exclusive knowledge of SAS is outright detrimental for learning R,
because SAS has a very different computational model.
This is the above required book in a free online version. While
online is handy for searching, you should still have the book
version for actual reading. It costs only about $20.
As we go along, special topics books will be recommended.
Strangely, the most fundamental material is no longer in the recent edition:
"Linear Transformations, Matrices, and Change of Basis."
In older editions this used to be tucked away in the appendix.
For this reason, the material is now included in Homework 2:
You get to derive it yourself by following instructions.
More recent developments can be followed in TED talks: several
by Rosling,
another by McCandless.