Stat 900: Advanced Probability (J. Michael Steele)

Short Stories? Well, in a way ...

This page collects reminders of some pleasing things to know that may not be in the active memory of the typical statistics/mathematics/engineering graduate student. In addition, these things are self-contained and easily explained in as little as five minutes. From time to time, I'll pluck an item from the list if we have a spare moment.

The Schröder-Bernstein Theorem

This theorem asserts that if there is an injection from the set A to the set B and an injection from the set B to the set A, then there is a bijection between the sets. Certainly this is trivial for finite sets, but one might even fear that it could be false for very large sets. I could have been quite stuck trying to prove this, but I recently learned an easy proof. Despite big differences, there are still some analogies between the SB theorem and the Marriage Lemma (or bipartite matching theorem).
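
The text does not say which proof is meant. One standard argument traces each element of A back through its chain of "ancestors" (apply the inverse of g, then the inverse of f, and so on) and uses f when the chain stops in A (or never stops) and the inverse of g when it stops in B. A minimal sketch of that construction follows, assuming A = B = the nonnegative integers and two illustrative injections f(a) = 2a and g(b) = 2b + 1; all function names here are hypothetical.

```python
def f(a):
    """Illustrative injection A -> B (an assumption, not from the text)."""
    return 2 * a

def g(b):
    """Illustrative injection B -> A (an assumption, not from the text)."""
    return 2 * b + 1

def f_inv(b):
    """Return the a with f(a) == b, or None if b is not in the image of f."""
    return b // 2 if b % 2 == 0 else None

def g_inv(a):
    """Return the b with g(b) == a, or None if a is not in the image of g."""
    return (a - 1) // 2 if a % 2 == 1 else None

def chain_stops_in_A(a):
    """Trace the ancestors of a; report whether the chain stops on the A side.

    For these particular f and g every chain stops after finitely many steps.
    In the general proof, elements with endless chains are also sent through f.
    """
    in_A, x = True, a
    while True:
        parent = g_inv(x) if in_A else f_inv(x)
        if parent is None:
            return in_A
        in_A, x = not in_A, parent

def h(a):
    """The bijection A -> B assembled from the two injections."""
    return f(a) if chain_stops_in_A(a) else g_inv(a)

print([h(a) for a in range(8)])
```

On any initial segment one can check that h never repeats a value and that every small b is hit, which is the finite shadow of the bijection claim.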

An Existence Trick

Suppose f is a differentiable function from the reals to the reals such that f(x)/|x| goes to infinity as |x| goes to infinity. Prove that f' is a surjection. That is, show that for all y, there is an x depending on y such that f'(x)=y.

This may seem a little mysterious, but examples like x^2 start to demystify it. Do we have ANY tools to help with the proof? Think about the Fermat principle!
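
One way to unpack the hint is sketched below. The auxiliary function g and the minimizer x^* are names introduced here for illustration, and this is only one possible route to the result.

```latex
\begin{align*}
&\text{Fix } y \in \mathbb{R} \text{ and set } g(x) = f(x) - yx. \\
&\frac{g(x)}{|x|} = \frac{f(x)}{|x|} - y\,\frac{x}{|x|}
   \;\ge\; \frac{f(x)}{|x|} - |y| \;\longrightarrow\; \infty
   \quad \text{as } |x| \to \infty, \\
&\text{so } g(x) \to \infty \text{ as } |x| \to \infty, \text{ and the continuous function } g
   \text{ attains its minimum at some } x^*. \\
&\text{By Fermat's principle, } g'(x^*) = 0, \text{ that is, } f'(x^*) = y.
\end{align*}
```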

Sum of Squares Tools

There is a MATLAB tool that will take a polynomial (say, for example, one that you suspect is nonnegative for all x, y, z ...), and then write it as a sum of squares of polynomials (if it can). This is called an SOS representation, and it is useful to know there are algorithms for such things. Examples of applications and extensions can be found in these talk slides.
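
To make the idea concrete, here is a small sketch of what an SOS certificate looks like. It does not search for the certificate (tools of this kind typically do that by semidefinite programming). It only verifies one that is given. The polynomial, the Gram matrix Q, and all function names are illustrative choices, not taken from the tool or the slides.

```python
import numpy as np

def p(x, y):
    """Example polynomial: 2x^4 + 2x^3 y - x^2 y^2 + 5y^4."""
    return 2 * x**4 + 2 * x**3 * y - x**2 * y**2 + 5 * y**4

# Gram matrix for the monomial vector z = (x^2, y^2, xy): p = z^T Q z.
Q = np.array([[2.0, -3.0, 1.0],
              [-3.0, 5.0, 0.0],
              [1.0, 0.0, 5.0]])

def gram_value(x, y):
    """Evaluate z^T Q z with z = (x^2, y^2, xy)."""
    z = np.array([x**2, y**2, x * y])
    return z @ Q @ z

def sos_value(x, y):
    """Explicit sum of two squares obtained by factoring Q = L^T L."""
    return 0.5 * (2 * x**2 - 3 * y**2 + x * y) ** 2 + 0.5 * (y**2 + 3 * x * y) ** 2

# Q is positive semidefinite (up to rounding), which is what certifies p >= 0.
min_eig = np.linalg.eigvalsh(Q).min()
print("smallest eigenvalue of Q:", min_eig)
print(p(1.0, 2.0), gram_value(1.0, 2.0), sos_value(1.0, 2.0))
```

The point of the Gram-matrix form is that "p is a sum of squares" becomes "some positive semidefinite Q reproduces the coefficients of p", which is a convex feasibility problem.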

The "Transformation Method" for Proving Inequalities

One pleasing way to prove an inequality is to exhibit a transformation that preserves one side and that transforms the other side in a systematic, understandable way. We can then hope to prove our inequality by understanding the dynamics of the transformation. This is very simply illustrated by the AM-GM inequality, but the same philosophy can be used to prove more sophisticated inequalities such as the isoperimetric inequality or Pólya's eigenvalue inequality.

Incidentally, the proof of the AM-GM inequality by the transformation method is vaguely analogous to the simplex method, at least insofar as the transformation increases an objective function at each step.
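
A minimal sketch of one such transformation for AM-GM, under the assumption that the inputs are positive: replace the smallest entry a and the largest entry b by the mean m and a + b - m. The sum (hence the arithmetic mean) is preserved, while the product grows by the factor difference m(a + b - m) - ab = (m - a)(b - m) > 0. The function name and the use of exact fractions are choices made here for illustration.

```python
from fractions import Fraction
from math import prod

def amgm_steps(values):
    """Run the smoothing transformation; return the mean and the product history."""
    xs = [Fraction(v) for v in values]
    mean = sum(xs) / len(xs)
    history = [prod(xs)]
    while any(x != mean for x in xs):
        i = xs.index(min(xs))          # smallest entry, strictly below the mean
        j = xs.index(max(xs))          # largest entry, strictly above the mean
        a, b = xs[i], xs[j]
        xs[i], xs[j] = mean, a + b - mean   # sum unchanged, product increases
        history.append(prod(xs))
    return mean, history

mean, history = amgm_steps([1, 2, 3, 6])
print(mean, history)   # mean 3, products 36 -> 72 -> 81
```

Each step makes at least one more entry equal to the mean, so the process stops after at most n - 1 steps with product m^n, which gives x_1 x_2 ... x_n <= m^n.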

The rearrangement inequality is also nicely proved by the transformation method. It is the next logical step after the AM-GM inequality.
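
For the rearrangement inequality, one natural transformation (an assumption about what is intended here) is to swap an adjacent out-of-order pair in the second sequence. With a_i <= a_{i+1} and b_i > b_{i+1}, the swap changes the paired sum by (a_{i+1} - a_i)(b_i - b_{i+1}) >= 0, so the sum never decreases while the multiset of b values stays fixed. A short sketch, with hypothetical names:

```python
def rearrangement_steps(a, b):
    """Sort b by adjacent swaps against sorted a; record the paired sum after each swap."""
    a = sorted(a)
    b = list(b)

    def paired_sum():
        return sum(x * y for x, y in zip(a, b))

    sums = [paired_sum()]
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(b) - 1):
            if b[i] > b[i + 1]:
                b[i], b[i + 1] = b[i + 1], b[i]   # fix one inversion
                sums.append(paired_sum())
                swapped = True
    return b, sums

final_b, sums = rearrangement_steps([1, 2, 3], [3, 1, 2])
print(final_b, sums)   # [1, 2, 3] and sums 11 -> 13 -> 14
```

The process ends when b is sorted the same way as a, so the similarly ordered pairing gives the largest sum.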

Gordan Alternative via the Approximate Fermat's Principle

Gordan's theorem tells us that if we have a set S of N vectors, then exactly one of the following two alternatives holds:

There is a vector x such that x has a strictly negative inner product with each v in S, or
There exists a probability distribution on S with expected value zero.

This can be proved using a separation argument or by Farkas's Lemma, but it is also interesting to give a "variational proof". In particular, we can get it by applying the approximate Fermat principle to a version of the "soft max" function --- which has a remarkably useful derivative.
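
A sketch of how the variational proof can go, assuming the "soft max" in question is the log-sum-exp function and writing S = {v_1, ..., v_N}. The notation f, p_i, and x_k is introduced here for illustration.

```latex
\begin{align*}
f(x) &= \log \sum_{i=1}^{N} e^{\langle v_i, x \rangle}, &
\nabla f(x) &= \sum_{i=1}^{N} p_i(x)\, v_i, &
p_i(x) &= \frac{e^{\langle v_i, x \rangle}}{\sum_{j=1}^{N} e^{\langle v_j, x \rangle}}.
\end{align*}
% The weights p_i(x) are positive and sum to one, so the gradient of f at x is the
% expected value of a probability distribution on S.
\begin{align*}
&\text{If the first alternative fails, then } \max_i \langle v_i, x \rangle \ge 0 \text{ for every } x,
  \text{ so } f(x) \ge \max_i \langle v_i, x \rangle \ge 0. \\
&\text{Since } f \text{ is differentiable and bounded below, the approximate Fermat principle gives }
  x_k \text{ with } \nabla f(x_k) \to 0. \\
&\text{By compactness of the simplex, a subsequence of } p(x_k) \text{ converges to some } p
  \text{ with } \sum_{i=1}^{N} p_i v_i = 0. \\
&\text{Both alternatives cannot hold: } \langle v_i, x \rangle < 0 \text{ for all } i \text{ would give }
  0 = \Big\langle \sum_{i} p_i v_i,\, x \Big\rangle = \sum_{i} p_i \langle v_i, x \rangle < 0.
\end{align*}
```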

Birkhoff's "Majorization Representation Theorem"

There is a nice exposition of this on the web. If you already know a bit about majorization, you'll see that Birkhoff's theorem is quite a powerful tool. This exposition uses the separation theorem in a way that is natural and quick, but, just for fun, you might see if you can do the job using the approximate Fermat principle. This is also related to Muirhead's inequality for alpha-means.
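
Assuming the theorem meant here is the Birkhoff (Birkhoff-von Neumann) statement that every doubly stochastic matrix is a convex combination of permutation matrices, the representation can be computed greedily for small matrices. The sketch below is only an illustration with hypothetical names: it finds a permutation supported on positive entries by brute force (one exists by the Marriage Lemma), peels off as much of it as possible, and repeats.

```python
from fractions import Fraction
from itertools import permutations

def birkhoff_decomposition(D):
    """Write a doubly stochastic matrix as a list of (weight, permutation) terms.

    Assumes D is square with nonnegative entries and all row and column sums
    equal to 1. Brute force over permutations, so only for small matrices.
    """
    n = len(D)
    R = [[Fraction(v) for v in row] for row in D]   # working copy, exact arithmetic
    terms = []
    while any(v != 0 for row in R for v in row):
        # A permutation with all chosen entries positive exists by the Marriage Lemma.
        perm = next(s for s in permutations(range(n))
                    if all(R[i][s[i]] > 0 for i in range(n)))
        w = min(R[i][perm[i]] for i in range(n))
        for i in range(n):
            R[i][perm[i]] -= w                      # zeroes at least one entry
        terms.append((w, perm))
    return terms

half = Fraction(1, 2)
D = [[half, half, 0],
     [half, 0, half],
     [0, half, half]]
for weight, perm in birkhoff_decomposition(D):
    print(weight, perm)
```

In majorization terms, this is what lets one pass from "x = Dy with D doubly stochastic" to "x is a convex combination of permutations of y".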

Stat 900: Advanced Probability at Wharton