JOURNALOF

COMPUTATIONAL AND

Nk APPLIED MATHEMATICS

ELSEVIER Journal of Computational and Applied Mathematics 142 (2002) 235249

www.elsevier.com/locate/cam

The BohnenblustSpitzer algorithm and its applications

J. Michael Steele

Department of Statistics, Wharton School, University of Pennsylvania, Steinberg HallDietrich Hall 3000,

Philadelphia, PA 19104, USA

Received 14 September 2000; received in revised form 18 January 2001

Abstract

The familiar bijections between the representations of permutations as words and as products of cycles have a natural class of "data driven" extensions that permit us to use purely combinatorial means to obtain precise probabilistic information about the geometry of random walks. In particular, we show that the algorithmic bijection of Bohnenblust and Spitzer can be used to obtain means, variances, and concentration inequalities for several random variables associated with a random walk including the number of vertices and length of the convex minorant, concave majorant, and convex hull. @ 2002 Elsevier Science BN. All rights reserved.

MSC.. primary: 601305; secondary: 60C05; 68U05; 62G99

Keywords: Spitzer's combi natorial lemma; Random walk; Convex hull; Permutations; Cycle decomposition; Cycle lemma

1. Introduction

The most direct way to represent a permutation a of {1,2,...,n} is by the permutation word (al, U2,..., Un)were Uk is simply the image of k under the mapping a: [n] ~ [n]. Nevertheless, there are many other ways to represent a permutation, and there are also many different bijections between the structures that provide these representations. The purpose of this article is to show how the geometry of a random walk can be made tractable by exploiting a special bijection between the set of word representations and the set of representations as products of cycles. Themost novel feature of this bijection is that it is "data drive&' in a sense that will be soon be made explicit. The benefit of this class of bijections is that it permits us to establish identities that tell us about more than just the structure of the set of permutations. In particular, we will see how these data driven bijections can be used to obtain geometrical information about the convex minorant and concave majorant that make up the bottom and top of the convex hull of the graph of a random walk.

Email address: steele@wharton.upenn.edu (LM. Steele).

03770427/024 see front matter @ 2002 Elsevier Science B.V. All rights reserved.

PIL S03770427(01)004721

236 J.M. Steele1Journal of Computational and Applied Mathematics 142 (2002) 235249

At the heart of our analysis is an algorithm that appears implicitly in the proof due to Bohnenblust of a combinatorial lemma due to Spitzer. Spitzer's combinatorial lemma has been widely used by probabilists, but probabilists seem to have ignored the fact that the algorithm actually gives us considerably more information than the particular lemma that it was designed to prove.

Before developing the BohnenblustSpitzer algorithm, we need to collect a few facts about the cycle structure of a random permutation. The required facts are wellknown to experts, but the approach to these facts via record times is so simple and so powerful that it deserves to be known by everyone. We take up the BohnenblustSpitzer algorithm and suggest that one simple picture goes a long way toward making "everything obvious".

Next, we turn our attention to the applications of the. BohnenblustSpitzer algorithm, and, in short order, we will collect a harvest of concrete results on the geometry of random walk. Finally, we point out some limitations of the BohnenblustSpitzer algorithm and mention several attractive open problems.

2. Standard cyclical representations and random permutations

Any permutation r has a representation as a product of cyclic permutations

T = (a,a2 ... ai,)(ai,+lai,+2 ... ai2).. * (ai,,+lai,1+2 ... ajJ,

but this representation is far from unique unless we impose some further restrictions. First, a cyclical permutation of any block does not change the represented permutation, so we can always take the first element of each block to be the largest element of the block. Also, the order in which the blocks are written does not change the permutation, so we can also take the blocks in that order that puts the leading elements into an increasing sequence. Cyclic representation that satisfy these two conditions are said to be standard, and the notion of a standard representation as a product of cyclical permutation leads us very quickly to a detailed understanding of the number v = v(T) of cycles in a random permutation r.

The key combinatorial observation is that if we are given the word representation of a random permutation a, then there is a simple way to associate u with the standard product cycle representation of another permutation r (which typically differs from a). Here is how the recipe goes. If U = (G1, U2,..., UJ, we say that Uk is a record provided that uk > max{aj: j < k}, and, if we break the word (al, a2,..., UJ into blocks so that each new record starts a new block, then it is immediate that the resulting set of ordered blocks falfills all of the requirements of a standard cyclical representation.

Next, there is a useful probabilistic observation. If we let ~k be equal to one or zero accordingly as Uk is or is not a record of the random permutation word a, then by direct calculation, one can show that the random variables {~k: 1 < k <'n} are independent and P(~k = 1) = 11k. To see why the last equation holds, one just needs to note that of the k! possible orderings of {a,, u2,..., Uk} in the permutation words that begin with such elements, there are exactly (k 1)! for which the largest element occurs at the kth place. An analogous (but slightly longer) argument also establishes the independence of the set of random variables {~k: 1 < k < n}.

We therefore find that the number Of Vn = Vn(T) of cycles in a random permutation c of [n] has the same distribution as the sum of n independent random variables ~1 + ~2 + ''' + ~n, and such

J M. Steele /Journal of Computational and Applied Mathematics 142 (2002) 235249 237

sums are very well understood. In particular, the mean and variance of v, have simple expressions in terms of the harmonic numbers Hn,

n n

E[v,,l = 1: ' ~e=fH, and Var[vj = 1: 1 1 Hn + 0( 1

k=l k=l k ( k)

and the FellerLindeberg Central Limit Theorem can be applied to obtain the asymptotic distribution of V,,

lim P ((v,, Hn )/ ,1H_n < t) eX'/2 dx. (2)

n~c~o 72=. L

Finally, the characterization of v, as a sum of independent random variables can be used to obtain concentration results (or large deviation inequalities) for v,. These will be given a bit later when they are needed.

3. BohnenblustSpitzer algorithm

Now we are ready to develop the BohnenblustSpitzer algorithm and to see how it compares with the basic recordtime bijection. The data that drives the algorithm is simply a sequence of n real numbers xl,x2,...,x, that we assume to be linearly independent over Z; that is, we assume that whenever we have aix, +a2X2 ++a,x, =0 for some a= (al,a2,...,a,) E Zn then we must have a = 0.

Here is the algorithm. We take any permutation a with word representation (ul' U2,, Un), and we first form the partial SUMS Sk(U) given by

Sk (LT) =X,, + X12 + + X1k for 1

Next, we consider the set of n points Pk = (k, Sk(U)) ER 2 , and we compute the least concave majorant M(a) of these points. Since these points are automatically given to us in order of increasing xcoordinate, the convex hull algorithm of Graham and Yao [ 10], tells us that one can compute M(u) in linear time. At this point one should consider Fig. 1 which gives a simple example of such a majorant.

The last step of the algorithm is to break the word representation for a into blocks that correspond to the faces of M(u). Specifically, we let il, i2,..., i,, denote the set of indices at which M(u) makes a turn, and, to close out the list, we set i, = n. Finally, we define the permutation v to be the product of cycles given by

(U1, U2 ~ 17iI X174+1 5174+2~ ui, (ai,,+1~ 17i,i+25 ~ Ori,).

This representation for c completes the algorithm for the mapping a ~* r, but there are two important geometric features of this representation that one should note.

First, by the concavity of the majorant, the slopes of the successive faces form a decreasing sequence, and in terms of the sequence {xk} this tells us that

1 1 . 2 v

X, > X, > >

E X,

ii j=1 i2 il j=ii+t v j=41+1

238 J. M Steele I Journal of Computational and Applied Mathematics 142 (2002) 235249

7

6

5

4

3

2

2 4 6 8 10

Fig. 1. The concave majorant guides us to the cycle cuts.

Second, from the definition of the concave majorant, we see that if we set io 1, then for all 0 < k < v and all ik < j < ik+1 the points

U, SAU)) = U, x,~ + X" + . . . + XIIi )

all lie below the line from the point (ik, sik (a)) to the point (ik+ 1, sik,, (a)). We now just need to check that these two properties are enough to guarantee that the mapping a 1~ r is an honest bijection.

4. Why does it work?

We need to show that given any permutation T and any representation of r as a product of cycles, there is a unique a whose word representation is mapped to r by the BohnenblustSpitzer algorithm. An important element of this inversion process is an elementary "cycle lemma". The surveys of Dershowitz and Zaks [71 and Snevily and West [17] show that such lemmas come in many flavors and have many different combinatorial consequences. Here we only need a simple geometric version.

Lemma 1 (Cycle lemma). if {XI,X2,...,Xn} is a sequence of n real numbers that are linearly independent over Z, then there is a unique k such that the sequence yl,y2,...,y, given by

Y1 =Xk, Y2 = Xk+ 1, ..., Ynk = Xn, Ynk+1 =Xl, ..., Yn =Xk1

has the property that for each 1 <, j < n the point U, Y1 + Y2 + + yj) E R2 lies above the line from (0, 0) to (n, y, + Y2 + ' .. + Yn).

The proof of this lemma is almost immediate if we make the right normalization and if we draw the right picture. For the normalization, we simply let s denote the sum of the {Xk} and set Zk = Xk SIn for 1 < k n. The sum of the zk is then equal to zero, and a typical graph of the partial SUMS ZI + Z2 + ' + Zk for 0 < k < n is given in Fig. 2

Now, by the linear independence of the {xj}, there is a unique minimum in the set of partial sums of the {zj}. Also, from the figure it is evident that if we take k to be the xcoordinate of that minimum, then the incremental partial SUMS Zk+ 1, Zk+ 1 + Zk+2,. .. on up to Zk+ 1 + Zk+2 + ' ' ' +

J M. Steele I Journal of Computational and Applied Mathematics 142 (2002) 235249 239

1243 678910

Fig. 2. The minimum partial sum of the {zj} tells us where to start our cycle to get a path that stays positive.

Zn + ZI + Z2 + ... + zkl are all strictly positive. Finally, if we undo our normalization, we see that the positivity of these partial sums and the uniqueness of k are precisely the assertions of the lemma.

Now suppose we have a permutation c, and a representation of r as a product of cyclic pennutations,

z = (a,a2 ... ai,)(ai,+lai,+2 ... ai2). (ai,_,+lai,_,+2 ... ai,

Witho ut changing the permutation r we can rearrange the blocks of the representation so that the average value of the {xi} over the blocks is in decreasing order; in other words, we can assume without loss of generality that we have

1 1 . 2

Xa, > E Xaj > > Xaj. (3)

11 j=1 i2 il j=ii+l

Next, by the cycle lemma (or rather its "below the line" version), we note that by a rotation within each of the blocks of T, we can assume that for all 0 < k < v and all ik + 1 S < ik+1 then we have that

ik, s X., lies below the line from (0, 0) to ik+l 4, ik+ 1 Xaj (4)

E 1:

(S j=ik+l j=ik+l

The linear independence of the {xi} and the uniqueness assertion of the cycle lemma are all that one needs in order to check that the product cycle representation for r is uniquely specified when the block sums decrease according to the inequalities (3) and when the majorization property (4) holds for each of the blocks. Finally, the uniqueness of this representation for c makes it immediate that the permutation u with word representation (a,a2 ... a,) is the unique permutation that is mapped to ,c by the BohnenblustSpitzer algorithm.

These observations tell us that BohnenblustSpitzer algorithm does indeed provide a bijection of the set of permutations, but this just begins the story. The real task is to see how this bijection fits together with data set {xi} that drives it.

240 J. M. Steele I Journal of Computational and Applied Mathematics 142 (2002) 235249

5. Interpreting the picture

The power of the BohnenblustSpitzer bijection comes from the fact that many of the geometrical

features of Fig. 1 have two interpretations. Such features can be expressed equally well in terms of

the word representation for a, or in terms of the product cycle representation of r. For example, if

we write C Er to indicate that C is a cycle of c, then the length of the concave majorant of the

point set B, = {(0, 0), (1, si (n, s.(a))} can be written as

1: (IC12 + (1: Xi)2)112 (5)

CEz iEC

and, in fact, almost all of the geometric quantities in Fig. 1 can be written in the general form

Z,C(T) = E f 1 cl, E xi (6)

CET ( iEC )

Specifically, if we take f(k, y) =_ 1, then Zf (T) is equal to the number of faces in the concave majorant of B, and if we take f(k, y) = yIk then Zf (T) is equal to the sum of their slopes.

One of the most interesting choices of fand the only one considered by Spitzer [19]is simply f (k, y) = max(O, y). For this choice, the concavity of the majorant implies that Zf (T) simply equals the sum of the rises in the majorant until it reaches its maximum. In other words, we have

max(O,sl(u),...,s,(u))=Y, (Y~xi + (7)

CET iEC )

where y+ is used as shorthand for max(O,y). This marvelous identity of Spitzer [19] has many important consequences, and it has inspired much subsequent work. Nevertheless, the preceding discussion should make clear that this identity contains just part of the geometric information that one can draw from the BohnenblustSpitzer bijection.

6. Computing expectations

To see how one can exploit formulas like (7) and its geometric cousin (5), it is a good exercise to

show that for any sequence of independent identically distributed random variables {Xi: 1 < i < n},

the partial sums Sk = X, + X2 + .. + Xk satisfy the identity,

n ' 1 +].

E[max(O, S,, S2.... 1 SnA = Y , ~E[Sk (8)

k=l

The critical trick that brings the BolmenblustSpitzer bijection into play is simply prerandomization of the order of the summands, an idea that is familiar in the theory of algorithms as a way to guarantee the expected running time of QuickSort.

There is also one small technical issue needed in the proof of the expectation identity (8). Since the BohnenblustSpitzer bijection requires that the {xi} be independent over Z, we will first prove

J. M. Steele I journal of Computational and Applied Mathematics 142 (2002) 235249 241

the identity under the assumption that the random variables {Xi} have a density. This assumption is more than enough to guarantee that with probability one, the set of real numbers {Xi(co)} will form an independent set over Z.

Now, if we let & denote the set of all permutations of [n], then the fact that the random vectors (X115XG125 * * *,XJ have the same distribution for all a E & tells us that we can write E[max(O, S,, S2.... SJ] as

E max(0,X,,X,, + XT2 3 ... ~X(TI +XL72 + +X9J 1

n! 1 Y 1

aES"

so the BohnenblustSpitzer bijection and the combinatorial identity (7) tell us that the last expectation is also equal to

1 E xi

n!

EESn k=l cET,1C1 k (iEC

Since there are exactly (n k)! (k 1 n ) permutations that contain the cycle C when C k, the k

sum over the permutations may be further simplified to

n ( k X n

1 k)!(k 1)! (n) E E[Sk+1.

n, (n

. k=l k=l

This identity completes the proof of the target formula (8) in the case when the {Xi} have a density. Its validity. for independent random variables {Xi} with an arbitrary distribution then follows by standard approximation arguments.

Variations on this theme can be created almost at will, yet we content ourselves with just two further examples, the length Ln of the concave majorant and the sum An of the slopes of its faces. Although the function f(k, y) = ,/k2+. y2 that represents the length functional in the form (6) depends on both y and k, this function is no harder to handle than the univariate function f (y) = y+ that corresponds to max(O, S,, S2,, SJ. The additional parameter does not interfere with our earlier computation, and one finds without difficulty that

n

E[LA .!E [~k2 2]

k + Sk (9)

k=l

Now, without some additional distributional assumptions, this expectation cannot be significantly simplified, but, if we assume that the {Xi} satisfy E[IXil] < oe and E[Xi] =,u, then we can get some useful asymptotic information. If we take k inside the radical and use the strong law of large numbers and the uniform integrability of {SkIk}, we easily find

E[LAnV'7_+M2 asnoc).

This interesting geometric formula tells us that the expected length of the concave majorant grows exactly like the length of the line from (0, 0) to the point (0, E[SJ) = (n, ng).

242 J. M. Steele I Journal of Computational and Applied Mathematics 142 (2002) 235249

The slope functional also has an informative interpretation in the limit. Here we need to assume that the {Xi} have a density as well as a first moment M, but, under these conditions, the general representation (6) and the now familiar expectation calculation tell us that we also have

n

E[A,j=Y, E[Sklk]=IiH,,plogn. (10)

k=l

In the next section we will find that the expected number of faces of the concave majorant exactly equals the harmonic number H,, so, in an average sense, the slope formula (10) tells us that the average slope of the concave majorant is M. One can view this result as a very weak form of the law of large numbers for the slope sizes.

7. Looking harder at the faces

The methods of the last section can be used to calculate the expectation of the number F, of faces of the concave majorant, and the simple choice of f(k, y) =_ 1 in the representation (6) quickly brings us to the nice distribution free formula

n

E[F.1=E =H,,logn.

k=l

Nevertheless, in this case there is an alternative derivation of the formula (11) that is much more informative. In particular, this alternative approach will give us the exact variance and asymptotic distribution of F,; results that are far out of reach for Ln and A,

The key observation is that one can show that for independent random variables {Xk} with a density, the number of faces F, of the concave majorant of the random walk will have exactly the same distribution as R, the number of record times of the {Xk}. Since we saw in Section 2 that R, is equal in distribution to a sum of independent Bernoulli random variables, the link between R, and F, gives us a powerful tool for the analysis of F, though we will find that there are some instructive limitations to the inferences one can draw from this link.

If we let F(x,, 1 X121 .... Xj denote the number of faces of the concave ma orant of the set of points

B, = {(0, 0), (n,s,,(u))}, where Sk(u) denotes the partial SUM X11 + X12 + + x,, for

1 < k < n and where the real numbers {Xk} are independent over Z, then the BohnenblustSpitzer

bijection tells us that for all a E S, we have

F(x,, 1 XU2 ~Xa,,) = E (12)

CE.t

This formula and the onetoone correspondence between the u's and the r's combine to tell us that if a is chosen at random then F(x,.,,x,,,...,X,,,) is equal to the number of cycles in the random permutation r, and we already know from Section 2 that the number of cycles in a random permutation is equal in distribution to T, where

n

Tn ~k, PGk = 1) C PGk= 0) = 1 k (13)

k=l

J.M. Steele1Journal of Computational and Applied Mathematics 142 (2002) 235249 243

and the {~k: 1 < k < n} are independent. We would like to deduce from these facts that Fn = F(Xl,X2,,Xn) is equal in distribution to T,, and an honest derivation of this intuitive fact seems easiest with the use of characteristic functions.

From formula (12) and the known distribution of the number of cycles in a random permutation, we have for all 0 E R that

n! E exp(i0F(x,,, x,,,, xn)) = n ! E exp i0E 1 E[exp(ffiTn)]

t& Es. ( CCC )

and, in terms of the random variables {Xk: 1 < k <, n}, this tells us that with probability one we have

1 E exp(i0F(X,,(CO)IX'2(CO)"..,.X,,(cu))) =E[exp(i0T,)]. (14)

n! aE&

When we take the expectation on both sides of the pointwise identity (14) we find that E[exp(i0Fn)l = E[exp(ffiT,)], so F, and T, are indeed equal in distribution.

For each n, 1 < n < oo, the number of cycles v,, the number of records Rn, the number of faces F,, and the Bernoulli sum Tn all have the same distribution, so the meanvariance formulas (1) and central limit theorem (2) for Vn may seem to provide quick answers for all of the natural questions one might ask about the sequences {F,} and {R,}. Nevertheless, a curious gap emerges when we consider the strong law of large numbers.

8. Of processes and strong laws

The relationship between the number of records Rn and the Bernoulli sums Tn turns out to be much stronger than the relationship we have found between F, and Tn. In fact, when one looks hard at the argument of Section 2, or simply consults the original article of R6nyi [15], then one finds that {R,: 1 < n < oo} and {Tn: 1 < n < w} are equivalent as processes; that is, all of the joint distributions of these random variables are equal. Among other things, this equivalence as processes tell us that the strong law of large numbers for Rn,

P ( lim R,/log n = 1 1,

n~oo

will follow from the corresponding law for the Bernoulli sum Tn. As R~nyi [15] observes, the strong law for the process {T,} is an immediate consequence of Kolmogorov's Three Series Theorem, so we do indeed have a strong law for the process {Rn}.

In the case of the number of faces F,, we know that for each n the random variable Fn has the same distribution as Tn, but the processes {Fn: 1 < n < oo} and {Tn: 1 < n < 00} certainly are not equivalent. To see this, we just need to note that T1 < T2 < Tn with probability one, but no such monotonicity relationship holds for the sequence {Fn}.

One consequence of the nonequivalence of {T,} and {Fn} is that the strong law for {Tn} does not imply a corresponding strong law for {Fn}. Nevertheless, one might hope to prove such a strong law by first obtaining good bounds on the tail probabilities P(T, >, t) and then exploiting the marginal equivalence P(Tn >, t) = P(Fn > t). We pursue this plan in the next section.

244 J. M. Steele I Journal of Computational and Applied Mathematics 142 (2002) 235249

9. Tail probabilities for the number of faces

The classic inequality of Bennett [4, Eq. (8b)l tells us that if the independent random variables {Yk: 1 < k n} satisfy

n

E[Yk] 0, sup 1JYkjj. < a and I:E[Yk] <, b',

k k=l

then we have the tail bound

t _ [t + b'] log(, + at

p Yk > t <, exp a a Y2 for all t >, 0. (15)

k=l

If we let Yk = ~k Ilk where the ~k are the summands of Tn, then both Yk and Yk satisfy the

conditions of Bennett's inequality with

2 n 1

a=l and b =LH,,, k

k=l so the elementary inequality log(l + x) > x x 2 /2 for 0 < x < 1 tells us we can apply Bennett's d

inequality to Tn and T,, to deduce from FT, that F, satisfies

^ IFn Hn 1 > aH,, ) < 2 exp( _ 82Hn /4) for all 0 <, c < 12' (16)

This inequality offers a useful complement to the central limit theorem for {F,}, and it provides us with a quantitative version of the weak law of large numbers for {F,}. Nevertheless, the estimate (16) still falls short of the type of tail bound that might be combined with the BorelCantelli lemma to get a strong law for the ratios {F,/log n}.

To be sure, if we set N(s) = LS,1,2 j for s = 1, 2,. .., then the inequality ( 16) and the BorelCantelli lemma imply that

lim ji 1 FN(s) = 1 a.s. (17)

S~00 N(,9) and, if the sequence {F,} were monotone like {T,}, then the limit (17) would imply

lim 1 F, = 1 a.s. (18)

n ~oo H,

Unfortunately, as we noted earlier, the sequence {Fn} is not monotone increasing, and the issue of a strong law for the ratios {F,,llog n} remains open.

In fact, there is even room to doubt if such a law is true, despite the validity of the law for the close cousins {Tn} and {Rn}. When the increments have zero mean, so E[Xk] = 0 for = 1, 2,..., one source of doubt comes the almost sure recurrence of the random walk {Sn}, the infinite expected time between the recurrence times, and the curious behavior explained by the arcsin laws. Also, we will find in the next section that even when E[Xkl:7~O the process {F,} is likely to have recurring instances of large percentagewise decrease. Therefore, for the moment at least, the heuristic arguments for and against the strong law for {F,11ogn} are well balanced.

1

J. M. Steele I Journal of Computational and Applied Mathematics 142 (2002) 235249 245

10. Face strides and the cycle structure

The last several sections provide answers to the most direct geometrical questions concerning the geometry of the concave majorant, but there is a further geometric quantity introduced by Suidan [21] where the BohnenblustSpitzer bijection can prove very useful. For each face of the concave majorant, we consider the length of the projection of that face on the xaxis, and we then consider these lengths in decreasing order

Al(n) >, A2(n) > ... > Ak(n) > ... .

To avoid conflicts with our standing terms, we will call the {Ak(n)} the strides of the concave majorant, and, in particular, Al(n) is the length of the longest stride. For a random walk for with increments {Xk: 1 < k < n} that have a density, the BohnenblustSpitzer algorithm then tells us that set of strides {Ak(n)} is exactly equal in distribution of the set of cycle lengths of a random permutation.

From the work of Lloyd and Shepp [161, Arratia and Tavare [11, and others, the probability theory of such cycle lengths is extremely well understood. To give one inference from among many that can be read ofr from known results, we just note that Eq. (14) of Lloyd and Shepp [16] gives us

lim 1 E[Al (n)l exp [_x e _Y dy] dx = 0.6243 ...

n~oo n Y

so, in the typical case, one single stride covers more than half of the observations. One consequence of the existence of such dominating strides is that the periodic emergence of a new one seems likely to cause a major decrease in Fn. In turn, this fact ofrers room for doubting the validity of a strong law for {FnIlog n}.

11. The convex hull

We are now ready to consider what is surely one of the most natural geometric features of the

graph of a random walkthe convex hull. If {Xk: 1 <, k < oo} is any sequence of independent

absolutely continuous random variables, we now let Fnhull denote the number of faces of the convex

hull of the random point set B = {(0, 0), (n,S,)} determined by the random walk Sk =Xl +

X2 + . .. + Xk. To complement this notation we can also let FntOP and FnbOttOffl, respectively, denote

the number of faces of the concave majorant and convex minorant of B. Since Fntop is just the

variable that we denoted earlier simply by F,, and since the distribution of F,, did not depend on the

distribution of the {Xk}, we now see that FtOp and Fnbl"n are equal in distribution. Oddly enough,

it seems hard to judge if this fact should be regarded as surprising or as obvious; there is support

for each perspective.

In any case, we have the trivial identity

Fhull = Fntop + Fnbottom (19)

and, as a consequence, we find the nice expectation formula

hu 1 (20)

E[Fn 11] = 21: k

k=l

1

246 J. M Steele I Journal of Computational and Applied Mathematics 142 (2002) 235249

As it happens, this formula is a special case of a more general result of Baxter [31, and Baxter's proof even turns out to be simpler than the derivation we have given here for the distribution of Fn. Nevertheless, we will shortly see that the present methods yield many results that seem to escape Baxter's method.

Baxter's key observation was that Lemma 1 (the cycle lemma) may be generalized to genuinely twodimensional random walks without any real change in the proof. Again, the trick is to start one's cycle at the lowest point of the path. By building on this observation, Baxter was able to prove that if {Zk: 1 < k <, n} is an independent sequence of random vectors with values in R', then the expected value of the number of faces of the convex hull of the partial sums Z1 + Z2 +. .. + Zk, 1 < k < n is exactly equal to 2 E"=, Ilk provided that with probability one the set {Zk(co): 1 < k < n} is linearly k

independent over Z.

The reason for our somewhat labored statement of this result is that we want to stress that Baxter's theorem applies to the vectors Zk = (I,Xk) where the {Xk} are independent random variables with a density. It is for this choice that Baxter's expectation gives us the formula (20) for the convex hull of the graph of a onedimensional random walk. In this important case, the {Zk} are not absolutely continuous, but they are still almost surely linearly independent over Z.

Baxter's method is wellfocussed for the computation of expectations, and it can also be used to get information on the expected length of the convex hull, a problem first studied by Spitzer and Widom [20]. These authors based their proof on a geometric formula going back to Cauchy and on a formula given by Kac [12] whose proof Kac attributes to Dyson. Strangely enough, neither the method of Baxter nor that of Spitzer and Widom seems to be able to get information about the number of faces or length of the concave majorant. The ability to look at the convex hull problem cc one half at a time" seems to one of the ways that the BohnenblustSpitzer bijection gains power over the more direct applications of the cycle lemma.

A second drawback of Baxter's method is that it does not offer any information about the variances or any other measures of concentration, although such information sometimes can be obtained by more direct probabilistic means. For example, Snyder and Steele [18] used the representation of Spitzer and Widom [20] together with martingale methods to obtain variance estimates and concentration inequalities for the length of the hull of the twodimensional random walk.

,h,," is that it resists any such attack

One of the interesting features of the face count functional F, by the martingale bounded difference method. The problem is that if one changes one of the {Xi}

hull (X,, X2,..., Xj Nevertheless, our there is no control over the change that is made in FhUll =F"

understanding of F,'T and Fb,1101 can be used to get variance bounds and concentration inequalities. Since we have no information at present about the correlation between F,',0P and F,,b011, the best that one can deduce from the representation (19) is simply

hull] < 4 1 1 (1 1) < 4H,.

Var[F . E kk

k=l

We might have hoped to obtain an exact formula for Var[Fhull], but there are still reasons to be grateful. First, this bound is likely to be of the right order, and, in any case, it provides the first nontrivial bound for the variance of the number of faces of the convex hull of the graph of a onedimensional random walk. It remains a challenge to see if one can obtain an analogous estimate for the number of faces of the convex hull of the twodimensional random walk {Z1 + Z2 + + Zk: 1 < k < n} in the case where Baxter's method provides the expected value.

1

J. M. Steele I Journal of Computational and Applied Mathematics 142 (2002) 235249 247

One should also note that the concentration inequality (16) also provides information about the

hu concentration of Fn 11 about its mean, and just for the record we note that it implies

^F hull E hull ]l > eE[F hull,) <, 4 exp( _82 /4) for all 0 < e < 1 (21)

n Wn' n 2*

t

Finally, since we have a central limit theorem (2) for Fn0P and Fbo', we might expect to have a

corresponding theorem for their sum. Such a result would be immediate if we had an asymptotic

understanding of the correlation of FntOP and Fnb'ttom, but, at present, no such understanding of the

joint distribution is available.

Finally, we should note that the article by BamdorffNielsen and Baxter [2] provides interesting information about one of the most natural functionals of the convex hull of the twodimensional random walk, the area. Regrettably, the bijection provided by the BohenblustSpitzer algorithm does not seem to be able to contribute to the understanding of the area functional, even when we restrict our attention to the convex hull of the graph of a onedimensional random walk.

12. Concluding remarks

The only way that Spitzer used the bi.jection given by the BohnenblustSpitzer algorithm was in his study of the maximal process M,, = max(O, S,, S2,..., S,) for the random walk {Sk}, but the aim of this article has been to show that the bijection can be used to obtain useful information about many other geometrical features. Specifically, we have seen that the BohnenblustSpitzer bijection can be used to obtain information on the length of the concave majorant and the number of faces of this majorant. For the latter, we also found the sequence of distributional identities

F, 1 R, 1 v. _d Tn

that relates F, to the number of records in a random permutation, number of cycles in a random permutation, and the random variable Tn given by the simple Bemoulli sum (13). Almost all of the information one might want about F, is easily read from these identities, except for the crucial information about the joint distributions of the process {F,} that might help us resolve the interesting question of a strong law {FnIlogn}.

Traditionally, the result from [19] that most people have regarded as key has not been the BohnenblustSpitzer bijection but rather the analytical formula

00

1 + 1: 0,,(0)tn = exp 0"(0) tn (22)

n=l L,l 1

that relates the pair of characteristic functions

O~(0)=E[exp(i0S,+,)] and 0.(0)=E[exp(i0M.)].

This formula has many important consequences, and, as the essay of Kesten [ 131 makes clear, almost all of the investigators who follow Spitzer [19] have taken this formula as their starting point. One example in this tradition that prompted much further work is the instructive article of Wendel [22] which first gives a direct proof of Spitzer's Formula by analytic methods and subsequently shows how

248 J. M. Steele I Journal of Computational and Applied Mathematics 142 (2002) 235249

one may reverse Spitzer's argument to prove Spitzer's combinatorial lemma from the exponential formula (22).

Here we have focused entirely on the combinatorial, algorithmic, and geometric aspects of the BohnenblustSpitzer bijection without any attention to the analytic approach, but we should also note that there are also interesting geometric results that can be cast in an analytic form analogous to Spitzer's identity. In particular, one can. show that the exponential relation (22) also holds for the more geometric pair

~,,(0) = E lexp(i0~k2 + S,2)] and ~,,(0)=E[exp(i0L,,)], (23)

where, as before, L, denotes the Euclidean length of the concave majorant.

To be sure, some of the results that we have found by other means can also be obtained via the exponential relations for (4, 0j and ~,). For example, Spitzer obtained the expectation

E[max(O,SI,S2,...,S,)]= 1 E[S~']. (24)

1: _k

k=l

by differentiating the identity (22) and setting 0 = 0, while we obtained this expectation (24) in Section 6 by summing over cycles. Also, in parallel to Spitzer's derivation of (24), one can differentiate

the exponential formula for the pair 0j to obtain an alternative derivation of our formula

n

= E IE [~k2 + 2] (25)

E[L.1 k Sk

k=l

Nevertheless, the exponential formulas do not give up their secrets willingly, and progress was slow

even in the case considered by Spitzer where special features of S,+, bring powerful, wellstudied,

WienerHopf techniques smoothly into play. Thus, despite its attractive prospects, the analytical

investigation of the exponential formula for the geometric pair is best left for another time

and place.

This exposition of the SpitzerBohnenblust algorithm and its applications has touched on the theory of record values at several points, but there is much more that could be said. In particular, the early survey of Glick [81 and the uptodate survey of Bunge and Goldie [61 give many beautiful properties of record times that suggests cognate questions for the geometry of the concave majorant. On the other hand, there have been no combinatorial extensions of Spitzer's lemma except for that in [51, which provides circumstances where generalized means may replace the simple averages used by Spitzer. Moreover, Goldie [9, p. 172], has observed that Brunk's theorem can be used to give an alternative approach to equality of the distributions of Fn and R,, and this naturally suggests that Brunk's results may help with the investigation of other features of the convex majorant. Nevertheless, Brunk's theorem is rather complicated, and such investigations do not look easy. Finally, it may be useful to note that Groeneboom [ 111 and Pitman [ 14] have discovered a number of delicate connections between the convex minorant (or concave majorant) and the geometry of Brownian motion paths, although, for the moment, one cannot say what these interesting continuoustime results may tell us about the BohnenblustSpitzer algorithm, or viceversa.

J. M. Steele I Journal of Computational and Applied Mathematics 142 (2002) 235249 249

References

[1] R. Arratia, S. Tavar6, The cycle structure of random permutations, Arm. Probab. 20 (1992) 15671591.

[2] 0. BamdorffNielsen, G. Baxter, Combinatorial lemmas in higher dimensions, Ann. Math. Statist. 34 (1963) 313325.

[31 G. Baxter, A combinatorial lemma for complex numbers, Ann. Math. Statist. 32 (1961) 901904.

[41 G. Bennett, Probability inequalities for sums of independent random variables, J. Amer. Statist. Soc. 57 (1962) 3345.

[5] H.D. Brunk, A generalization of Spitzer's combinatorial lemma, Z. Wahrscheinlichkeitstheorie 2 (1964) 395405.

[6] J. Bunge, C.M. Goldie, Record sequences and their applications, in: D.N. Shanbhag, C.R. Rao (Eds.), Handbook of Statistics, Vol. 19, Elsevier, Amsterdam, 2000.

[71 B. Dershowitz, S. Zaks, The cycle lemma and some applications, European J. Combin. 11 (1990) 3540.

[8] N. Glick, Breaking records and breaking boards, Amer. Math. Monthly 85 (1978) 226.

[9] C.M. Goldie, Records, permutations and greatest convex minorants, Math Proc. Cambridge Philos. Soc, 106 (1989) 169177.

[10] R.L. Graham, F.F. Yao, Finding the convex hull of a simple polygon, J. Algorithms 4 (1983) 324331.

[11] P. Groeneboom, The concave minorant of Brownian motion, Ann. Probab. 11 (1983) 10161027.

[121 M. Kac, Toeplitz matrices, translation kernels, and a related problem in probabililty theory, Duke Math. J. 21 (1954) 501509.

[131 H. Kesten, Frank Spitzer's work on random walk and Brownian motion, Ann. Probab. 21 (1993) 593607.

[141 J.W. Pitinan, Remarks on the convex minorant of Brownian motion, in: E. Cinlar, K.L. Chung, RX. Getoor (Eds.), Seminar on Stochastic Processes, 1982, Progress in Probability, Vol. 5, Birkhduser, Basel, 1983, pp. 219227.

[151 A. R~nyi, On the extreme elements of observations, MTA III Oszt. K8A 12 (1962) 105121 (also available in P. Turfin (Ed.), Selected Papers of Alfr6d R6nyi, Vol. 3, Akademiai Kaid6, Budapest, 1976, pp. 5066).

[161 L.A. Shepp, S.P. Lloyd, Ordered cycle lengths in a random permutation, Trans. Amer. Math. Soc. 121 (1966) 340357.

[17] H.S. Snevily, D.B. West, The bricklayer problem and the strong cycle lemma, Amer. Math. Monthly 105 (1998) 131143.

[18] T.L. Snyder, 1M. Steele, Convex hulls of random walks, Proc. Amer. Math. Soc. 117 (1993) 11651173.

[19] F. Spitzer, A combinatorial lemma and its application to probability theory, Trans. Arner. Math. Soc. 82 (1956) 323339.

[20] F. Spitzer, H. Widom, The circumference of a convex polygon, Trans. Arner. Math. Soc. 12 (1961) 506509.

[211 T. Suidan, Statistical properties of convex minorants of random walks and Brownian motions, Preprint, Department of Applied and Computational Mathematics, Princeton University (www.math.washington.edu/prob/Letters/letter56. html).

[22] LG. Wendel, Spitzer's formula: a short proof, Proc. Amer. Math. Soc. 82 (1958) 905908.

1