77te A nnals of Probability

1979, Vol. 7, No. 2, 267275

HAMMIERSLEY'S LAW FOR TEE VAN DER CORPUT SEQUENCE:

AN INSTANCE OF PROBABILrlY THEORY FOR

PSEUDORANDOM NUMBERS'

BY A. DEL JUNCO AND J. MICHAEL STEELE

Ohio State University and Stanford University

The analogue of Hammersley's theorem on the length of the longest monotonic subsequence of independent, identically, and continuously distributed random variables is obtained for the pseudorandom van der Corput sequence. In this case there is no limit but the precise limits superior and inferior are determined. The constants obtained are closely related to those established in the independent case by Logan and Shepp, and Vershik and Kerov.

1. Introduction. Pseudorandom numbers form the backbone of computer simulation and Monte Carlo analysis, yet there are essentially no known theorems which make explicit the sense in which pseudorandom numbers are replacements for random numbers. There is probably no general result which can be proved, but there is still a program by which a deeper understanding of the relationship between pseudorandom numbers and random numbers can evolve.

For every pseudorandom sequence and every probabalistic theorem, there is surely some analogue of the probabalistic theorem for that pseudorandom sequence. The qualities of that analogue should then accurately reflect the random qualities of the pseudorandom sequence. By a systematic analysis of these pairs of theorems and sequences, a body of results can be obtained which is capable of resolving, or at least eroding, many of the philosophical and practical questions concerning pseudorandom sequences.

This is no overnight task and there is no obvious place to begin other than with the theorems and sequences which interest one most. We have begun with a theorem due to Hammersley and a sequence due to van der Corput.

To introduce Hammersley's theorem, let Xl, i = 1, 2,. , be independent random variables with uniform distribution on [0, 11. By it, I(Xl, X2, . . . , XJ we denote the cardinality of the largest monotone increasing subsequence of the values Xl(w), X2(w), XJw). Hammersley [2] proved that

1

lim,,,.n21. = C where c is a constant and the convergence is in probability. We are particularly interested in (1. 1) because of the considerable effort which, has been focused on the determination of c. Even before convergence had been proved in (1.1), Baer and Brock [1] had conjectured that c, 2 on the basis of extensive computation. Hammersley [21 gave bounds on c which were improved by Kingman [41, but the

Received September 8, 1977. 'Supported in part by Office of Naval Research, contract N0001476C0475. AMS 1970 subject classificaiiow. Primary 60C05; secondary 65C10. Key words andphrases Van der Corput sequence, monotomc subsequence.

267

i

268 A. DEL "CO AND J. MICRAEL STEELE

deepest results published so far are due to Logan and Shepp [6] who proved c > 2. Vershik and Kerov [71 have recently announced that c < 2 but details of their proof have yet to appear.

In the present context Hammersley's theorem has a natural appeal as a substantive probabalistic result in which there is much current interest. The tempting prospects of obtaining c in the analogue of (1. 1) for some pseudorandom sequence provide even more reason to start with Hammersley's theorem.

One natural candidate for the sequence of pseudorandom numbers is the sequence generated by the multiplicative congruential method. These are the most widely used pseudorandom numbers but there are several drawbacks to their analysis.

In the first place the sequences generated by the multiplicative congruential method are periodic. Since 1,, can be properly studied only for n small compared to the period the asymptotic analysis thus is complicated not only computationally but conceptually.

Primarily for this reason we have begun with the van der Corput sequence. The sequence is generated in a manner almost as simple as multiplicative congruence, but it has the great benefit of being aperiodic. Moreover the sequence has widespread use in numerical integration [3] and considerable historical interest. (See, for example, Knuth's interesting discussion "What is a random sequence" [5] page 127157.)

To define the van der Corput sequence we first write, for n > 0, n = Ei*oai2' where a, = 0 or 1. The n + Ist element of the sequence is W2(n) = Zi"oai2i1. One can see that cp2(0) = 0, q)2(1) = 1, T2(2) . 1, op2(3) = 1, etc. The nature of q?2(n)

2 4 4

is more easily seen in binary notation where 992(n) is the "the reflection of n in the decimal point." As the subscript suggests one can define a similar sequence by qp,(n) = 100 ` 1 where n = l?' oaip', 0 < a,

2 is an integer. 'oaip

We now have the theorem:

If l(n) is the cardinality of the largest monotone increasing subsequence of

.p,(n 1)} then

i i

(1.2) lim inf~,.n 2 l(n) = 21, lim sup,,,.n 2 l(n) forp = 2,

2

and

(1.3)

2

2

lim infn,.n1 l(n) = 2(1 p lim sup,,..n l(n) = pi forp > 2.

This result is as precise an analogue to (1.1) as one could realistically expect.

Also, the LoganShepp lower bound on c makes it particularly noteworthy that

.L

(1.4) limp..lim.inf,,,.n 2 l(n) = 2.

For digestibility, the proof of these results has been divided into four parts. First we obtain a geometrical characterization of the monotone subsequences of van der Corput's sequence. Upper and lower bounds are then obtained for l(n). Finally the required limits are identified.

i i i

1

PSEUDORANDOM HA~RSLEY LAW 269

2. The geometrical representation. Our first goal is to obtain a representation

from which detailed information about {opp(O), 9?P(I), op

,,(n I)) can be de

duced. We define a,, to be the unique permutation of {% 1, n 1} which

puts 9)p(i), 0 ~Q i %Q n 1 in increasing order, i.e.,

qgp(a.(0)) < 9)p(a.(1)) < ... < (pp(a.(n

This permutation will be written as a sequence

a. = (a.(0), a.(1), . . , an(n 1))

and our reason for introducing an is the elementary fact that the length of the longest monotone increasing subsequence of an is equal to 1(n).

We also have the following law of formation of a .. P. which will be crucial.

LEmmA 2. 1. One obtains cr.. from ap. by replacing the entry ap.(i) of up. by the sequence ap.(i) + p no..

REmARK. Here and in the following if x is a sequence then a + bx is a sequence with the same domain as x defined by (a + bx)(i) = a + bx(i).

PROOF OF LEmmA. Notice that there are two things we must show:

+ pna.(j)) < 99

P(a

(2.1) P.(i) + pn,.(j +

and

,.(i) +p nCF.(j)) < (p

P(a

(2.2) P.(i + 1) + pn,.(f)),

for anyiJ.

We first suppose j ~pn and k is any integer. Setting j = l"i0'i2ip', k

E"PnakP 1n we havej + p"k W,_0apl, and consequently

(2.3) pp(j + p"k) 2 T,Oaip (" 1)

Enlap 1) + p n (i+ 1)

10 2 78 n aiP

pp(j) + pnpp(k) for j

pn., np

P(Ujj))

which is just the definition of a, Similarly, (2.2) is equivalent to

(2.4) fp"(op.(i)) + p nrp,(0.(i)) < Ipp(ap.(i + 1)) + p nqgp(a.(j,)).

{O, I'. pn _ 1),.p

To check (2.4) we note that, as i runs through P(i) MnS

through {O, p n, 2p n, . . . ' (p n _ 1)p n } so that, in fadt, qpp(ap.(i + 1)) = p n +

Tp(up.(i)). The proof of the lemma is thus complete.

In many ways it is easier to work with the permutation matrix An associated with an. We recall that An is an n x n matrix defined by

An(iJ) = 1 if i = an(j) =D otherwise.

270 A. DEL SUNCO AND J. MICHAEL STEELE

To rephrase Lemma 2.1 in terms of matrices we define m different p" X mp"

matrices Ap, ' lp'~y APT by

A,,.(i,jm~') if mli

1 otherwise and

1)mod m).

Ap'.(iJ) (j

p

Intuitively, we getP_. by inserting m 1 columns of O's after each column of Ap.

p

and then shift 1~ by 1 places to the right to get

Finally we can phrase Lemma 2.1 by the following formula for A.. as a block matrix:

p

'T>._ 1 (2)

(2.5) Anw

This matrix representation of the permutation permits a geometrical view of the decreasing subsequences. Formally an increasing subsequence of a,, is a sequence of integers P = {jl

3. The lower bound lemma. The main result of this section is the following:

LEmmA 3. 1. For P"2 ;Q m r% p"~' we have

(3.1) 1(mp n) >p n 1 + m(p 1) (3p 1),

andforpn3 < M Q pn2 we have

(3.2) 1(Mpn) >pn3(p _ 1) + MP _ (P2 + 2p _ l).

PROOF. First we will consider m such that pn2 ,Q M 4Q p n 1 and construct an explicit path through A... By setting m, = (pn1 m)l(p 1) and M2 = M M, we have m,p2 + m2p =p n. For 0 < i < [m,] we can define a path P, through Ap. of length 2p 1 and domain [P2i,p2(i + 1) 11 by '

pi = (P2i, p~ + p, p2i + 2p, _ ' p2(i + 1) p,

p 2(i + 1) _,p + 1, 'p2G + l) l).

For 0 Q i < [m21 we then define a path through Ap. with length p and

domain 1 M1]p2 + ip, [M1]p2 + (i + 1)p 1] by

P[m,l+i = Q M1]p2 + ip, [M1]pI + ip + I'. . . ' [M1]p2 + (i + 1)p l).

i i

i i

i

PSEUDORANDOM HA~RSLEY LAW 271

Now we consider the paths F, in ~ '(') which correspond to Pi. These paths are

1 or

defined by mP, + aj 1(i). Finally we define our desired path P through A...

as the concatenation of the

P = POPI ... P[.j+[%]].

To make certain that the path P is properly defined we have to check that for each k e dom lisi~ , 1 and 1 E= dorn Ai we have k > 1. This condition will be abbreviated by domFi+l > domF,, and it follows immediately from the corresponding fact that dom P,+ 1 > dom P,. By our construction we finally have

1(Mpn) > l(P) = [M1](2P 1) + [M2]P

> M1(2P 1) + M2P OP 1)

= P n1 + m(p 1) (3p 1)

which completes the proof of (3.1).

We now consider M'pn3.Q M Q pn2 and begin by setting m, = (p`2

m)l(p 1) and M2 = m m,. This time we have mp 3 + M2p2 = pn. As in our previous construction there is a path of length p 2 through Ap3, so for 0 < i < [m,] there is a path Pi of length p 2 through Ap. with domain [ip3, (i + 1)p3 1 ]. For 0 Q i < [m21 there is a path P[mj+i of length 2p 1 through Ap. with domain 11 M1]p3 + ip2, [M1]p3 + (i + 1)p2 1]. Next we set A, = mPi + ajl(i) and obtain a path through by letting

p =

Finally we calculate as before:

1(,,n) > 1(p) MI ] p2 + [ M2](2P 1)

> M1P 2 + M2 (2p (p2 + 2p 1)

= p n2(p + nP (P2 + 2p 1)

which completes the proof of (3.2).

REmARK. The preceding proof probably appears more involved than an examination of an example would indicate. By calculating A P. for p = 3 and several small values of m and n one can easily find the path P and see how it evolves as m and n increase. Similarly in the arguments which follow one should keep the example p = 3 clearly in mind, perhaps by keeping a small table of the A,..

4. The upper bound lenuna. The recursive nature of A,,. provides the key to the'following, Lemma 4. 1. For any positive integer m we have

(4.1) 1(mp",) Q P` + M(P

and

(4.2) l(Wn) < pn 1 _ pn2 +

i

i i i

1

i

i

1

272 A. DEL "CO AND J. MICHML STEELE

PROOF. For any path P through A... we can decompose P as a concatenation

P = P0P1 ... Pm 1

where P, is a path through AP5 (1). Here one should note that the range of Ai is a subset of [ip.", (i + 1)p" 11.

We now let P, be the corresponding path in Ap. of F,. That is, we let Pi m m l(Ai a,~ 1 (i)). We do not have dom. Pi < dom. P,,, but we obviously do have dom P, < dom P,+, so the two paths have at most one common point. If dom P, n dom Pi+l :,'= 0 we shall say Pi and P,+, are linked. Since I(P) ~ W1 P) the

i0 1( lemma would follow from E' '1(Pi) < min(p` 1 + m(p 1), p, 1 p" 2 + i0

mp). This fact will be proved using only the inequality dom Pi < dom. Pi + 1.

First we decompose Ap. into p" blocks of height p` and width p. We note that each block has exactly one 1, and the pattern of I's in each column of blocks is the same as the pattern in Ap.

The block width w(P) of a path P is defined as the number of distinct columns of blocks which intersect P. The block height h(P) is defined correspondingly. Further, we let g equal the number of integers i such that Pi is linked to P,,,.

If P, and P,+ 1 are linked we define their linkage as the concatenation PiPi*, 1 where Pi*+ 1 denotes P, with its first entry removed. By forming the successive linkage of the P, we obtain m tt paths P11A, Pm, which are unlinked and for which

(4.3) + li~i(Pi)

Next we form all possible concatenations of successive P, to get s paths P,', P2, P.,' which allow no further concatenation. One then has s < m tt and

(4.4)

1 I(Pil) mi p,

Moreover, one notes that no two P,' share a common column of blocks for the simple reason that dom, Pi < dom P,, , and the 1 in any block is to the right of the 1 in any block above it.

The crucial observation is that for each path Pi' we have

,(P,') < h(Pi') + w(Pi') 1 < p + w(Pi') 1

which by summing over i yields

(4.5) E~ 1(Pi') < ES1w(p,') + S(P I).

1 1 i

Since no two Pi' share a column of blocks we have

13_ IW(pi,) < pn 1.

(4.6) 1

By the inequality (4.5) we obtain

(4.7) 2:~1 1(pi,) < pn1 + S(p

which by settings = m tt X with X > 0 using (4.3) and (4.4) becomes

Em~ 11(pi) < pn1 + (M X)(p 1) tt(p 2).

(4.8) i0

i

i

1

1

PSEUIDORANDOM HA~RSLEY LAW 273

This completes the proof of (4. 1) since X > 0 and tt > 0. On the other hand (4.6) also yields s < p n 1 so tt < p ` 1 (m X) which by (4.8) implies

EM1 + (M X)(P 1) + (pn 1 (m X))(p 2)

i_ol(pi) Q pn p n p n1 +

This implies 1(Mpn) .1~ p" p n` + m and by replacing n by n 1 and m by nw we obtain the proof of (4.2). This completes our proof of the upper bound lemma.

By applying the preceding lemmas, we should note that in certain cases the inequalities can be made to provide equality. By inspecting the proof of (3.1) in the cases of M = p n 1 and M = p n 1 one sees that it is not necessary to estimate greatest integers so that the 3p 1 disappears from the right of (3.1). Combining this with (4. 1) we have the identities

(4.9) 1(p2nl)=pn and I(P2n 2) =p n2 (2p 1).

By similar analysis for p 2 we also have the identity

(4.10) 1(m2n) 2` + m for 2n2 < M < 2n1

2` + 2m for 2` Q m ,Q r'.

5. Identification of the limits. To complete the proof of our theorem, we first

1 S = (,,n : pn1 %Q M identify lim supn~.N 21(N) where N. is restricted to

p n1 ).

LEmmA 5. 1. Setting K = max(2(1 p )12, p 2L, 2 p') and k = min(2(1

1).L, 1,

p 2 P2 2 p'), we have

(5.1) "M SUPN c: sN I(N) K,

and

(5.2) lim infN c: sN 21 I(N) k.

PRooF. As in Section 3 we deal with p n 2 ,;;

. M .~ pn 1 (case 1) and pn3 ,Q M 4Q p n2 (case 2) separately. This time we begin with case 2. By (3.2) and (4.2) we have

g.(M) (p2 + 2p + 1)(.n)12 ~ 1(npn)(Mpn)21,Q gn(M)

2(p 2

where gn(x) = (p` _ 1) + Xp)(Xp n) ~ 1. Consequently we have

liM inf I(Mp.)(Mpn)L = lim inf gn(m),

(5.3)

where the lim inf is taken over the set S, = {,,n :pn3 ,Q M :,Q pn2, 1 < n). One has the same equation as (5.3) for, the lim sup.

Next observe g,(m) has its minimum at a,, =p n 3(p 1) and gn is decreasing

3 <

for p` Q x ,' an and increasing for an . X Q pn2. One easily checks that

.L g,(pn3) = p21, g,(pn2) . 2 p', and gn(a.) = 2(1 p)2 so we have

(5.4) lim infw~ s, g,,(m) = k and lim sup,,p. s, gn(m) = K.

274 A. DEL "CO AND J. MICHAEL STEELE

To prove the comparable identities for case 1 one sets f,(x) = (p n 1 + X(p

1))(xp")'2L and obtains from (3.1) and (4.1) that

f,.,= S'l(np n)(Mp n) 1 = liM SUp P.'S2f

lim in 2 "(M)

Where S2 = {Mpn : pn2 < ,n 1, n > 1}. The same equality holds for the limsup, and we note as before that f. has a minimum at p n 1(p l)` = bn, decreases for p n2 ,Q X '~ bn and increases for bn Q X ,Q pn '. Finally we note fn(bn) = 2(1

)I, i, f 1)

, f 2

p n(pn~ 1) = p and n(pn 2 p Hence. we have

(5.5) lim infmp.,=sfn(m) > k and lim supw.,=s~fn(m) ~Q K.

One has inequality in (5.5) since bn is not an integer. Actually equality can be

proved but is not required for the rest of the proof. By (5.3), (5.4) and (5.5) we have

completed the proof of (5. 1) and (5.2) as required by the lemma.

The proof of our theorem can now be completed in a routine way. For an arbitrary integer N we write

Mp n pn3 Q M

(5.6) N = + r with . pn 1 and 0 4Q r < pn+2.

Letting 1(i,j) denote the cardinality of the largest increasing subsequence of

(pp(i + 1), fpp(j 1)) we see

(5.7) 1(ipk, (i + 1)pk) = 1(pk)

as a consequence of (2.3). There is also the obvious fact that l(iJ) is subadditive,

1(i, k) < l(iJ) + 1(j, k) for i ~jQk.

By choosing an integer t such that tp"12 <.. np" and ,n +pn+2 (t + 2)p +2 one has by (5.6) that

l(N) ,Q I(Mpn + pn+2)

< I(Mpn) + I(Mpn, Mpn + pn+2)

.Q I(Mpn) + l(tpn+2, (t + 2)pn+2)

I(Mpn) + 21(pn+2).

Hence we have

(5.8) 1(Mpn) :Q I(N) Q I(Mpn) + 21(pn +2).

1(pn+2)1(Mpn)i

Since for large N one has NImpn is near 1 and 2 is near 0, the

inequality (5.8) together with Lemma 5 completes the proof of the theorem.

REFERENCES

[11 1)AER, R. M. and BRocK, P. (1968). Natural sorting over permutation spaces. Math. Conw. 22 385410.

[2] HA~RsLEY, J. M. (1972). A few seedlings of rewarch. Proc. Sixth Berkeley Synw. Math. Statist. Probability 1345394. Univ. of California Press.

[3] H~RsLEY, J. M. and ~Scoms, D. C. (1964). Monte Carlo Metho*. Methuen, London.

i i i i i

i

1

1

PSEUDORANDOM HAMNERSLEY LAW 275

[4] KiNomAN, J. F. C. (1973). Subadditive ergodic theory. Ann. Probability 1883899.

[5] KNum, D. E. (1969). The Art of ConWuter Progra~hg, 2 (Semintonerical Algorithm). AddisonWeslay, Palo Alto.

[6] LooAN, B. F. and SHEPP, L. A. (1975). A variational problem for random Young tableaux. Advances in Math. 26 206222.

[71 VERsHiK, A. M. and KERov, S. V. (1977). Asymptotics of the Plancherel measure of the symmetric group and the limiting form of Young tables. Soviet Math. Doki. 18 527531.

DEPARTMENT OF MATHEMATICS DEPARTMENT OF STATISTICS

OHIO STATE UNWERSITY STANFORD UNIVERSIrY

CoLumBus, 0wo, 43210 STANFORD, CALiFoRmA 94305

i

i

1