PROCEEDINGS OF THE

AMERICAN MATHEMATICAL SOCIETY

Volume 80, Number 1, September 1980

SHORTEST PATHS THROUGH PSEUDORAMOM

POWS IN THE dCUBE

J. MICHAEL STEELEI

ABSTRACT. A lower bound for the length of the shortest path through n points in [0, 1]' is given in terms of the discrepancy function of the n points. This bound is applied to obtain an analogue for several pseudorandoin sequences to the known limit behavior of the length of the shortest path through n independent uniformly distributed random observations from [0, lld.

In~ction For a sequence fz,: 1 iQ i < oo} of elements of [0, 1]d ~ d > 1, to be considered as a candidate for a sequence of pseudorandom observations, it must at least be uniformly distributed in the sense that the empirical frequency of any subcube must tend to the volume of that subcube [5, pp. 1271571. For such a minimally pseudorandom sequence, it seems of interest to determine the extent to which additional features of independent uniformly distributed random variables must also hold. The additional feature which is considered in this note is the asymptotic growth rate of the shortest path through the initial sample {Z11 Z21 ... 1 Zn}'

For a sequence {x,, 1 ,Q i < oo} of independent random variables uniformly distributed in [0, ly, Beardwood, Halton and Hammersley [1] proved that X(n)

X(xl, x2, . . . , x.), the length of the shortest path through {xl, x2, x.}, has the

property that

lim X(n)1n (d 1)1d . kdg

n3100

with probability one for a constant kd > 0.

While a result of such precision is too much to expect of a general pseudorandom sequence, a somewhat weaker version of (1) can be obtained in sufficient generality to cover a variety of classical cases. To make this precise, first suppose [0, Ild is partially ordered (..r<) by the usual coordinatewise ordering, and for each x EE [0, 1]d, define the empirical distribution function

F,,(x) (11n)*{1 .Q i .~ n: z, << x}

of the sequence {z,: 1 < i < oo}. The discrepancy function D. is then defined by

D. = sup IF,,(x) F(x)J,

X EE [0,1]d 1

Received by the editors June 29, 1979 and, in revised form, November 8, 1979. 1980 mathenwtics Subject Clavification. Primary 65C10, 90B10, I0F40, 601)05. Key words andphrases. Discrepancy, pseudorandom, shortest paths, uniform distribution. 'This research was supported in part by the National Science Foundation, Grant No. MCS7807736.

0 1980 Arnerican Mathematical Society

000299 3918010000042 3J$02.2 5

130

PsEuDoRANi~ Ponm iN nw dCUBE 131

where F(x) is the volume of the region {y e [0, 1 Y: y c x}. For any pseudorandom sequence we may suppose that lim.,. D. 0. It will be by relating the convergence rate of D. to the growth rate of X(n) X(Z11 Z21 ... 1 Z.) that the analogues of (1) can be proved.

Since L. Few [2] has proved that there are constants. cd' such that for any fz,:

< i < oc, z, C [0, ly} and all 1 < n < oo

(d 1)1d

X(n) X(zI, z2, . . . , z.) < cdn

the limit theory for X(n) will depend upon obtaining an appropriate lower bound. The following elementary bound suffices in many cases and is in a sense the best possible.

THEop.Em 1. There are constants cd > 0 such that X(n) > CdDj (d 1)1d for uniformly distributed sequence and any sufficiently large n.

PRow. Divide [0, ly into k d Cubic cells of edge Ilk, and further divide each of these into 3 d Cells of edge 113k. The 1/3kedge cells which do not touch the boundary of the 1/kedge cells will be caned center cells. We note that if each of the center cells is occupied by at least one element of zI, z> . . . , z., then X(n) > (2/3k)(k d 1), since there are k d center cells, and two points in different center cells must be at least 2/3k apart.

Next note that if D,, < 2d(3k)d, then each of the 1/3kedge cells must be occupied by at least one z,, i 1, 2, . . ., n. On choosing k so that 6d(k + j)d ,C D,, < 6~dkd, we finally have

X(n) > (2/3)kd1 > DJ(dlVd(k16(k + 1))d '(2/3),

which easily implies the theorem.

The main consequence of Theorem 1 is that together with Few's upper bound one obtains the following analogue of the BeardwoodHaltonHammersley theorem.

CoRoLLARY 1. If nD,, 0(n') for all e > 0, then

lim log X(n)/log n (d 1)1d. (2)

n*00 4

There is extensive literature devoted to the determination of the discrepancy of special sequences, and many of these satisfy the hypotheses of Corollary 1. Several such examples are considered below, but for a genuine survey of discrepancy bounds of pseudorandom sequences and their applications in numerical analysis one should consult Niederreiter [11]. In the first example we recall Halton's important generalization of the van der Corput sequence, but to avoid notational digression we omit its explicit description.

ExAwLE 1 (HALToN [3D. For any d > 1 there is an infinite sequence in [0, ly such that nD. Offlog n)d).

The next two examples are of a more theoretical character.

ExAwLE 2. Let {ka) denote ka reduced modulo 1, and set zk

({kal}l {ke'2}1 ... 1 {kad}) Where 1, al, %, ... , ad are real algebraic numbers which are linearly independent over the rationals. Neiderreiter [8] has proved

132 J. M. STEELE

nD,, = 0(n') for all e > 0 by using an extension of the ThueSiegelRoth theorem due to W. Schmidt [131.

ExAmPLE 3. (W. ScHmiDT [121). For all (a, a21. ad) EE [0, ly, except a set of Lebesgue measure zero, the sequence zk = ({kal}, {ka2}1 {kad}) satisfies nD, = Offlog ny +1+8 ).

Each of the preceding sequences has the benefit of being infinite, but the most widely used pseudorandom sequences do not have this property. The linear congruential pseudorandom numbers (LCPRN) are periodic with period m. This periodicity makes a limit result like Corollary 1 infeasible; but since bounds on D. for LWRN are known (Neiderreiter [91, 110D, the proof of Theorem 1 would yield a bound on X(n) for each n < m. Experience with LCMN's indicates that in this case one should be able to obtain more precise information by direct analysis than by an application of discrepancy bounds, so this example has not been pursued.

One should similarly note that discrepancy estimates fall short of Corollary 1 for independent and uniformly distributed random variables. By Kiefer's [4] law of the iterated logarithm nD,, = 0((n log log n)'/2) with probability one, so by Theorem 1 and Few's bound one obtains only

(d 1)1d > lim sup log X(n)/log n > lim inf log X(n)/log n > (d I)/2d.

Despite these two disappointments, Theorem 1 is essentially the best possible. In particular, the next result shows that if there is no estimate on the rate at which D. tends to zero, the most one can say about X(n) is that X(n) > oo.

THEOREm 2. Given any positive 4o(n) for which .0(n) > oo, there is a uniformly distributed sequence for which, for all integers n > 1,

X(n) .Q 0(n). (3)

PROOF. For any finite sequences define their product by

(X11 X21 ... 1 Xk) * (X11 X2'~ ... 1 X1') = (X11 X2Y ... 1 Xk, X'11 X2', ... 1 X1D

and then define the powers (xl, x> . . Xl)*k correspondingly. Our basic lemma is the following:

For any uniformly distributed sequence (x,) in [0, ly and any integers k, for which ik, increases, one has, for (y,),' 1 defined by

(Y11Y21 ... ) = (Xl)*k. * (Xl, X2)*k2 . (Xl, X2, X3)*k,

that (y,)" 1 is uniformly distributed.

To prove the lemma let D.' and DY, denote the respective discrepancies of (x,)T and (y),' Now for any positive integer s there is a p such that

p

s iki + (p + 1) q + r where 0 < q < kp + ly 0 < r < p + 1. (4)

We therefore have for (y),?' 1 that

p

sD3Y < 1 jkDjx + q(p + 1)Dx+1 + rDx.

J1

PSEUDORANDOM PoiNTs iN Tim dCUBE 133

Letting xp(j) = supk., ', we see 0(j) is nonincreasing and tends to zero by the

~,jDk uniform distribution of (x,),~' 1. Since D,.' Q 1 and q(p + 1)1s < 1,

p

Dsy < 1 ip(j)(jkjls) + 0(p + 1) + (p + 0/

J1

To show D.Y tends to zero it thus suffices to prove the same for the sum EjP_ 1 ip(j)Uk.1s). For this recall Steffensen's inequality [7, p. 1421.

If a,, bi, cj are positive, cj nonincreasing and 7. a, < E b, for 1 < m < p, then 1P, 1 c, a, < 1 PJ , c, b,.

In our case c, = 0(i), ai = ikIs and b, = 1 1p. The facts that Ef. 1 iki < s and

that ik, is nondecreasing provide the hypothesis 1%, a, < E 'i, bi, 1 < m < p.

Consequently we have EP,,(;p(i)ik,)ls < and since tp(i) tends to

zero, so do both sums. Together with (5) this completes the proof of the lemma.

To prove the theorem we estimate Xy(n) in terms of X'(n) and (k,),t 1. First we note the elementary relations

1/2

X(Xl, X21 Xp X;, x2, . . . , x.) < d + 'X(Xl, X21 1 X1) + X(X11 X2'Y 1 X,',)'

X((Xl)*k) = 0 and X((xl, x2, . Xj*k X(Xly X21 1 X.).

For s as in (4), one thus obtains

p+1

Xy(s) < (p + 1)d 1/2 + X(Xl, X21 ... 1 Xi). (6)

By Few's inequality, X(xl, x2, x) < C~(i)(d 1)1d so (6) certainly implies Xy(s) < p 2 for p sufficiently large. Now for any 0(n) which is integer valued and nondecreasing, we can choose kitoo so that k,,(,,) > n for all n. Since s > EP 1 iki > kp and s < k,(,), we have 0(s) > p by the monotonicity of the ki. By the bound AY(S)

REFERENCES

1. J. Boardwood, J. H. Halton and J. M. Hammersley, The shortest path ~ manY Points, Proc. Cambridge Philos. Soc. 55 (1959),299327.

2. L. Few, The shortest path and the shortest road through n points, Mathematika 2 (1955), 141144.

3. J. H. Halton, On the efficiency of certain quasirandom sequences ofpoints in evaluating nwitidfimensionai integrals, Numer. Math. 2 (1960), 84~90; correction, p. 196.

4. S. Kiefer, On the large deviation of the enwiric d.f. of vector chance variables and a law of the iterated logarithm, Pacific J. Math. 11 (196 1), 649660.

5. D. E. Knuth, The art of coWuter programming, Vol. 2, Seminumerical algorithms, AddisonWesley, Palo Alto, California, 1969.

6. L. Kuipers and H. Niederreiter, Uniform distribution of sequences, Wiley, Toronto, 1974.

7. D. S. Ifitrinovi~, Analytic inequalities, SpringerVerlag, New York, 1970.

8. H. NiederMter, Application of dioPhanti aPproximations to numerical integrations, Diophantine Approximation and its Application (C. F. Osgood, ed.), Academic Press, New York, 1973, pp. 129199.

134 J. M. STEELE

9.Statistical independence of linear congruentialpseudorandom numbers, Bull. Amer. Math. Soc. 82 (1976), 927929.

10.Pseudorandom numbers and optimal coefficients, Advan~ in Math. 26 (1977), 99181.

11.Quasimonte carlo methods andpseudorandom numbers, Bull. Amer. Math. Soc. 84 (1978), 9571041.

12. W. Schini^ Metrical theorems on fractional parts of sequences, Trans. Amer. Math. Soc. 110 (1964),493518.

13.Simultaneous approximation to algebraic numbers by rationals, Acta Math. 125 (1970), 189201.

DEPART~ op STATisncs, ST~oRD UN~rry, STANFoRD, CALiFopmu 94305