Probab. Theory Relat. Fields 92, 247258 (1992) Probability

Theory R. Fields

SpringerVerlag 1992

Asymptotics for Euelidean minimal spanning trees on random points

David Aldous and J. Michael Steele ,

Department of Statistics, University of California, Berkeley, CA 94720, USA

2 Department of Statistics, Wharton School, University of Pennsylvania, Philadelphia, PA 19104, USA

Received March 1, 1991; in revised form September 3, 1991

Summary. Asymptotic results for the Euclidean minimal spanning tree on n random vertices in R` can be obtained from consideration of a limiting infinite forest whose vertices form a Poisson process in all R'. In particular we prove a conjecture of Robert Bland: the sum of the d'th powers of the edgelengths of the minimal spanning tree of a random sample of n points from the uniform distribution in the unit cube of R` tends to a constant as n oo.

Whether the limit forest is in fact a single tree is a hard open problem, relating to continuum percolation.

1 Introduction

Let x = {x x,,} be a finite set of points in R', d ~: 2. The minimal spanning tree (MST) t of x is a connected graph with vertexset x such that the sum of the edgelengths of t is minimal, i.e.

E lel = min 5, lel

ect GeeG

where 1 e 1 xi xj is the Euclidean jength of the edge e = (xi, xj) and the

minimum is over all connected graphs G. Minimal spanning trees are one of the most studied objects in combinatorial optimization, and even the probability theory of MST is rather well developed. See Steele [13] for an overview. In that paper it was shown that, if (xi: 1 :5 i::5 n) are i.i.d. with compactly supported density f and if 0 < oc < d, then

n (d a)/d E 1 el' c(a, d) f (f(X))(d oc)ld dx a.s.

eet Rd

*Research supported by N.S.F. Grants MCS8701426 and MCS 90017 10 ** Research supported in part by N.S.F. Grant DMS8812868, A.F.O.S.R. Grant 890301, ARO Grant DAAL0389G0092 and NSA Grant MDA904H2034

248

where c(a, d) depends only on a and d. The most concrete goal of this paper is to extend partially this result to the extreme case a = d (see Proposition 3a). The proof of (1) used subadditivity, a technique going back to the original probabilistic analysis of the traveling salesman problem by Beardwood et al. [4], who stated that the a = 1, d = 2 case of (1) could be proved by their method, This paper uses a completely different approach. We study the analog of the MST for an infinite set of points, the points of a Poisson process on all Rd. Since empirical distributions of i.i.d. sequences look locally like Poisson processes, we can relate their MST to the MST of the unbounded Poisson process. This approach also gives another proof of the existence of a limit distribution for degrees of vertices in the MST (Proposition 3b).

To formalize this approach, let x = (xi) be a finite or countably infinite subset of Rd, d 2. Call x nice if

(i) x is locallyfinite, i.e. has only finitely many elements in bounded subsets of R d ; and

(ii) the interpoint distances (1 xj xi 1, i < j) are all distinct.

Given a pair (x, x) with x nice and x e x we can define trees t,, (x, x) with vertices

from x as follows. Let ~, = x, and let t, be the single vertex ~1. Let t, be the tree

consisting of the vertex ~ 1 and the vertex ~2 C X\ {~ 1 } which is closest to ~ 1 in

Euclidean distance, together with the edge (straight line segment) connecting ~1

and ~2. Inductively define t t. (x, x) to be t,,, together with a new edge

(~j_, ~.), where j. < m 1 and ~,, L= x\{~ are chosen so that the

edgelength 1 ~,, ~j_ 1 is minimal (over all possible edges connecting t,, 1 to

x\t,,, In the finite case where n x 1 < oo, this procedure terminates with the

tree t,,(x, x). Here we have described a variant of the wellknown greedy algorithm

for constructing the Euclidean MST (see Sect. 4). That is, in the finite case t(x, x) is

the unique Euclidean MST on vertices x, and so in particular it does not depend on

the choice of the starting vertex x.

What happens when x is infinite is less simple. In the infinite case, write t. (X, X) = U n tn (X, X).

Lemma 1 Let g = g(x) be the graph on an infinite nice vertexset x defined by taking (xl, X2) as an edge in g if it is an edge in either t,,, (xl, x) or t., (x,, x). Then the graph g is a forest and each component of g is an infinite tree,

This lemma is proved in Sect. 2. An equivalent definition of g is given in Lemma 12. The central character of the paper is the random tree g defined as follows. Take a Poisson point process X = {qi} of rate 1 in R, d ~t 2. Let 0 = X u M. Throwing away a null set, X' is a random nice subset of iR Let W = g (X0) be the forest constructed via Lemma 1.

It is natural to conjecture that 9 is in fact a.s. a tree, but this seems to be related to deep issues in continuum percolation (Sect. 5). Here we finesse the issue by writing g for the component of 9 containing the vertex 0. The next lemma records some simple facts about Y. This lemma and Proposition 3 will be proved in Sects. 2 and 3.

Lemma 2 Let D be the degree of vertex 0 in Y. Let L,, . . LD be the lengths of the edges of g incident at 0. Then

(a) D :~ bd, a constant.

(b) EI? 2.

(c) ld = 7', i EL~ < oo.

2 1

1

i

i i

i~

1

Random minimal spanning trees 249

Now let X,, denote the point process consisting of n points (Ili: 1 5; i t~ n) that are independent and uniformly distributed on the unit cube [0, l]`. Throwing away a null set, X,, is a nice subset of Rd. Let Y,, = t,,(nl, XJ be the MST on these n vertices. It is intuitive that Y,,, suitably rescaled, converges locally to 9. This idea is formalized in Proposition 9 and leads to a proof of our main result.

Proposition 3 (a) Let fl ei I) be the lengths of the edges of Y,,. Then

n1 L 2

ld as n > oo .

lei I d

(b) Let A, j be the proportion of vertices of 9% with degree i. Then for each i

EAj+^D=i) as n+oo.

Discussion 1. This idea of getting asymptotics by considering a kind of MST on the Poisson limit process is rather natural to modern theoretical probabilists. The purpose of this paper is to provide the details, which are slightly less obvious than the first author thought initially. Additional motivation was provided by the emergence of Conjecture 13 below, which seems both interesting and difficult. Our results in themselves do not provide useful numerical bounds for the limiting constants. In work complementary to the present paper, Avram and Bertsimas [3] give a unified treatment of the Euelidean model and the i.i.d. model (see below) and discuss numerical bounds; they do not explicitly use the Poisson limit process. As remarked in Steele [13], for d = 2 the existence of a limit in (a) was conjectured by Robert Bland on the basis of simulations. Indeed, in this case there is an absolute [0, 1]2.

bound on e 1 1 regardless of n or the positions of the n points in

2. In the setting of (1), where the MST is built from i.i.d. points with some density with compact support K, it seems intuitively clear that

n1 L2

lei ld ld 1 K 1 asn .> oo

where ]K 1 is the ddimensional Lebesgue measure of K. But we have not attempted to write out the details.

3. A simpler "i.i.d. model" for MSTs discards the geometry of R d and supposes that the interpoint distances are i.i.d. with some distribution ~. The analog of part (a) of Proposition 3 has been wellstudied by combinatorial methods in the case where ~ has uniform distribution. See Tirriofeev [ 15], Avram and Bertsimas [3] and Aldous [2] for recent results on more general distributions.

4. The existence of limits in (b) was proved by different methods in Steele et al. [1.4], who also proved a.s. convergence. The same question in the i.i.d. model was solved explicitly in Aldous [2], using 1imit process" arguments in the spirit of this paper.

5. Proving central limit theorems in this area seems technically difficult. Rarney [12] shows that the CLT for total length of the Euelidean MST can be reduced to a technical "conditional independence" conjecture.

2 Technicalities

Let us first record one simple fact.

250 D. Aldous and J.M. Steele

Lemma 4 There is a bound bd On the degree of any vertex in any Euclidean MST

d in R .

Proof. An easy application of the triangle inequality shows that (in any dimension) two edges of a MST meeting at a vertex cannot make an angle of less than 60'.

We now work toward the proof of Lemma 1. Let cl be the graph with vertexset x and such that (xi, xj) is an edge of cl iff ~ xj xi 1 < 1. Write t ' (x) for t, (x, x).

Lemma 5 If (y,, y,) is an edge of t., (x)for some x e x, then it is also an edge ofeither t.(yl) or tAY2).

Proof. In the construction of t.(x)with vertices (x we may suppose

that the edge (y 1, Y2) occurs as (~i, ~j), say, with i < j and has length 1. If the edge is

not in t,,(~i) then the component of cl containing ~i must be infinite; but then the

edge cannot appear in t,,,(x).

Proof of Lemma 1. By Lemma 5 and the definition of g, a component of g which contains a vertex x must contain all edges of t., (x). Thus all components of g are infinite. Next, suppose g contains a circuit (Y1, Y2, Yk, Y1) with the maximal edgelength attained by the edge e = (Yk, Y1), say. Consider t.' (yl). This cannot contain the edge e, because the algorithm would first have added the vertices {Y2, . . . , Yk} which can be connected with y, using shorter edges. Similarly, t.' (Yk) cannot contain edge e, and so g does not contain e.

There is a natural notion of convergence for locally finite sets x,,, x. We define x. ~ x to mean: we can label x as Xl, X2.... and x,, as Xn, 1, X~, 2 .... such that as n > oo

X,,,i+xj, 0:~i

for each L such that x has no point on the boundary Of CL L, L]`.

Now suppose h. and h are graphs with respective vertexsets Xn and x. Define convergence of graphs h,, > h to mean: x,,* x, and, for each L as above, for all n ~t some no (L), we have the two properties

if (x,,, j, x,,, j) is an edge of hn with X,, i E CL then (xi, xj) is an edge of h (4) and

if (xi, xj) is an edge of h with Xi C CL then (x, j, x, j) is an edge of h,, . (5)

Note from this definition that convergence h,, ). h and x, j ). xi implies that the degree d(Xn,i) of x,,,i in h,, converges to the degree d(xi) of xi in h.

Recall the trees tk(x, x) defined in the introduction. Define a graph gk(x) by: (xi, xj) is an edge in 9k iff it is an edge in either tk (Xi, X) or tk (Xj, X). It is clear that, for nice sets x,, and x such that x is infinite,

if Xn > X, Xn > X, then tk (X., X.) tk (X, X) for each k and hence, for fixed k,

if x,, > x then 9k (X,,) and 9k (X) satisfy (5) (6)

i

i i

i i i

Random minirnal spanning trees 251

(Of course if x,, is finite then tk (X,, x.) and 9k (x.) are defined only for k x, ~, but the convergence assertions above still make sense.)

Recall the definition in Lemma 1 of g(x) for an infinite nice set x. To make our

notation consistent, write g(x) for the MST in the finite case. Returning to the

infinite case,' by definition g(x) = u,g,(x), and so from (6) we obtain a semicontinu

ity result for the map x > g (x):

if x, > x and x is infinite then g(x.) and g(x) satisfy (5)

In general we do not have continuity. As an example , write s, lli and

consider the 1dimensional set

Xn ys,~, ysi, S1, ' sn (7)

For a suitable irrational y > 0 this defines a nice set. Then ( y, 1) is an edge in each g(x.) but not in g(x.).

However, if x,,+ x and (5) holds, then to establish (4) it clearly suffices to cheek that the number of edges with some end in CL converges to the correct limit as n .> oo. So we may summarize the discussion as follows.

Lemma 6 Suppose x,, > x, where x,, and x are nice and x is infinite.

(a) If Xn C Xn and x c x are such that x. x then

lim inf d(xJ d(x) n

where d (.) denotes the degree of the vertex in g(xJ or g(x).

(b) For each L such that x has no point on the boundary Of CL,

liminf 1 d(Xn,i)~: ~7, d(xi).

n Xn, i C CL Xi C CL

(c) Suppose that, for each L such that x has no point on the boundary Of CL,

1 d(x., j) Y, d(xi) as n , oo

X~,1 C CL Xi C CL

Then g (x.) g (x).

A (simple) point process is just a random locally finite set of points in R The notion of convergence in distribution (i.e. weak convergence) of point processes is discussed in e.g. Daley and VereJones [6] Sect. 9.1. It is straightforward to show (c.L [6] exercise 9.1.6) that their notion of weak convergence is just the general notion of weak convergence on a metric space, applied to a metrization of the convergence of locally finite sets defined at (2, 3). We are mostly concerned with consequences of convergence in distribution, and these are most easily obtained by using the Skorohod representation theorem ([7] Theorem 3.1.8). That theorem says

d

,V,, X,,,, iff we can choose versions X,', of X,, (n 1, 2,. . oo)

such that a.s. (8)

2' 252 D. Aldous and J.M. Steele

Saying X;, is a version of iV, means they have the same distribution. And the almost sure convergence is understood in terms of convergence of deterministic locally finite sets in (2, 3). Similarly to (8), the previous definition of convergence of deterministic graphs can be used to define convergence in distribution of random graphs.

For our final technical lemma, let X be a nice stationary point process, with intensity 0 < p < cc. For ~ e X let d (~) be the degree of ~ in g (X). For each x e R d let A/" have the Palm distribution, i.e. the conditional distribution of A' given there exists a point of X at x. Let D have the distribution of d(x), the degree of x in g(,AI"), and note that by stationarity this distribution does not depend on x.

Lemma 7 If X is a nice stationary ergodic point process and D is de ned as above, then ED = 2.

Proof. Recall CL L, L]'. For fixed L an elementary argument gives the following deterministic identity, for realizations with no point on the boundary Of CL.

(d(~) 2) = BL 2FL (9)

E CL n r

where BL is the number of edges of g(,f) with one endpoint inside CL and the other endpoint outside CL, and where FL is the number of components of the forest on If n CL obtained by taking as edges those edges of g(,f) which join two points in CL.

Now

E 1 CL n AI' p (2L)"

and, using the fundamental property of Palm distributions,

E E d(~) = (2L)d pED.

~ e X n CL

Since FL < BL, (9) Will imply the lemma if we show that EBL = o(Ld). An elementary argument (suggested by the referee) goes as follows. Let D, (x) be the number of edges in g (,V') of the form (x, xi) with 1 xi x 1 ~: r. Then ED, (x) does not depend on x. Now BL counts edges crossing into CL, so by considering separately those edge with endpoint in CL \ CL , and those with endpoint in CL ,., and applying Lemma 4 to the former,

EBL < p bi 1 CL\CL + p (2 (L r))d ED, (0)

SO lim SUPL,,, EBL I(M)d < pED,.(0), and letting r > oo the bound tends to 0.

3 Convergence of MSTs

As in Sect. 1, let X,, denote the point process consisting of n points (qi: 1 :5 i :!~ n) which are independent and uniform on the unit cube [0, 1]'. For each n let

Xn* = {n I'd(ni _ rll): 1 :5: i :!~ n} .

Random minimal spanning trees

253

Thus, X n* is the scatter of n points, centered at one of those points and rescaled to have intensity 1 on the rescaled cube. Let A` = X u {0}, where X is the Poisson process of intensity 1. Equivalently, one could say that XO has the Palm distribution of X, given that it contains a point at 0.

d

Lemma 8 X0 as n oo.

Proof. This is one of those "obvious" facts for which it is hard to cite a reference. In brief, given that n, is not within Ln 11d of the boundary of the unit cube, 1 X,* n CL \ {0} 1 has Binomial (n 1, (2L)d In) distribution, and the points of Xn* n C,\{0} are i.i.d. uniform on CL. And of course 1 XO n CL\{0} 1 has Poisson ((2L)d ) distribution, and the points of A` n CL\{0} are i.i.d. uniform on CL. Then the assertion of the Lemma follows from Binomial convergence to Poisson.

Write g(.IV*) for the MST on X*, so g(X*) is just a rescaling of the MST Yn

n n

on X,.

d

Proposition 9 (a) g(X*)* g(A").

(b) Let Dn,, be the degree of nj in Yn, and let D be the degree of 0 in g(,f 0). Then

d

D,,> D.

Proof. We shall first show

lim ED,, n :!~ ED = 2 .

We know ED = 2 by Lemma 7. Any tree on n vertices has exactly n 1 edges, so the average degree of the vertices is 2 2/n. It follows that EDn = 2 2/n, by exchangeability of (ql, . . . , q,,). This establishes (10).

The reader with a lot of intuition about weak convergence and point processes will now see that the Proposition follows from Lemma 8 and the deterministic convergence facts in Lemma 6. To give the details, we start with an integration fact.

Lemma 10 If nonnegative r.v.'s Y1, Y2, . , Y satisfy

lim inf Yn ~t Y a.s., E Y, E Y < oo n

L1

then Yn). Y and hence in distribution.

Proof. Write

Yn Y yn Y) + Y. Y)

By hypotheses and dominated convergence E(Yn Y) 0. Then using the second hypothesis and (11) we see E(Yn Y)'0. So E 1 Y,, Y1

E(Y. Y)+ + E(Y. Y) , 0.

The point is that Lemma 8 and the Skorokhod representation theorem (8) allow us to use versions of X.* and XO such that X*+ XO a.s., where we suppress the n

s, and it suffices to prove the assertions of the Proposition for these versions. Now Lemma 6 (a) implies that for these versions lim infn D,,, D a.s.. In view of (10) we

254 D. Aldous and J.M. Steele

d

can apply Lemma 10 to show D,, D and hence D,,, > D, establishing assertion

(b) of the Proposition,

To work toward part (a), fix L and define

Y~ d(~) dti (qi) 1(q, q, c CL

~ c n CL

where dj.) means "degree in Using exchangeability of (ql, . . qJ,

EY, = E djqJ1 E 1 V* n CL

qi C CD, n

Defining

Y d

~ c A` n CL

a similar argument using independence properties of the Poisson process gives

EY= ElX' n CL~D

As before, take versions of the point processes which converge a.s., so

1 yn* n CL X' n CLa.s.

L,

With these versions we already proved that D, D, so using the degree bound

in Lemma 4 we see

Ll

1Yn* (_) CL I D, 1 Y' n CL 1 D

and hence EYn),EY. But Lemma 6(b) implies that these versions satisfy

L,

Y > Y. So far we

lim inf, Y,, ~: Y a.s., so Lemma 10 shows these versions satisfy n have assumed that L is fixed. But now we can argue that every increasing sequence of integers n contains a subsequence (nn) for which

1 d(~) Y', d(~) a.s., for each rational L . (12)

E. V*.~ n CL C Ir' n CL

A realization sequence satisfying (12) must satisfy the hypothesis of Lemma 6(c), and hence satisfy its conclusion g(.A,n*) > g(X0). This "subsequence argument"

d

implies that the versions satisfy g(X*)+ g(X') as n oo through all integers. n This completes the proof of Proposition 9.

Note that part (b) of Proposition 3 is the assertion P(Dn,n = i)+ P(D = i), so this is established by (b) above. Of the results stated in Sect. 1, it remains to prove Lemma 2(c) and Proposition 3(a). Write (Lj(iIi); j = 1, 2,. . . ) for the lengths of the edges of Y, at tli. Then

n' n 1 d

Q. leil" Y EL n,j(tli)

i =12 J.

Random minimal spanning trees 255

and so

n d

EQ,, = EILII,j(q,

2 j

Proposition 9 implies

d

(nIld L, j (q, j (Lj; j > 1)

and then the fact that EQn ~> ld < CC will follow from part (a) of the Lemma below.

d Lemma 11 (a) {nyjL,,j(ql); n > 1} is uniformly integrable.

(b) {nE~',ilei Ild ; n ~: 1} is bounded.

Proof. Fix x, y e [0, 1]' and define

r = r(x, y) = lx yl

A(x, y) = {Z C [0, qd: lz xl < r, lz yl < r}

It is easy to see volume (A (x, y)) > ad r d

where ad is a constant. If (x, y) is an edge in the MST on some subset x C LO, 1]d, then A(x, y) cannot contain any points of x, by considering the greedy algorithm. Applied to V,, we obtain

P((x, y) is an edge of Y,, lx, y e (1 adr d)"2

<= exp ( (n 2) ad rd) .

Now let x and r be given. Conditioning on q, = x and integrating over possible locations y of other points with 1 x y 1 e [r, r + dr],

IP(L.j(qje[r, r + dr]):~ exp( (n 2)adr d )(n 1)Sdrd 1 dr (13)

j

where Sd is the surface area of the unit sphere in Rd. Setting 1 = nr d , a little algebra gives

EnL d,j(ql )I(nL d' j(q 1 ) C [1, 1 + dl]) :5 exp((1 21n)adl)(1 11n)d 'Sdl dl

n n

j

<= exp((113)adl)dlsdldl, n > 3 .

The bound has finite integral over 0 < 1 < oo, and so for fixed j the sequence (nL','j(qj); n ~~t 1) is uniformly integrable. Then (a) follows using the degree bound in Lemma 4. Similarly, setting u = n'r 1d in (13) leads to

Id.(q d,

57,En'L.J 1)I(n'L,',,(ql)c[u,u+du])

<= exp(Q 2/n)adUl /2 )(1 lln)(2d)1Sd U1/2 du

< exp( (113)adUl 12 )(2d)`Sdu'I'du, n > 3 .

256 D. Aldous and J.M. Steele

Again the bound has finite integral over 0 < u < oo, and so the sequence n 2 E1jL 2d. is bounded in n. But by exchangeability of

ni

(til . .1n.),n 2 E)7,jLn".(nj is twice the quantity in part (b) of the Lemma.

,j

To complete the proof of Proposition 3(a), it is enough to show that

var(Q,) > 0. By Q13], p. 1778 top),

n 2

var(Q, 1) ~~ E 21 (qi) + 2 2 dE Ld,

min L,, j jK)

j i=1j .

Using Schwarz's inequality and Lemma 4, the final (.)2 term is bounded by

2d.(qi), SO

bdIjLn,i

var(Q,, 1) (1 + bd 2 1d )E L 2 d.

n,i L

j 1

2Q + bd2 2d )E 1 ei Ild

This last expression is 0 (l/n) by Lemma 11 (b).

4 Greedy algorithms and MST

Here is an alternate description of the graph g(x) defined in Lemma 1.

Lemma 12 Let x be a nice infinite set. Let cl be the graph on vertices x containing as edges all pairs (xi, xj) with 1 xj xi 1 < 1. Define g*(x) by: (xi, xj) is an edge of g* (x) if xi and xj are in different components of cl.,j , 1 and at least one of these components is finite.

Then g*(x) = g(x).

In example (7) the edge ( y, 1) is not in g (x), although the endpoints 7 and 1 are in different (infinite) components of c, +,. This shows that the condition "at least one of these components is finite" is needed.

Proof. IRX1, X2)is an edge of g(x) then by definition it is an edge of t,,,,(xl, x), say,

and so is an edge of t,, (x 1, x) for some minimal n. So x 1 and X2 are in different

components of and the component containing xl has exactly n1

vertices. The converse argument is identical.

For completeness, let us discuss briefly the connection between our definitions, aimed at the infinite case, and the usual greedy algorithm for constructing a Euclidean MST on a finite set x. This is a specialization of Kruskal's algorithm (e.g. Chartrand and Lesniak [5] p. 71) for constructing a minimumweight spanning tree in a finite graph with positive edgeweights (thus the geometry of R d and even the triangle inequality for distances, are irrelevant).

The greedy algorithm. Construct graphs fk, 1 :5 k < n x l on vertexset x by

fl has no edges

fk isfk 1 plus an extra edge (xi, xj) chosen such that 1 xj xi 1 is minimal over all possible edges with endpoints in different components of fk 1

Then f, = t(x), say, is the MST.

Random minimal spanning trees 257

n

For finite x, it is easy to see that ~(x) is the same as the tree t'(x, x) defined in the introduction, and is the same as the graph g*(x) defined in Lemma 12. But where x is a nice infinite set, the greedy algorithm will typically not make sense, because the interpoint distances 1 xj xi 1 will typically form a dense set. The construction of Lemma 1 seems the simplest "algorithmic" way to define the analog of the MST on an infinite set.

5 A conjecture

For the Poisson process A` it seems natural to make

Conjecture 13 (a) ~7 = g(X0) is a.s. a single tree.

(b) W contains no doublyinfinite path . . . , x 1, xo, x 1, . with distinct vertices.

Remarks 1. Studying this conjecture soon leads one to considerations of continuum percolation. Indeed, the connection between minimal spanning trees on random points and continuum percolation has been used in a statistics context by

Hartigan [9] and in an unpublished thesis of Rarney [12], and has been noted by percolation theorists (Kesten [10], Sect. 2.4). See Grimmett [8] Sect. 10.5 for a recent account of continuum percolation. The "uniqueness of infinite clusters" property implies that for Poisson processes the condition "at least one of these components is finite" in Lemma 12 is irrelevant: in other words, the behavior of Example (7) cannot occur. But another possible type of "undesirable behavior" is as follows. Consider the example in R'

X = {A si), (vi, si);j > 1}

i

where sj j = 1 lli, and vj 12 is chosen so that x is nice. Here g (x) consists of two trees t, and t2, where t, has vertices {(0, sj);j > 1}. The reason g(x) is a forest rather than a single tree is that the infirnum. infx, c t,,x, c t, 1 X2 xl 1 = 2 is not attained. Proving Conjecture 13(a) is tantamount to showing this behavior cannot happen with a Poisson process.

2. If the Conjecture is true, it makes sense to define L* as the length of the first edge in the unique infinite path in W starting at 0. It is then easy to see that Proposition 3(a) holds with ld = EL' '

3. For many classes of combinatorial trees on n vertices, there is a result: as n > oo the tree, viewed from a uniform random vertex, converges to a limit infinite tree satisfying (b) above. See Aldous [1] for a survey of such results. Conjecture 13 would imply that the same type of result holds for Euclidean MSTs on n random points.

4. Pernantle [11] gives interesting results on the analog of Conjecture 13 for a quite different model of ddimensional random trees (uniform random spanning trees of the lattice). In that model, the answers are different for d :5 4 and for d > 5.

Acknowledgement. We thank the referee for carefully reading and pointing out some obscurities in the original draft.

258 D. Aldous and J.M. Steele

References

1. Aldous, DI: Asymptotic fringe distributions for general families of random trees. Arm. Appl. Probab. 1, 228266 (1991)

2, Aldous, D1: A random tree model associated with random graphs. Random Structures and Algorithms 1, 383402 (1990)

3. Avram, F., Bertsimas, M The minimum spanning tree constant in geometric probability and under the independent model: a unified approach. Technical Report, M.I.T., 1990. Arm. Appl. Probab. (to appear)

4. Beardwood, L, Halton, HI, Hammersley, JM: The shortest path through many points. Proc. Cambridge Philos. Soc. 55, 299327 (1959)

5. Chartrand, G,, Lesniak, L.: Graphs and diagraphs. 2nd edil. Belmont, CA: Wadsworth 1986

6. Daley, D.J. VereJones, D.: An introduction to the theory of point processes. Berlin Heidelberg New York: Springer 1988

7. Ethier, S.N., Kurtz, T.G.: Markov processes: characterization and convergence. New York: Wiley 1986

8. Grimmett, G.R.: Percolation. Berlin Heidelberg New York: Springer 1989

9. Hartigan, LA.: Consistency of single linkage for highdensity clusters. J. Am. Stat. Assoc. 76, 388394(1981)

10. Kesten, H.: Percolation theory and firstpassage percolation. Arm. Probab. 15, 12311271 (1987)

11. Pernantle, R.: Choosing a spanning tree for the integer lattice uniformly. Technical Report, Cornell University, 1989. Arm. Probab. 19, 15591574 (1991)

12. Ramey, D.B.: A nonparametric test of bimodality with applications to cluster analysis. PhD thesis, Yale University, 1982

13. Steele, J.M.: Growth rates of Euelidean minimal spanning trees with power weighted edges. Arm. Probab. 16, 17671787 (1988)

14. Steele, J.M., Shepp, L.A., Eddy, WX.: On the number of leaves of a Euelidean spanning tree. J. Appl. Probab. 24, 809826 (1987)

15. Tirnofeev, E.A.: On finding the expected length of a random minimal tree. Theory Probab. Appl. 33, 361365 (1988)

i

i i i

i