Please see PDF version

J. Appl. Prob. 24, 809826 (1987)
Printed in Israel
0 Applied Probability Trust 1987

ON THE NUMBER OF LEAVES OF A
EUCLIDEAN MINIMAL SPANNING TREE

J. MICHAEL STEELE,* Princeton University
LAWRENCE A. SHEPP,** AT& T Bell Laboratories
WILLIAM F. EDDY,' CarnegieMellon University

Abstract

Let Vk,, be the number of vertices of degree k in the Euclidean minimal spanning tree of Xi, 1 :_5 i :5 n, where the X, are independent, absolutely continuous random variables with values in R`. It is proved that n'Vk,, converges with probability 1 to a constant ak,d. Intermediate results provide information about how the vertex degrees of a minimal spanning tree change as points are added or deleted, about the decomposition of minimal spanning trees into probabilistically similar trees, and about the mean and variance of

Vk,

SUBADDITIVE EUCLIDEAN FUNCTIONALS; VERTEX DEGREES; SELFSIMILAR PROCESSES; EFRONSTEIN INEQUALITY

1. Introduction

The initial purpose of this investigation was to provide an asymptotic understanding of the number of leaves of the minimal spanning tree of n points chosen at random from the unit square. To set this problem precisely, as well as to make clear the more general results which will be established, we begin with some notation.
By G = (V, E) we denote a connected graph with vertex set Vand edge set E. We also assume that there is a weight function w: E R which assigns a real number to each edge in E. A minimalspanning tree Tof G is a connected graph with vertex set V and edge set E' c E such that
1 w(e) = w(T)
CEE'

Received 9 May 1984; revision received 2 October 1986.
* Postal address: Program in Engineering Statistics, Princeton University, Princeton, NJ 08544, USA.
Research supported in part by National Science Foundation Grant DMS8414069. Postal address: AT&T Bell Laboratories, Murray Hill, NJ 07974, USA.
Postal address: Department of Statistics, CarnegieMellon University, Pittsburgh, PA 15213, USA.

809

810 J. MICHAEL STEELE, LAWRENCE A. SHEPP AND WILLIAM F. EDDY

is minimal, i.e. no connected subgraph of G has a smaller total edge weight than T = (V, E).
One probabilistic model of interest here is specified by taking V
(X,, X2,. . ., Xj where Xi, 1 :5 i :5 n, are independent random variables with the uniform distribution [0, 112, letting E consist of all (") pairs (i, j), 1 :5 i < 2
j :_5 n, and setting w (xi, xj) xi xj where the bars denote the usual Euclidean distance. We denote by MST(Xl, X2,. . ., Xn) a minimal spanning tree of X,, X2,. . ., X, under the weight function w. As this language acknowledges, there can be more than one minimal spanning tree for a given sample {xl, x2,. . ., x.}. However, for Xi with continuous support, the minimal spanning tree is unique with probability 1, since any two sets of n 1 edges will have different total lengths with probability 1.
Our main focus will be on the number of leaves (i.e., vertices of degree 1) in the minimal spanning tree determined by the uniform sample model. We also consider the more general problem of general vertex degrees of random vertices Xi, 1 :_5 i :_5 n, which are chosen according to an arbitrary density with support in R, d k 2. Still, the combinatorial and probabilistic essence of our problem is already to be found in the case of uniformly distributed random variables in the unit square.
The motivation for our investigation comes from several sources. First, there has already been interesting work done on the number ofleaves of a random tree where the random tree is less intricate than the Euclidean minimal spanning tree. Second, there is a close connection with previous work on subadditive Euclidean functionals. After introducing the notation for our more general results, it will be possible to give an explicit application of our main theorem to the theory of multivariate nonparametric hypothesis tests.
The earliest work on the number of leaves of a random tree is due to R~nyi (1959), where the specified model was that of choosing a tree Tat random uniformly from the set of all n` labeled trees with n vertices. If L,, denotes the number of leaves of such a tree, R6nyi (1959) established that

(1. 1) EL, nle as n oo

and also that L,, is asymptotically normal with approximate variance n(e 2)e1.
Najock and Heyde (1982) also established a linear growth rate for the expected number of leaves in the context of a certain class of (n 1)! rooted trees which arise in a linguistic application. Under their model for a random tree, they proved that

(1.2) lim nIELn
n_00

In contrast to the tradition of R6nyi (1959) and Najock and Heyde (1982), the investigation of minimal spanning trees has a strong geometric flavor. Also,

On the number of leaves of a Eudidean minimal spanning tree 811

the successful analytical treatment of such Euclidean trees rests on articulating an approximate selfsimilarity between a subset of the tree and the whole tree. This method is closely related to subadditive techniques for Euclidean functionals, and the present application to the number of leaves also extends the domain of problems to which subadditive methods are effective. Earlier work with the type of subadditive method used here always seemed to deal with the lengths determined by optimality conditions. (See e.g. Beardwood et al. (1959), Steele (1981a,b), or Hochbaum and Steele (1982)). For some understanding of the applications of minimal spanning trees in computer science, one can consult Bentley (1978), Chin (1978), Jung (1974), Kang (1977),, Kataiainen (1983), and Whitney (1972). For some examples of applications of minimal spanning trees in physical sciences and biology, one can consult Mallion (1975), Penny (1980), Romane (1977), and Wu (1977).
Three main results will be proved here, but the following readily digested theorem motivates and underlies the whole development.

Theorem 1. If Xi, 1 :_5 i < cc, are independent and uniformly distributed on [0, 1 f, then for L,, the number of leaves of the MST(Xl, X2,. X.), we have with probability 1,

(1.3) lim n'L, = a> 0.
n co

The exact value of the constant a is not known, but Monte Carlo simulation results suggest that a = 2/9 is a reasonable approximation. We will provide a proof that a > 0, but the only analytical bounds we can provide on a are very conservative. Further comments on the possibility that a exactly equals 2/9 are given in Section 6.
The conclusion of Theorem 1 can be extended to arbitrary dimension and arbitrary vertex degree. Naturally, the associated constants depend upon the dimension d and on the vertex degree.

Theorem 2. Let X,, 1 :5 i < cc, be independent and uniformly distributed on [0, 11', d ~~: 2. If Vk, = Vk(Xl, X2,. . ., Xj denotes the number of vertices of degree k in the MST(Xl, X2, XJ, then there are constants ak,d such that with probability 1

(1.4) lim n 'Vk, = ak,d n

for all k.

In the case of k = 1 and d 2, one has V1,n = L, and the constant al,2 is simply a, the constant of Theorem 1. One should note also that for each dimension d there is an integer Dd such that Dd is the largest possible degree of any vertex of any minimal spanning tree in R d SO that ak,d = 0 for k > Dd. It is easy to see that Dd < I/' for any d, but only a few values of Dd are known

1

812 J. MICHAEL STEELE, LAWRENCE A. SHEPP AND WILLIAM F. EDDY

exactly. For proofs that D2 = 6 and D3 = 13, one can consult Hilbert and CohnVossen (1952), p. 47, or Rogers (1964).
We are able to show a 1,d > 0 for all d 2: 2; but, since even the determination of Dd is open for large d, it is unreasonable to expect any determination Of ak,d for large k and d.
The third theorem shows that one can relax the assumption that the Xi, 1 ::5 i :_5 n, are uniformly distributed. Curiously, it does not seem possible to extend our results to completely arbitrary distributions. In particular, considerable mystery remains about the number of leaves of minimal spanning tree samples from a distribution with singular support. The difficulty of dealing with the contributions due to a singular component of the distribution of the X, is peculiar to the problem of vertex degrees and has not arisen in the earlier work on subadditive Euclidean functionals.
One interesting aspect of the limit theory of Vk, for absolutely continuous random variables with compact support is that the limit of n 'Vk, does not depend on the underlying distribution. In particular, we have the following result.

d
Theorem 3. If Xi, 1 :_5 i < oc are i.i.d. with a density f in R ' then with probability 1

(1.5) lim n t Vk, = ak,d for all k ~: 1, d 2!: 2.
n00

Before digging into the proof of these results, it seems worth giving more details on application of Theorem 3 to the theory of nonparametric multivariate tests. In particular, we consider the work of Friedman and Rafisky ( 19 79) which gives an elegant extension of the WaldWolfowitz runs test to multivariate samples by using the minimal spanning tree to suggest a proper analogue for the number of runs in a sequence. If one is given two samples of sizes n and
d
m from two distributions F, and FY in R ' one can test HO :F,= F, by constructing the minimal spanning tree of the joint sample, removing all the edges of the tree which join observations from different samples, and letting R denote the resulting number of subtrees. Small values of R would lead to the rejection of HO. The expectation of R does not depend on the degrees, and equals 1 + 2nm 1(n + m) just as in the classical test of Wald and Wolfowitz. In the general case, Friedman and Rafsky found the variance of R conditionally on C, the number of edge pairs that share a common node. If we let di be the degree of the ith vertex of the MST of the joint sample, then since ~ 2:,~+,m di
n + m 1 we have under HO that

1 n+m
(1.6) C = E d,(d, 1) = 1 n m + E k' V,, lm,k.
2 j1 2 k

Consequently, the results given here show that the conditional tests of

On the number of leaves ofa Eudidean minimal spanning tree 813

Friedman and Rafsky are asymptotically unconditional under HO. The complete details of this fact would take us away from the main issues, still Equations (1.5) and (1.6) should make the claim credible.
The proof of preceding theorems will be given in the next four sections. In Section 2 we collect some basic combinatorial observations. Section 3 proceeds to obtain the behavior of the expectations EVk, by using a Poisson embedding, subadditivity arguments, and the application of a Tauberian theorem.
The fourth section uses the EfronStein inequality to obtain a good bound on the variances of Vk,. Traditional Chebyshev and interpolation techniques are then used to complete the proof of Theorem 1 and Theorem 2. Section 5 shows how a lemma due to Strassen (1965) can be used to extend Theorem 2 first to densities with compact support, then to general densities.
Section 6 collects the information which is available on the constants ak,d The final section reviews some open problems and provides perspective on the way the asymptotic theory of vertex degrees fits into the theory of subadditive functionals and relates to the idea of fractals.

2. Combinatorial observations

The next four lemmas are free of any probability theory. They focus on the basic combinatorial geometry of minimal spanning trees.

Lemma 2. 1. If B is a subset of S = {x,,x2,. . ., xJ, and if e = {Xi, Xj} is an edge of MST(S) such that e c B, then e E MST(B).

Proof. Since eE=MST(S), there is a partition (A,Ac) of S such that e nA 00,e nAc=AO and

(2.1) 1 e 1 = min{ 1 e'l: el n A :~6 0 and e, n A' =,,' 0

(This is a traditional necessary and sufficient condition for an edge to be in a
MST.)
Since e C: B, we see A n B and A c n B is a nontrival partition of B. Because of (2. 1) the edge e will then satisfy

(2.2) 1 e 1 5 min{ 1 e'l: e' n A n B =A 0 and el n A, n B =,.,= 0}.

By the cited necessary and sufficient condition for an edge to be in the minimal spanning tree of B, we see that (2.2) guarantees e EE MST(B).

One important consequence of Lemma 2.1 is that it shows that only a few of the elements Of (Xl, X2,' . ', Xn} can have different degrees in the MST(xi, X2,' Xn, Xn + I) than they have in the smaller tree MST(xl, X2,' x,). This is made quantitative as follows.

d
Lemma 2.2. For any xi Ell R ' 1 :_5 i :5 n + 1, if w denotes the number of

1

814 J. MICHAEL STEELE, LAWRENCE A. SHEPP AND WILLIAM F. EDDY

elements Of {XI,X2,''',X,11}, which have different degrees in MST(xl, X2,' . ., x,,, x, , 1) and MST(xl, x2,. xJ, then w :5 4Dd 2.

Proof. We apply Lemma 2.1 where

B = {Xl, X2,'. .' Xn} C {Xl, X2, Xn + 1} ~ S

and observe that every edge of MST(S) which is not an edge of MST(B) must be incident to x, + 1. There are at most Dd such edges.
Since MST(S) has n edges and MST(B) has n 1 edges, we see therefore that there are at most 2Dd 1 edges of MST(S) and MST(B) which are not common to both MST(S) and MST(B). Since the edges which fail to be in MST(S) or MST(B) are incident to at most 4Dd 2 vertices, there are at most 4Dd 2 vertices which fail to have the same degree in MST(B) and MST(S).

One way in which Lemma 2.2 will be used is expressed in the following corollary.

Corollary. If {xl, x2,. x,,} and {x', x',. x' } are any two finite sub
1 2 m
sets of Rd, and if h is the cardinality of the symmetric difference of these sets, then there are at most 2h(4Dd 2) vertices which have different degrees in MST{xl, x2,. . ., x,,} and MST{x't, x2',. . , xm'}.

As given above, the proof of the corollary is immediate. With more thought, one can provide a bound of h (4Dd 1), but the essential point is the linearity in h.
Although we are concerned with the degrees of vertices, we will find it useful to have some information about the length of minimal spanning trees.

Lemma 2.3. There is a constant cl such that for any xi EE [0, Ild, 1 :5 i:5 n, there exists an i and j, 1 :5 i < j :_5 n, such that

(2.3) 1 xi xj 1 :_5 cIn 11d.

Proof. This is a consequence of a geometric pigeonhole argument. Suppose each of the points xi were the center of a ball of radius r. Such balls have volume codrd, where o)d is the volume of the unit d ball. If xi EE [0, 11 d, then the ball centered at xi must cover at least 2 dcod r d of [0, 1 ld; SO, if

(2.4) n 2 ~ d60drd > 1,

then two rballs must intersect. Thus, if there are n points in [0, 1 ] d' We see that some pair of points must be within a distance of 2r if inequality (2.4) holds. In other words, if there are n points in [0, 1 ],', we see that some pair of points must be closer than 4co,7 11dn lld, so the lemma is established with cl = 4C0,7 'd.

One easy consequence of the preceding lemma is that the length of a

On the number of leaves of a Eudidean minimal spanning tree 815

minimal spanning tree of n points in [0, 1]" cannot be too large. More precisely, one has the following result.

Lemma2A. For any n points {x,x2...,xJ c [0, 1]11, the minimal spanning tree has total length majorized by C2n (d 1)1d.

Proof. Let % be the maximum length of any minimal spanning tree of n points in [0, Ild , and consider a set {xl, x2,. . ., xn , 1} of size n + 1 for which m,+, is attained. By Lemma 2.3 there are xi and xj, 1 :5 i
1d
(2.5) Mn+l:5%+cIn1

By summing (2.5) and using the fact that m, = 0, we see

n1
< Ild (d 1)1d,
(2.6) c, E k = k1 which establishes the lemma with c2 = 2c,

3. Asymptoties of EVk,n

Instead of working directly with uniformly distributed random variables in [0, Ild, we will draw strength from the independence properties of the Poisson process in R`. The first observation is that if we let 0(t) denote the expected number of vertices of degree k of the MST of the points of the Poisson process in [0, tld, then by the usual formula for expectations in terms of conditional expectations,

(3.1) 0(t) (EVk,)t` exp( td )In!
n=0

where Vk, are exactly as defined before, i.e., Vk,n is the number of vertices of degree k in the MST(Xl, X2,. Xn) where Xi are independent uniform random variables on [0, 111.
To obtain asymptotic information about 0(t), we will first get an approximate recurrence relation for the function 4)(t). To aim toward that goal, we partition [0, t ]'into M = in' subcubes Qi of side tIm. We will let Q, denote the points of the Poisson process which are in Qi. By Lemma 2.1 we see that if e E= MST( UY,. 1 Q1') and e c Q,~, then e EE MST(Q1'). We therefore have the set inclusion

m m
(3.2) MST U Q) c U MST(Q1') U D

816 J. MICHAEL STEELE, LAWRENCE A. SHEPP AND WILLIAM F. EDDY

where D is the set of all edges of MST( Uly~, Q,) which intersect two different subcubes Qi and Q.
We now move toward showing that the number of vertices of degree k in MST( Q', Q) is majorized by the number of vertices of degree k in U m
j = 1 MST(Q,) plus a small error term. Since B Q' 1 MST(Q,) has fewer edges than A = MST( U,t 1 Q1), Equation (3.2) implies that the difference set Of Ulm= 1 MST(Q) and MST( U't, Q,) contains at most 2 1 D 1 edges, where 1 D 1 is the number of edges in D. This follows from the observations that A B = D and IB A 1 = 1 B n A'I = IB 1 IB n A 1 ::5 JA 1 1 B n A
IDI.
The number of vertices of degree k in MST( U,~' 1 Q) is bounded by the number of vertices of degree k in U,!' 1 MST(Q~) plus 4 1 D 1, since for a vertex degree to change, it must be incident to one of the edges in AL~,B, there are at most 2 1 D 1 such edges, and each edge is incident to two vertices. Taking expectations and summarizing in terms of 0, we have

(3.3) p(t):5 m"0(tlm) + 4E 1 D 1.

For us this will be an important inequality, but for it to be effective we need a
bound on E 1 D 1. We fix a real parameter 0 < A < t and consider the decompo
sition Qi = Ai U Bi, where Bi is the set of all points within A 1m of the boundary
of Qi and Ai = &. We note that U ' y 1 Bi has volume equal to p = t`
Md(t1M 2Alm )d = td (t 2A )d, and, hence, the expected number of
points of the Poisson process in Qt 1 B, is also p. Further, we note that if i :0
then for any x EE Ai and y EE Aj we have 1 x y 1 k 2Am 1.
The relations for bounding E 1 D 1 are now all in place. Any edge in D must either have at least one vertex in Qt 1 Bi or else have length at least 11m ` in order to span the distance between two distinct Ai's.
By N(x) we denote the number of edges of MST(U,y 1 Q~) of length at least x, and we note that xN(x) is bounded by the total length of the MST(Ult, Q~). By Lemma 2.4 we seethe expected length of MST(Qt, Q~)is bounded by c2EZ(` 1)11, where Z is a Poisson random variable with mean td. By Jensen's inequality EZ(d ~ 1 Ild is maj orized by td 1, so we have

(3.4) EN(x) :_5 C2 tdIX1.

We let T denote the set of points of the Poisson process which are in U Y
1 1 Bi, and note that every edge of D must either meet T or have length at least Ulm. Since no vertex of a minimal spanning tree meets more than Dd vertices, we have a basic bound on E 1 D 1,
E 1 D 1 < DdE 1 T 1 + EN(2.Z/m) < DdMd{(tIM)d (t1M 2Alm)d}

+ C2 td1(2A)1m.

Now, by simply letting A = 1, we see there is a constant c3 such that

1 1

On the number of leaves of a Eudidean minimal spanning tree 817

(3.5) E j D 1 :5 c3t" 'm for all t > 1.

Here we should note that there are choices of A which lead to sharper inequalities than (3.4), e.g. A = tIm ', but inequality (3.5) is quite simple and sharp enough. Combining the bound (3.5) with (3.3), we see that for all integers m > 1 and real t > 1, we have for c4 = 4C3 that

0(t):5 Mdo(tIM) + C4tdIM.
(3.6)

A key step in obtaining the asymptotics of EVk, will be to show that any continuous function satisfying (3.6) must satisfy the asymptotic relationship

(3.7) p(t) _ ytd

for some constant y.
We now prove the relation (3.7). If we let V1(t) = (p(t)td, then (3.6) is equivalent to

(3.8) VI(Mt):~5 Y1W + C4t'.

Just by setting t = 2 and letting m oo, we see

(3.9) 0 :5 y = lim inf Vi(t) < Vi(2) + c4 < cc.
t00

Now, for any e > 0 we can find an open interval (a, b) such that Vi(t) < y + e for all t CE (a, b), and we can also choose the value a sufficiently large so that c4a 1 < e. By (3.8) this means

(3.10) Vi(mt):_5 y + 2e

for all positive integers m and all t EE (a, b). If we let I,, = (ma, mb), we see that (3. 10) says Vi(t) :5 y + 2e for all t CE I,,. Now, since I,, + 1 and I,, intersect for all m such that (m + 1)a mb, we see U'., I contains the infinite interval
m
[to, ce), with to = a ra (b a) 'I, so

(3.11) V(t):5 y + 2e for all t i; to.

Since 7 was defined as the limit infirnum of Vi, the fact that e > 0 was arbitrary in inequality (3.11) says that lirn,,, vi(t) = y.
The asymptotic behavior of E Vk, can be extracted from (3.7) by means of a Tauberian argument. An easy Tauberian theorem to apply is the following.

Lemma 3.1 (Schmidt's BorelTauber theorem). If f is any real function defined on the integers and satisfying the Tauberian condition

(3.12) lim lim inf min {f(m) f(n)j > 0
e 0+ n,c n_
then the limiting relationship

818 J. MICHAEL STEELE, LAWRENCE A. SHEPP AND WILLIAM F. EDDY

(3.13) exp( A)Af(n)M! c as A oo

implies

f(n) c as n c.

Here, of course, the notation lim,0, g(e) means the limit of g(e) as e 0 for e > 0. This result due to Schmidt (1925) can be found with a modern probabilistic proof in Bingham (198 1), cf. Corollary to Theorem 3a, p. 224.
The asymptotic relationship 0(t) _ ytd as t oo can now be pressed to yield information about EVk, as n cc. In particular, we have the following result.

Lemma 3.2. There are constants ak,d depending only on k and d such that

(3.14) E Vk, ak,dn as n cc.

Proof. If we let x = td in Equation (3.7) and multiply by ex, we have

X Xn
(3.15) E E Vk, yxex.
,0 n!

Integration of an asymptotic series preserves that relation, so integrating (3.15) and dividing by xex gives

(3.16) 1 ex Xn EVk, y.
n 0 (n + 1)!

By taking f(n) = (n + 1) `EVk,n, we will be able to complete the proof of (3.14) by verifying the hypothesis of Lemma 3.1 given by (3,12). If n :_5 m :5
112
n + en ' we have by Lemma 2.2 that

112
Vk,m Vk, 4Djen

so, we also have

f(m) f(n) = (m + 1) `E Vk, (n + 1) `E Vk,n

>= (n + en 112 + 1) 1 {E Vk, 4Dd en 112} (n + 1) `E Vk,n

(3.17) E (Vk,n)(n + en 112 + 1) 1(n + 1) 'en 112

4Dden 112 (n + en 112 + 1) ~ 1.

Since EVk, :_5 n, the last expression in (3.11) shows without effort that

lim inf min { f(m) f(n)) i 0,
n,, n;Smgn+en"2

establishing (3.12). Finally, we can invoke Lemma 3. 1, and thus complete the. proof of the relation (3.14) with ak,d = Y.

On the number of leaves ofa Eudidean minimal spanning tree 819

4. Almost sure behavior

Now that we know E Vk,n ak,dn, we can obtain the almost sure relation Vk,n ak,dn by obtaining information about the variance of Vk,n and then using BorelCantelli and interpolation arguments. First we recall as a lemma the inequality of Efron and Stein (1981), which essentially says that Tukey's jackknife estimate of variance is conservative.

Lemma 4. 1. If h is any symmetric funcion of n 1 variables and we set
hi = h (X,, X2,. Xi 1, Xi + Xn) and h = n 1 2:9P_ 1 hi, then
n
(4.1) Var{h (X,, X2,. Xn E (hi 2.

The EfronStein inequality makes light work of bounding Var Vk,n

Lemma 4.2. There is a constant bd such that for all n > 1,

(4.2) Var Vk,, :5 bdn.

Proof. If we let h denote the number of vertices of degree k in the minimal spanning tree of {X,, X2,. . ., Xn }, then h is certainly symmetric. We also note that to replace h in E,P_ 1 (hi h) 2 by Vk,, will only make the sum larger, so n
(4.3) Var Vk,n 1 < E E (hi K,X.
i1

Now by Lemma 2.2, 1 hi Vk,n 1 :5 4Dd, so (4.3) implies

Var Vk, 1 :5 n 16D' d

2
and this inequality implies (4.2) with bd = 32Dd.
We now have all the tools needed to complete the proof of Theorem 2 (and to obtain Theorem 1 as a corollary). By Lemma 4.2 we see that for m, = n2, we have for any e > 0

m 0
E 'P( 1 Mn'K^ ak,d 1 > 8) :5 E (eM,.) 2(bdMn) < 00;
n=l n~1

so, by the BorelCantelli lemma, we have proved that Vk,m. M, ak,dwith probability 1. Now for any Mn :'S M < Mn + 1, we have m m, :5 3n < 3 m 112 whence by Lemma 2.2 we have

M Vk,m Mn 1 Vk,Pn, (M Mn ') Vk,~ + Mn '(Vk,m Vk^)
< M 112) 1 112) '4DA3m 112).
(m 3m 1 + (m 3m

Letting in x and using the fact that Vk^ M,ak,d therefore implies that
Vk,m M ak,d holds with probability 1. This completes the proof of Theorem 2.

820 J. MICHAEL STEELE, LAWRENCE A. SHEPP AND WILLIAM F. EDDY

5. Extension to general densities

The proof of Theorem 2 depended heavily on the assumption that the points were uniformly distributed. In this section we will extend the result to arbitrary densities of R'.
We first treat the case of densities with bounded support, and we begin by constructing a family of densities f J), which we will call thefamily of blocked densities. From the construction of the family of blocked densities, we will see that, under the L' norm, it will be dense in the set of all densities on the unit cube [0, 11 d. We first partition the cube into m' cubical cells {Qi }, each cell of edge length m`. Now, within each cell Qi, we define a centered subcell Qi consisting of all points of Qi which are further than (5 from every face of Qi where ~ is a constant satisfying m ` > 26 > 0. If for some m, (5, and fl, we have (1)f is constant and bounded below by fl > 0 on each Qi, 1 < i < m', and (2)f is 0 on U 1 ji g m, {Qi n Q2, we sayfis a blocked density with parameters m, J, and fl.

Lemma 5. 1. If X,,. X, are i.i.d. with a blocked density, then with probability 1 we have

(5.1) lim n 'Vk, = ak,d.
noo

Before beginning the proof of Lemma 5. 1, it will be useful to collect a few geometric facts. First, if Q and Q' are subcubes of [0, 11 d which have edge length r and which share a common face, then the Pythagorean theorem shows that any two points of Q and Q' are at most a distance apart of r(d + 3),`. Second, if a cube like [0, 11 d is divided into Sd labeled subcells Cj of edge s ', then there is a permutation a: { 1, 2,. . ., sd} { 1, 2,. . .' Sd} such that for all 1 :5 i < Sd, the cells C.(j) and C.(i + 1) share a common face. Third, if we define a graph G = (V, E) by taking a vertex set V = {yl, y2,. . ., y.} and edge set E = {e = {yi, yj} : 1 yi yj 1 < J}, and if G is connected, then the MST(yl, y2,.. ., y.) is a subgraph of G.
We can now begin the proof of Lemma 5. 1. The first observation is a structural one which can be given as follows: there is a random integerr such that
(1) T < oo with probability 1, and
(2) for all n c, the MST(Xl, X2,. XJ is composed of the union of MST({Xl, X2,. X.} n Q) over 1 < i < Md, together with exactly m' 1 edges which join elements of {X,, X2'. . ., XJ that are in different cells Qi
To prove the existence of r we decompose each joi into k' subcells where k is an integer satisfying k 'm '(d + 3) 112 < J. We then define z to be the least value of n such that for all 1 :5 i :5 Md each of the k' subcells of Qi is occupied by at least one element of {X,, X2,. XJ. Standard multinomial bounds

On the number of leaves of a Eudidean minimal spanning tree 821

(and the fact that fl > 0) are enough to show Ez < cc, so r is certainly finite with probability 1.
To verify the second property required of z, we consider the graph G
(V, E) defined by V = {X, X2,. . ., Xn} and E {e = {xi, xj}: 1 xi xj 1 < 8}. By the relation between k and 3, there will be an edge of E between any two Xi's in the same subcells or in two subcells which share a face. By the definition of r and applying the second and third geometrical facts, we see that the union of MST({Xl, X2,. xn} n Q,), 1 < i :5 Md, is a subgraph of G. Since G has no edges which meet two different Qi, we see that an algorithmic construction of MST(Xl, X2,. . ., XJ for n >= r will first construct complete minimal spanning trees on each of the point sets {X,, X2,. . ., XJ n Qi. Since at the time of completion of these Md trees there are m d connected components in the forest which is created by the edgesordered algorithm, we see there are m` 1 edges greater than 8 in MST(X~, X2,.. ., X,). Since for the Md sets, Qi n {X,, X2,. . ., XJ to be joined in a tree requires at least Md 1 edges which hit two different Qi's, we see that for n k_ r there are exctly Md 1 edges of MST(Xl, X2,. . JJ which join two distinct Qi n {X,, X2,. XJ. We have thus verified the second required property of z.
We now have all of the tools needed to complete the proof of Lemma 5. 1. If we let A denote the set of edges of MST(Xl, X2,. . ., XJ which intersect two Ts, then for n k_ r we have 1 A 1 = Md 1. The number of vertices which have different degrees in the tree MST(Xl, X2,. Xn) than in the forest F defined by

md
(5.2) F = U MSTQi n {X,, X2,. XJ)
i_l is therefore at most 2(Md 1). Now, if hi denotes the value off on Qi, then

d
Ni =_ 1 Qi n {X,, X2,. XJ nhi(l 2A)1m '

and we have by Theorem 2 and the law of large numbers that with probability 1, the numbers of vertices of degree k in F is asymptotic to

Md Md
Cek,dNi ~ ak,d E Ni ak,dn f(x)dx.
i=1 f

This completes the proof of Lemma 5. 1.

To move from the class of blocked densities which are treated by Lemma 5.1 to the general densities of Lemma 5.2, we use a lemma from Strassen (1965).

Lemina 5.2 (Strassen). Suppose M, and P2 are probability measures on a bounded subset of R d satisfying yl(A) :5,U2(A) + e for some e > 0 and all closed

822 J. MICHAEL STEELE, LAWRENCE A. SHEPP AND WILLIAM F. EDDY

A. Then there is a probability measure v on R' X R' satisfying v(A, Rd) mi(A), v(Rd , A) =,U2(A), and v (B) :5 e where B = {(x, x') : x =i6: xl}.

To complete the proof of Theorem 3, we Choose an e > 0 and let f' denote a blocked density such that 11 f f 11 :5 e. Lety, be the measure with density f and P2 be the measure with density f, and define an independent sequence of random vectors {(Xi, Xi)) with distribution given by v from Strassen's lemma. By the law of large numbers, we then have with probability 1 that

n
lim sup n 1 (X =?~= X1)
noo

If we let Vk, denote the number of vertices of degree k in the
MST{XlI, X21,. X,}, we then have by the corollary to Lemma 2.2 that

lim sup n 'I Vk,, Vk,n 8Dde.
n00

By Lemma 4. 1, we have already seen that with probability 1 Vk, nak,d; SO, by the arbitrariness of e > 0, we also have with probability 1 that Vk, nak,d since there was nothing special about [0, lld except its boundedness. The proof of Theorem 3 is thus complete in the case of random variables with compact support.
In fact, there is almost nothing left to show that Theorem 3 holds for arbitrary densities on R d. If f is any given density, then there is a density fl which has compact support and such that ii f f 11 1 < e. We can then argue exactly as before in the passage from blocked densities on [0, 1] d to general densities on [0, 1]d; one just uses Strassen's Lemma and Lemma 2.2.

6. Information on the constants ak,d

The first task is to show that the constant a of Theorem 1 is strictly positive. It will be clear from the proof that for all d::, 2, we also have al,d > 0.
Because of the relation 0(t) _ 7td given by (3.7) and the identification
= ak,d made in Lemma 3.2, it will suffice to prove (in d = 2) that if L, denotes the number of leaves of the minimal spanning tree M, of the points of the uniform Poisson process on [0, t], then there is a constant ao such that ao > 0 and

(6.1) EL, k aot2

for all sufficiently large t.
To prove (6. 1) we will show that for any square S of area 1, there is a positive probability p that S contains a leaf of M, Since [0, t contains at least lt J1 such disjoint squares, we would then have

On the number ofleaves of a Euclidean minimal spanning tree 823
(6.2) EL, > pltJ2,

and (6.2) is sufficient to show (6. 1).
We may as well consider S [0, 1 ] 2 and partition S into M2 subcells, which we label as Qjj, 1 :5 i :5 m, 1 j 5 m, where we use the natural lattice point labels. When m is odd, S has a unique center cell G + 1,k + 1, if M = 2k + 1, and there are 4m 4 boundary cells Qij which share a face with S. We now let q be the probability that the center cell and all boundary cells have exactly one point of the Poisson process, but all of the other subcells of S are empty. Geometrical arguments which are now quite familiar give us that the point in the center cell must have degree 1 if m k_ 7. This gives us that p k_ q > 0, so (6. 1) is established.
Since the bounds on a which are provided by constructions like the one just given are so intrinsically crude, we do not pursue the numerical evaluation of
q.
It seems hopeless to find an analytic approach that will determine the values Of ak,d. Consequently, we have performed some limited computer simulations for d = 2 in an effort to begin to understand the behavior of the ak,2. In Table 1 we report the results. The size of the simulated tree is denoted by n, and for
k
each n = 2 , 4:_5 k <' 16, 20 MST were simulated. In our estimates for ak,2, the &j were taken to be the average fraction of observed vertices of degree 1 j 5.

TABLE 1
Monte Carlo estimate of frequencies

Sample size Vertex degrees

n a, a, &, 1, a,
16 0.303 0.528 0.159 0.009 0.000
32 0.266 0.542 0.181 0.011 0.000
64 0.243 0.555 0.191 0.010 0.000
128 0.223 0.575 0.197 0.005 0.000
256 0.226 0.561 0.206 0.006 0.000
512 0.223 0.564 0.208 0.006 0.000
1024 0.224 0.562 0.205 0.009 0.000
2048 0.223 0.563 0.207 0.007 0.000
4096 0.221 0.566 0.206 0.007 0.000
8192 0.221 0.565 0.206 0.008 0.000
16384 0.221 0,566 0.206 0.007 0.000
32768 0.221 0.566 0.206 0.007 0.000
65536 0.221 0.566 0.206 0.007 0.000

A vertex of degree 5 is obviously very rare. Still, they have been observed in

824 J. MICHAEL STEELE, LAWRENCE A. SHEPP AND WILLIAM F. EDDY

trees of 8K and 16K vertices. Another point to be made by the simulation is that vertex frequencies seem to change very little once n is as big As 100.
Probably the most intriguing speculation to emerge out of the Monte Carlo estimates is that a = a,,2 (the proportion of leaves) is decently approximated by 2/9. The simulations do not strongly support the possibility that a exactly equals 2/9; but, in this connection, it is still interesting to note that Prodinger (1986) established that, if h(') denotes the expected height of the dth highest
n
leaf of a random planted plane tree, then for all e > 0,

h(d) 1 _ 2 d Qs(2s + 2
(6.3) n nn ~1 1) + 0(n").
2 3,, 9 S

We see no way to connect the two models, but the coincidental appearance of the unusual fraction 2/9 seems remarkable enough to mention.
Some caution should be given concerning any conjecture that a = 2/9. In fact, before careful simulations were done, we felt the hypothesis a = 0 was reasonable. This speculation arose because we erroneously thought that leaves had to be near the edge of the point cloud, not in the middle. This false start made clear for us that simulation and analogy are not always trustworthy guides in this area.
Roberts (1968), (1969) give some simulation results which are related to those we have done. In particular, Roberts (1968) gives results concerning the average length of a branch of the MST when the data are sampled uniformly from the unit sphere in two and three dimensions. The results of Roberts (1969) give interesting information concerning the probability that a point of a Poisson process is the nearest neighbor of n other points.

7. Concluding remarks

As the last section reveals, there are many open issues concerning the constants al,,d. Progress on these problems would be very interesting, but such progress is not likely to be forthcoming. For that reason, we focus here on the interesting analytical problems which remain concerning Vk, in view of the relative completeness of our analysis assuming the Xi are absolutely continuous random variables.
In the first place, we should acknowledge that we have treated a result like the strong law of large numbers, yet we have no knowledge concerning a possible central limit theorem. This is an inversion of traditional developments, but there seems to be serious difficulty in developing a central limit theory for nonlinear functionals like our Vk,
A second mystery concerns the behavior of Vk, for singular measures. To be
X~2) 2:jt 3 j, where ~i,j and specific, suppose XiM = 2:jt 1 ~ij 3 j and
are identically distributed, jointly independent sequences such that

i

1

On the number of leaves of a Euclidean minimal spanning tree

Xi = (X~1), A7i(2%
P(~ij = 0) = P(~i,j = 2) = ~1 Now, taking we see the Xi are
independent, identically distributed random variables in the singular support
in [0, lf. For of the subadditive Euclidean functions as defined in Steele
(1981a), suchasthelength of the shortest.tourof the points {X,, X2,. . ., Xn} or
the length of the minimal matching of {X,, X2,. . ., XJ, we have no trouble
dealing with singular random variable's or random variables with a singular
part. For those functionals, the singular part makes no contribution.
The issue with the Vk,n functional is quite distinct. We suspect that for the Xi, 1 5 i < oc, defined above, one has Vk,n ckn for some ck > 0. This is in rather surprising contrast to the other parts of the theory of subadditive functionals. There is also an interesting second layer of subtlety which rests on the fact there is no reason to suppose that ck equals the ak,2 of the absolutely continuous case.
The question of the number of leaves of a minimal spanning tree of a random sample from a singular distribution might well have a close relation to the fractal nature of the support. If that connection is born out, it will provide yet another instance where selfsimilarity (or approximate selfsimilarity) ties into fractal geometry. For several examples of more traditional probability theory which mixes the elements of selfsimilarity, fractals, and singular measures, one can consult the interesting paper of Hughes et al. (1982). For much interesting speculation, the classic source is, of course, Mandelbrot (1977).

Acknowledgernent

Thanks are due to Timothy Snyder for numerous valuable comments on this revised manuscript.

References

BEARDWOOD, J., HALTON, J. H., AND HAMMERSLEY, J. M. (1959) The shortest path through many points. Proc. Camb. Phil. Soc. 55, 299327.
BENTLEY, J. L. (1978) Fast algorithms for constructing minimal spanning trees in coordinate spaces. IEEE Comput. 27, 97105.
BINGHAM, N. H. (1981) Tauberian theorems and the central limit theorem. Ann. Prob. 9, 221231.
DE BRUUN, N. D. (196 1) Asymptotic Methods in Analysis, 2nd edn. Wiley Interscience, New York.
CHIN, F. (1978) Algorithms for updating minimal spanning trees. J. Comput. Syst. 16, 333344.
EFRON, B. AND STEIN, C. (198 1) The jackknife estimate ofvariance. Ann. Statist. 9, 586596.
FRIEDMAN, J. H. AND RAFSKY, L. C. (1979) Multivariate generalizations ofthe Wolfowitz and Smirnov twosample tests. Ann. Statist. 7, 697717.
HILBERT, D. AND COHNVOSSEN, S. (1952) Geomelry and the Imagination. Chelsea, New York.
HOC~IBAUm, D. AND STEFLE, J. M. (1982) Steinhaus' geometric location problem for random samples in the plane. Adv. Appl. Prob. 14, 5567.

826 J. MICHAEL STEELE, LAWRENCE A. SHEPP AND WILLIAM F. EDDY

HUGHES, B. D., MONTROLL, B. W., AND SHLESINGER, M. F. (1982) Fractal random walks. J. Statist. Phys. 28,111128.
JUNG, H.A. (1974) Determination of minimal paths and spanning trees in graphs. Computing 13, 249.
KANG, A. N. C. (1977) Storage reduction through minimal spanning trees and spanning forests. IEFE Comput. 26, 425434.
KATMAINEN, J. (1983) On the worst case of a minimal spanning tree algorithm for Euclidean space. BIT 23, 28.
MALLION, R. B. (1975) Number of spanning trees in a molecular graph. Chem. Phys. Lett. 36, 170174.
MANDELBROT, B. B, (1977) Fractals, Form, Chance, and Dimension. W. H. Freeman, San Francisco.
NAJOCK, D. AND HEYDE, C. C. (1982) On the number of terminal vertices in certain random trees with an application to stemma construction in philology. J. Appl. Prob. 19, 675680.
PENNY, D. (1980) Techniques for the verification of minimal phylogenetic trees illustrated with 10 mammalian hemoglobin sequences. Biochem. J. 187, 6574.
PRODINGER, H. (1986) The average height of the dth highest leaf of a planted tree. Networks 16,6775.
PYNN, C. (1972) Improved algorithm for construction of minimal spanning trees. Elect. Lett. 8, 143.
RtNYI, A. (19 59) Some remarks on the theory of random trees. WA Mat. Ket. Imt. Kozl. 4, 7 38 5 (also Selected Papers ofA]fr~d Rdnyi, Akad6miai Kaid6, Budapest).
ROBERTs, F. D. K. (1968) Random minimal trees. Biometrika 55, 255~258.
ROBERTs, F. D. K. (1969) Nearest neighbors in a Poisson ensemble. Biometrika 54, 401406.
RoGERS, C. A. (1964) Packing and Covering. Cambridge University Press, Cambridge.
RoMANE, F. (1977) Possible use of minimum spanning tree in phytoecology. Vegetation 33, 99106.
SCHMIDT, R. (1925) Ober das Borelsche Summierungsverfahren. Schrifiten d. Konigsberger gel. Geselischaft 1, 205256.
STEELE, J. M. (198 1 a) Subadditive Euclidean functionals and nonlinear growth in geometric probability. Ann. Prob. 9, 365376.
STEELE, J. M. (1981b) Complete convergence of short paths and Karp's algorithm for the TSP. Math. Operat. Res. 6, 374378.
STRASSEN, V. (1965) The existence of probability measures with given marginals. A nn. Math. Statist. 36, 423439.
WALD, A. AND WOLFOWITZ, J. (1940) On a test whether twosamples are from the same population. Ann. Math. Statist. 11, 147162.
WHITNEY, V. K. M. (1972) Minimal spanning tree. Comm. ACM 15, 273.
Wu, F. Y. (1977) Number of spanning trees on a lattice. J. Phys. A 10, L I 13L 115.

1 1