Minimal Spanning Trees for Graphs with
Random Edge Lengths

J. Michael Steele

ABSTRACT: The theory of the minimal spanning tree (MST) of a connected graph whose edges are assigned lengths according to independent identically distributed random variables is developed from two directions. First, it is shown how the Tutte polynomial for a connected graph can be used to provide an exact formula for the length of the minimal spanning tree under the model of uniformly distributed edge lengths. Second, it is shown how the theory of local weak convergence provides a systematic approach to the asymptotic theory of the length of the MST and related power sums. Consequences of these investigations include (1) the exact rational determination of the expected length of the MST for the complete graph $K_n$ for $2 \le n \le 9$ and (2) refinements of the results of Penrose (1998) for the MST of the $d$-cube and results of Beveridge, Frieze, and McDiarmid (1998) and Frieze, Ruszinkó, and Thoma (2000) for graphs with modest expansion properties. In most cases, the results reviewed here have not reached their final form, and they should be viewed as part of work in progress.

1 Introduction and Main Results

Consider a finite, connected, simple graph $G$ with vertex set $v(G)$, and for each element $e$ of the edge set $e(G)$ let $\xi_e$ denote a nonnegative random variable that one views as the length of the edge $e$. The random variables $\{\xi_e : e \in e(G)\}$ are assumed to be independent with a common distribution $F$, and the quantities that are of central concern here are the total length of the minimal spanning tree (MST) of $G$,

$$L_{\mathrm{MST}}(G) = \sum_{e \in G} \xi_e \, \mathbb{1}(e \in \mathrm{MST}(G)),$$

and the associated sums for power-weighted edges

$$L^{\alpha}_{\mathrm{MST}}(G) = \sum_{e \in G} \xi_e^{\alpha} \, \mathbb{1}(e \in \mathrm{MST}(G)).$$

The first of these sums has been studied extensively since Frieze (1985) showed that for edge lengths with the uniform distribution on $[0, 1]$ one has

$$E[L_{\mathrm{MST}}(K_n)] \to \zeta(3) = \sum_{k=1}^{\infty} \frac{1}{k^3} = 1.202\ldots \quad \text{as } n \to \infty, \tag{1}$$

where $K_n$ is the complete graph on $n$ vertices.
In particular, this result has now been refined or extended by numerous investigations. There are relaxations of the distributional assumption by Steele (1987), extensions to the bipartite MST expectations $E[L_{\mathrm{MST}}(K_{n,n})]$ by Frieze and McDiarmid (1989), and even the development of a central limit theorem for $L_{\mathrm{MST}}(K_n)$ by Janson (1995). More recently, the basic limit (1) has been extended


to larger classes of graphs, including an extension to the $d$-cube $Q_d$ by Penrose (1998) and extensions to general classes of "modestly expansive" regular graphs by Beveridge, Frieze, and McDiarmid (1998) and Frieze, Ruszinkó, and Thoma (2000).
The path taken here diverges from this earlier work in several respects, but one key difference comes from the focus on exact calculations, rather than asymptotic relations. Specifically, we provide a formula for $E[L_{\mathrm{MST}}(G)]$ that permits one to determine the exact rational value of $E[L_{\mathrm{MST}}(G)]$ for many concrete choices of $G$.
We also pursue exact calculations for a certain infinite graph $T$ that is in a sense the universal limit for any sequence of randomly rooted, independently weighted finite graphs whose vertex degrees go to infinity. This calculation then permits us to provide a necessary and sufficient condition for the determination of the asymptotic behavior of $E[L^{\alpha}_{\mathrm{MST}}(G_n)]$ for a large class of sequences of graphs.
After framing our main results more fully in the next few paragraphs, we turn to the proofs. In particular, Section 2 develops an exact formula for $E[L_{\mathrm{MST}}(G)]$ for finite $G$, and then in Section 3 we calculate the expected length per vertex of a special subgraph of $T$ that holds the key to many of the limit theorems for the MST. Section 4 then addresses some foundational results that connect calculations on $T$ to calculations for sequences of finite randomly rooted graphs, and these results are subsequently applied to complete the proof of the basic limit theorem for $E[L^{\alpha}_{\mathrm{MST}}(G_n)]$. Section 4 also examines a critical example that serves to illustrate the role of uniform integrability in the limit theory of the MST. The final section reviews some open problems and briefly speculates on the possibilities for further development.

A Formula for $E[L_{\mathrm{MST}}(G)]$

Theorem 1.1 If $G$ is a finite connected graph and the Tutte polynomial¹ of $G$ is $T(G; x, y)$, then for independent edge lengths that are uniformly distributed on $[0, 1]$, one has

$$E[L_{\mathrm{MST}}(G)] = \int_0^1 \frac{1-p}{p} \, \frac{T_x(G;\, 1/p,\, 1/(1-p))}{T(G;\, 1/p,\, 1/(1-p))} \, dp \tag{2}$$

where $T_x(G; x, y)$ denotes the partial derivative of $T(G; x, y)$ with respect to $x$.

We illustrate the efficacy of this formula by providing what we believe to be the first explicit computations of $E[L_{\mathrm{MST}}(K_n)]$ for finite values of $n$ that go beyond the trivial $n = 2$ and the easy $n = 3$. Specifically, we use this formula to calculate $E[L_{\mathrm{MST}}(K_n)]$ for $2 \le n \le 9$, and these calculations lead to several compelling conjectures.

Asymptotic Consequences of an Exact Calculation

If $d_n$, $n = 1, 2, \ldots$ is a sequence of integers such that $d_n \to \infty$ as $n \to \infty$, we say that the sequence of graphs $G_n$, $n = 1, 2, \ldots$ is nearly regular provided that the maximum degree $\Delta(G_n)$ and the minimum degree $\delta(G_n)$ satisfy the degree conditions
$$\Delta(G_n) \sim d_n \quad \text{and} \quad \delta(G_n) \sim d_n \quad \text{as } n \to \infty. \tag{3}$$

¹Subsection 2.2 provides a brief but friendly development of the necessary background on the Tutte polynomial, beginning with its definition.


We also relax our assumption on $F(x) = P(\xi_e \le x)$, and instead of requiring that the $\xi_e$ be uniformly distributed on $[0, 1]$ we only require

$$F(0) = 0 \quad \text{and} \quad F(x) = x + o(x) \quad \text{as } x \to 0, \tag{4}$$

a condition that simultaneously covers the uniform distribution on $[0, 1]$ and the exponential distribution with mean one, our two leading cases. Next, we consider the power-weighted analog of the MST,

$$L^{\alpha}_{\mathrm{MST}}(G_n) = \sum_{e} \xi_e^{\alpha} \, \mathbb{1}(e \in \mathrm{MST}(G_n)),$$

and we introduce a new sequence

$$Y_{\alpha}(G_n) = \sum_{e} (d_n \xi_e)^{\alpha} \, \mathbb{1}(e \in \mathrm{MST}(G_n) \text{ and } R(G_n) \in e), \tag{5}$$

where $R(G_n)$ denotes an element of the vertex set $v(G_n)$ that is chosen independently according to the uniform distribution. While $Y_{\alpha}(G_n)$ may not seem natural at first, we will see shortly that its expectation determines the expectation of $L^{\alpha}_{\mathrm{MST}}$; moreover, there are major technical benefits to working with $Y_{\alpha}(G_n)$. In particular, $Y_{\alpha}(G_n)$ satisfies a limit law that requires nothing more of the graph sequence $\{G_n\}$ than those features that one needs for the definition of $Y_{\alpha}(G_n)$ and the statement of the limit. As an easy consequence of the general theory of local weak convergence and an exact calculation on a special infinite tree, one obtains the asymptotic behavior of $E[L^{\alpha}_{\mathrm{MST}}(G_n)]$.

Theorem 1.2 If $G_n$, $n = 1, 2, \ldots$ is a sequence of connected graphs that are nearly regular in the sense of (3), then for any $0 < \alpha < \infty$, one has

$$E[L^{\alpha}_{\mathrm{MST}}(G_n)] \sim \Gamma(1+\alpha)\,\zeta(2+\alpha)\,|v(G_n)|\, d_n^{-\alpha} \quad \text{as } n \to \infty, \tag{6}$$

if and only if the sequence

$$\{\, Y_{\alpha}(G_n) : n = 1, 2, \ldots \,\} \quad \text{is uniformly integrable.}$$

Thus, one finds that even a crude qualitative measure of the good behavior of the sequence $\{Y_{\alpha}(G_n) : n = 1, 2, \ldots\}$ is enough to guarantee the regular asymptotic behavior of $E[L^{\alpha}_{\mathrm{MST}}(G_n)]$. Moreover, the good behavior of $\{Y_{\alpha}(G_n)\}$ turns out to be necessary, so one finds a strong hint that this sequence may be more fundamental to the theory of the MST than first impressions might suggest.
As a quick illustration of this last result, we should note that if one takes $\alpha = 1$ and takes $G_n$ to be $Q_n$, the $n$-cube, then it implies

$$E[L_{\mathrm{MST}}(Q_n)] \sim \frac{2^n}{n}\, \zeta(3) \quad \text{as } n \to \infty, \tag{7}$$

a limit which was found by Penrose (1998) by different means. A more novel consequence of the limit (6) comes from taking $\alpha = 1/2$ and $\alpha = 3/2$ to find that the limit (7) is nicely sandwiched between

$$E[L^{1/2}_{\mathrm{MST}}(Q_n)] \sim \frac{\sqrt{\pi}}{2}\, \zeta(5/2)\, \frac{2^n}{n^{1/2}} \quad \text{and} \quad E[L^{3/2}_{\mathrm{MST}}(Q_n)] \sim \frac{3\sqrt{\pi}}{4}\, \zeta(7/2)\, \frac{2^n}{n^{3/2}}.$$


Despite the large swath of ground that Theorem 1.2 covers, one should not lose sight of the fact that it really is a simple corollary of a more general result from the theory of local weak convergence that has its roots in Aldous (1992) and Aldous (2001). In particular, local weak convergence to the PWIT is a fundamental part of those papers, and the modest generalization of the PWIT Limit Theorem developed here in Theorem 4.2 is best viewed as part of a longer-term effort to make the techniques introduced in Aldous (1992) and Aldous (2001) more easily accessible and more readily applied.

2 Exact Calculations for Finite Graphs

The program begins with the derivation of an exact formula for the expectation $E[L_{\mathrm{MST}}(G)]$ under the uniform model for the edge lengths. Here the first step is to derive a relationship between the random variable $L_{\mathrm{MST}}(G)$ and an integral of another random variable that measures the connectedness of $G$ when one uses just the edges of length not greater than $p$, $0 < p \le 1$. Versions of this relationship go back at least to Avram and Bertsimas (1992), and in some way or another it has had a role in most recent investigations of the MST, including the central limit theorem of Janson (1999) and the general graph MST results of Beveridge, Frieze, and McDiarmid (1998) and Frieze, Ruszinkó, and Thoma (2000).

2.1 Length of the MST as an Integral
For any finite graph $G$ and any subset $A$ of the edge set $e(G)$, we write $k(G, A)$ for the number of connected components of the graph with vertex set $v(G)$ and edge set $A$. If each edge $e \in G$ is assigned length $\xi_e$, then we also write

$$e_t(G) = \{e \in e(G) : \xi_e \le t\},$$

and we let
$$N_{\mathrm{MST}}(G, t) = \sum_{e \in \mathrm{MST}(G)} \mathbb{1}(\xi_e \le t),$$

so $N_{\mathrm{MST}}(G, t)$ denotes the number of edges of the MST of $G$ that are elements of $e_t(G)$. Now, if $G$ is a connected graph, then by counting the number of elements of $e_t(G)$ in each connected component of $(G, e_t(G))$ one finds

$$N_{\mathrm{MST}}(G, t) + k(G, e_t(G)) = n,$$

so we can simply compute

$$L_{\mathrm{MST}}(G) = \sum_{e \in G} \xi_e \, \mathbb{1}(e \in \mathrm{MST}(G)) = \int_0^1 \sum_{e \in G} \mathbb{1}(t < \xi_e,\ e \in \mathrm{MST}(G))\, dt$$
$$= \int_0^1 \bigl(n - 1 - N_{\mathrm{MST}}(G, t)\bigr)\, dt = \int_0^1 \{k(G, e_t(G)) - 1\}\, dt.$$

In other words, for any connected graph we have the rather pleasing random variable representation

$$1 + L_{\mathrm{MST}}(G) = \int_0^1 k(G, e_t(G))\, dt. \tag{8}$$

Thus, our main task is to understand the expectation of $k(G, e_t(G))$, and this provides a natural role for the Tutte polynomial.
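As a pathwise sanity check, the identity above can be verified numerically on a small complete graph: for any fixed realization of the edge lengths, the step function $t \mapsto k(G, e_t(G))$ integrates exactly to $1 + L_{\mathrm{MST}}(G)$. The sketch below (our own illustration; the choice of $K_5$, the seed, and the helper names are ours) does this with a union-find component count and Kruskal's algorithm.

```python
import random
from itertools import combinations

def components(n, edges):
    """Count connected components of the graph on {0,...,n-1} with the given edges."""
    parent = list(range(n))
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a
    comps = n
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            comps -= 1
    return comps

def mst_length(n, lengths):
    """Kruskal's algorithm: total length of the MST given all edge lengths."""
    parent = list(range(n))
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a
    total = 0.0
    for (u, v), w in sorted(lengths.items(), key=lambda kv: kv[1]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            total += w
    return total

random.seed(7)
n = 5
lengths = {e: random.random() for e in combinations(range(n), 2)}

# t -> k(G, e_t(G)) is a step function whose jumps occur only at the edge
# lengths, so its integral over [0, 1] is a finite sum of rectangle areas.
ts = sorted(lengths.values()) + [1.0]
integral, prev = 0.0, 0.0
for t in ts:
    k = components(n, [e for e, w in lengths.items() if w <= prev])
    integral += k * (t - prev)
    prev = t

assert abs(1.0 + mst_length(n, lengths) - integral) < 1e-9
```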

2.2 The Tutte Polynomial

To define the Tutte polynomial, one needs to go outside the familiar class of simple graphs and to consider graphs that may have loops or parallel edges. Given such a graph G, the Tutte polynomial T(G; x, y) is then defined by a set of four devilishly simple rules:

1. If $G$ has no edges, then $T(G; x, y) = 1$.

2. If $e$ is an edge of $G$ that is neither a loop nor an isthmus, then

$$T(G; x, y) = T(G'_e; x, y) + T(G''_e; x, y),$$

where $G'_e$ is the graph $G$ with the edge $e$ deleted and $G''_e$ is the graph $G$ with the edge $e$ contracted.

3. If $e$ is an isthmus, then $T(G; x, y) = x\,T(G''_e; x, y)$.

4. If $e$ is a loop, then $T(G; x, y) = y\,T(G'_e; x, y)$.

To confirm the understanding of these rules, one might want to check that they imply that the Tutte polynomial of $K_2$ is just $x$; indeed, by successive applications of Rule 3 one finds that the Tutte polynomial of any tree with $n$ vertices is just the monomial $x^{n-1}$.
The rules are more amusing when one needs to use contractions, and here the basic exercise is to show that the Tutte polynomial of $K_3$ is $x + x^2 + y$. Finally, one might want to check that the Tutte polynomial of a bow tie (made by two copies of $K_3$ joined at a vertex) is just $(x + x^2 + y)^2$. The last exercise naturally suggests a general principle for finding the Tutte polynomial of the graph built by joining two arbitrary graphs at a single vertex; one can then recapture Rule 3 as a special case of the general principle.
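The four rules translate directly into a short deletion–contraction recursion. The following sketch (our own illustration, with a dictionary of exponent pairs standing in for the polynomial) checks the exercises just mentioned: $T(K_3) = x + x^2 + y$, and the bow tie evaluates at $(2, 2)$ to $2^6 = 2^{|e(G)|}$, as any Tutte polynomial must.

```python
from collections import Counter

def tutte(edges):
    """Tutte polynomial by deletion-contraction, as {(i, j): coeff} for x^i y^j.
    Edges are (u, v) pairs; u == v encodes a loop, repeated pairs are parallel edges."""
    if not edges:
        return {(0, 0): 1}                       # Rule 1: no edges
    (u, v), rest = edges[0], list(edges[1:])
    if u == v:                                   # Rule 4: a loop contributes a factor y
        return {(i, j + 1): c for (i, j), c in tutte(rest).items()}
    # Does some other path join u and v?  If not, the edge is an isthmus.
    adj = {}
    for a, b in rest:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    stack, seen = [u], {u}
    while stack:
        for w in adj.get(stack.pop(), []):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    contracted = [(u if a == v else a, u if b == v else b) for a, b in rest]
    if v not in seen:                            # Rule 3: an isthmus contributes a factor x
        return {(i + 1, j): c for (i, j), c in tutte(contracted).items()}
    total = Counter(tutte(rest))                 # Rule 2: delete + contract
    total.update(tutte(contracted))
    return dict(total)

K3 = [(0, 1), (1, 2), (0, 2)]
assert tutte(K3) == {(1, 0): 1, (2, 0): 1, (0, 1): 1}        # x + x^2 + y

bow_tie = K3 + [(2, 3), (3, 4), (2, 4)]
assert sum(c * 2**i * 2**j for (i, j), c in tutte(bow_tie).items()) == 2**6
```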
Much of the usefulness of the Tutte polynomial comes from its relation to the rank function $r(\cdot)$ that associates to each $A \subseteq e(G)$ the integer $r(A)$ given by

$$r(A) = |v(G)| - k(G, A),$$

where, as before, $k(G, A)$ is the number of connected components of the graph with vertex set $v(G)$ and edge set $A$. The rank function provides a measure of the extent to which the graph $(v(G), A)$ is connected, and it permits one to express the Tutte polynomial as a large but informative sum:
$$T(G; x, y) = \sum_{A \subseteq e(G)} (x-1)^{r(G) - r(A)}\, (y-1)^{|A| - r(A)}, \tag{9}$$


where r(G) is shorthand for the more pedantic r(e(G)).
One immediate consequence of this formula is that it shows that the Tutte polynomial does not depend on the order in which one processes the edges of $G$ in the recursive definition of $T(G; x, y)$, a fact that may not seem particularly evident from the rules themselves. To return the favor, the defining rules make it evident that the coefficients of $T(G; x, y)$ are nonnegative, while this is not so easily seen from the sum.
One obvious consequence of the sum formula (9) is that

$$T(G; 2, 2) = 2^m \quad \text{where } m = |e(G)|, \tag{10}$$

and a natural use of this triviality is to provide a quick feasibility check on a candidate Tutte polynomial. In fact, the evaluations of the Tutte polynomial at special choices of $x$ and $y$ provide a rich buffet of combinatorial interpretations (cf. Welsh (1999)), and in principle each such evaluation can be used as a check. In practice, the evaluation (10) is the easiest to use; it catches many blunders and offers many hints.

2.3 Connection to the Probability Model

Any sum over all of the subsets of $e(G)$ can be interpreted as an expectation

$$\sum_{A \subseteq e(G)} p^{|A|}\, (1-p)^{m - |A|}\, f(A) \tag{11}$$

for an appropriate choice of $f$, and when one recalls that

$$r(A) = |v(G)| - k(G, A) = n - k(G, A) \quad \text{and} \quad r(G) = n - 1$$

for a connected graph $G$, then the sum formula (9) is simply

$$T(G; x, y) = \frac{1}{(x-1)(y-1)^n} \sum_{A \subseteq e(G)} (y-1)^{|A|}\, \bigl((x-1)(y-1)\bigr)^{k(G,A)},$$

which obviously may be written in expectation form as

$$T(G; x, y) = \frac{y^m}{(x-1)(y-1)^n} \sum_{A \subseteq e(G)} p^{|A|}\, (1-p)^{m-|A|}\, \bigl((x-1)(y-1)\bigr)^{k(G,A)} \tag{12}$$

provided that we set $m = |e(G)|$ and make the natural identifications

$$p = \frac{y-1}{y} \quad \text{and} \quad 1 - p = \frac{1}{y}. \tag{13}$$

This kind of reinterpretation of the Tutte polynomial is bread and butter to the theory of the correlated percolation model (cf. Fortuin and Kasteleyn (1972)), and this specific form of the Tutte polynomial has also been useful in the study of the computational complexity of the Tutte polynomial (cf. Welsh (1999) and especially Lemma 1 of Alon, Frieze, and Welsh (1994)).
On the other hand, the application of this formula to the problem of calculating the expected length of the minimal spanning tree for uniformly distributed edge lengths seems to be novel,


though admittedly easy and natural. We first note that the first factors under the sum provide the probability under the uniform model that one has $\xi_e \le p$ for exactly those edges in the set $A$. If one then takes

$$A = e_p(G) \equiv \{e : e \in e(G),\ \xi_e \le p\},$$

then one can write the moment generating function

$$\varphi_p(t) \equiv E\bigl[\exp\bigl(t\, k(G, e_p(G))\bigr)\bigr]$$

in terms of $T(G; x, y)$ as

$$\varphi_p(t) = p^{n-1}(1-p)^{m-n+1}\, e^t\, T\Bigl(G;\ 1 + e^t\,\frac{1-p}{p},\ \frac{1}{1-p}\Bigr), \tag{14}$$

and this formula gives us a natural way to calculate the expectation of $k(G, e_p(G))$.
Specifically, if we retain the abbreviations (13), we have

$$\varphi_p'(t) = \varphi_p(t)\,\Bigl\{1 + e^t\,\frac{1-p}{p}\,\frac{T_x(G; x, y)}{T(G; x, y)}\Bigr\},$$

so, when we let $t = 0$, we find for $x = 1/p$ and $y = 1/(1-p)$ that

$$E[k(G, e_p(G))] = 1 + \frac{1-p}{p}\,\frac{T_x(G; x, y)}{T(G; x, y)}. \tag{15}$$

Finally, when we expand the abbreviations for $x$ and $y$ and recall the representation (8) for $L_{\mathrm{MST}}(G)$ as an integral of $k(G, e_t(G))$, we find

$$E[L_{\mathrm{MST}}(G)] = \int_0^1 \frac{1-p}{p}\,\frac{T_x(G;\, 1/p,\, 1/(1-p))}{T(G;\, 1/p,\, 1/(1-p))}\, dp, \tag{16}$$

just as we needed to complete the proof of Theorem 1.1.

2.4 Illustrations and Applications
There are some natural and easy checks one can make to become familiar with the formula (16). If we recall that for $G = K_2$ we have $T(G; x, y) = x$, then the integral (16) easily works out to be $1/2$, just as it should. More generally, if $G$ is a tree with $n$ vertices, then $T(G; x, y) = x^{n-1}$ and the integral works out to be $(n-1)/2$, and again this is obviously the correct value of $E[L_{\mathrm{MST}}(G)]$.
It is perhaps more informative to note that the form of the integrand as a logarithmic derivative is quite natural. If $G$ and $H$ are two graphs that share a common vertex, then the graph $G \cup H$ has Tutte polynomial $T(G; x, y)\,T(H; x, y)$, so the formula (16) recaptures the obvious fact that in this case one also has

$$E[L_{\mathrm{MST}}(G \cup H)] = E[L_{\mathrm{MST}}(G)] + E[L_{\mathrm{MST}}(H)].$$

For the complete graph on three vertices we have already seen that one has $T(K_3) = x + x^2 + y$, and for this polynomial the integral (16) yields $3/4$, and yet again one can check independently that $E[L_{\mathrm{MST}}(K_3)] = 3/4$. Nevertheless, for $K_4$


n   E[L_MST(K_n)]   Numerical Value   Forward Difference
2 1/2 0.50000 0.250000
3 3/4 0.75000 0.135714
4 31/35 0.88571 0.080735
5 893/924 0.96645 0.051864
6 278/273 1.01832 0.035400
7 30739/29172 1.05372 0.025342
8 199462271/184848378 1.07906 0.018843
9 126510063932/115228853025 1.09790

Table 1: The exact expected values of the length of the MST of $K_n$ for $n = 2$ to $n = 9$ under the model of independent U[0, 1] edge lengths.

the situation is much more interesting. Hand computations become tedious, but they still suffice for one to show

$$T(K_4; x, y) = 2x + 2y + 3x^2 + 3y^2 + 4xy + x^3 + y^3.$$

When this polynomial is used in the integral formula (16), one then finds

$$E[L_{\mathrm{MST}}(K_4)] = \frac{31}{35},$$
and now we are on new ground. This appears to be the first time $E[L_{\mathrm{MST}}(K_4)]$ has been computed, and one may be hard pressed to provide an independent calculation that does not pass through some integral like that provided by our basic representation (16).
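Both of these exact values are easy to corroborate by direct numerical quadrature of the integral formula (16). The sketch below (our own check, using a simple midpoint rule; the function names are ours) recovers $3/4$ for $K_3$ and $31/35$ for $K_4$ to within the quadrature error.

```python
def expected_mst_length(T, Tx, n_points=100000):
    """Midpoint-rule evaluation of the integral formula:
    E[L_MST(G)] = int_0^1 ((1-p)/p) * T_x(1/p, 1/(1-p)) / T(1/p, 1/(1-p)) dp."""
    h = 1.0 / n_points
    total = 0.0
    for i in range(n_points):
        p = (i + 0.5) * h
        x, y = 1.0 / p, 1.0 / (1.0 - p)
        total += (1.0 - p) / p * Tx(x, y) / T(x, y) * h
    return total

# T(K_3) = x + x^2 + y
I3 = expected_mst_length(lambda x, y: x + x**2 + y, lambda x, y: 1 + 2*x)
assert abs(I3 - 3/4) < 1e-3

# T(K_4) = 2x + 2y + 3x^2 + 3y^2 + 4xy + x^3 + y^3
T4 = lambda x, y: 2*x + 2*y + 3*x**2 + 3*y**2 + 4*x*y + x**3 + y**3
T4x = lambda x, y: 2 + 6*x + 4*y + 3*x**2
I4 = expected_mst_length(T4, T4x)
assert abs(I4 - 31/35) < 1e-3
```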
Naturally one can go further, but beyond $n = 4$ it would be masochistic not to use symbolic calculation to determine the Tutte polynomials and to perform the required integrations. In fact, a table of the Tutte polynomials $T(K_n; x, y)$ for the values $n = 2, 3, \ldots, 8$ is included in Gessel and Sagan (1996), and with help from Maple this table has been extended by Gessel (personal communication) to include all values up to $n = 15$. For convenience of display, we use just the first nine of these polynomials in the construction of Table 1.
The numerical evaluations in the table and their successive differences suggest two compelling conjectures: it seems inevitable that $E[L_{\mathrm{MST}}(K_n)]$ is monotone increasing and concave in $n$. This evidence is new and not fully digested, so it is possible that these conjectures will follow from our basic formula (16) and the known properties of the Tutte polynomial for $K_n$. On the other hand, if such an approach is not successful, the conjectures may prove to be difficult. After all, the analogous monotonicity conjecture for the assignment problem (cf. Steele (1997), p. 94) has resisted all attempts for more than fifteen years.
A final feature of Table 1 worth noting is that the rate of convergence is perhaps slower than one might guess. By the result of Frieze (1985) mentioned in the introduction, we know that $E[L_{\mathrm{MST}}(K_n)]$ converges to $\zeta(3) = 1.202\ldots$, and one might hope that the behavior of $E[L_{\mathrm{MST}}(K_n)]$ would parallel that of the partial sums of $\zeta(3)$ given by $s_n = 1 + 1/2^3 + \cdots + 1/n^3$. Sadly, $s_n$ reaches $1.19$ when $n = 6$ and reaches $1.20$ when $n = 16$, while $E[L_{\mathrm{MST}}(K_n)]$ lags far behind. By analogy with the Parisi conjecture for the assignment problem (cf. Parisi (1998) and Aldous and Steele (2002)), one suspects that under the exponential model the corresponding expected values $E[L_{\mathrm{MST}}(K_n)]$ will indeed be closer to $s_n$. Nevertheless, such an exploration will have to wait for another day.


3 An Exact Calculation for an Infinite Tree

We now take up a second exact calculation, but this time it will be for a special infinite graph. To explain why this graph deserves to be singled out requires some background on the theory of the Poisson weighted infinite tree and the attending theory of local weak convergence. This background is developed more fully in Aldous (2001) and Aldous and Steele (2002), so the next two subsections recall just the most essential facts.

3.1 An Infinite Tree of Special Significance

The Poisson weighted infinite tree, or the PWIT, is a simple object. Nevertheless, it provides one with a direct and effective understanding of many of the problems of combinatorial optimization for large graphs with edge lengths that are given by independent random variables.
Formally, a PWIT is a rooted tree that one defines recursively. One starts with a single vertex $r$ called the root, and one gives the root a countably infinite number of children. The set of these children is called generation one, and the edges from the root to the children are then labeled by the realizations of a Poisson process on $[0, \infty)$ that has constant intensity $\mu > 0$. That is, each edge from the root is assigned a unique element of the set

$$\mathcal{P}(\mu) = \{\xi_k : k = 1, 2, \ldots\} \quad \text{where } \xi_k = Y_1 + Y_2 + \cdots + Y_k$$

and the random variables $\{Y_j : j = 1, 2, \ldots\}$ are independent and

$$P(Y_j > x) = \exp(-\mu x) \quad \text{for all } j = 1, 2, \ldots \text{ and } x \in [0, \infty).$$

After generation k has been defined, one defines generation k + 1 by taking each element of generation k and applying the same construction that we applied to the root to get the first generation. At each stage the Poisson process that is used to label the edges is taken to be independent of all of the other Poisson processes that have been introduced. This construction is then continued until there is a well defined generation for each of the natural numbers k = 1, 2,....
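The recursive construction is straightforward to simulate once one truncates it to finitely many generations and keeps only the first few children of each vertex. The sketch below (our own illustration; the truncation parameters and helper names are ours) generates the edge lengths of such a truncated PWIT with intensity $\mu$ by summing independent exponential increments.

```python
import random

def pwit_edges(depth, children, mu=1.0, rng=None):
    """Edge lengths of a PWIT truncated to `depth` generations and the first
    `children` children per vertex.  Vertices are tuples of child indices with
    () as the root; the lengths out of each vertex are partial sums of
    independent Exponential(mu) variables, i.e. points of a Poisson process."""
    rng = rng or random.Random()
    lengths = {}
    frontier = [()]
    for _ in range(depth):
        next_frontier = []
        for v in frontier:
            s = 0.0
            for k in range(children):
                s += rng.expovariate(mu)
                child = v + (k,)
                lengths[(v, child)] = s
                next_frontier.append(child)
        frontier = next_frontier
    return lengths

L = pwit_edges(depth=3, children=4, rng=random.Random(7))
assert len(L) == 4 + 16 + 64          # 4 + 4^2 + 4^3 edges in three generations

# The lengths out of any fixed vertex increase, as points of a Poisson process must.
for v in [(), (0,), (2, 1)]:
    out = [L[(v, v + (k,))] for k in range(4)]
    assert out == sorted(out)
```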
A tree $T$ that is produced by this construction is said to be a PWIT with intensity $\mu > 0$, and, as shorthand, we will write

$$T \stackrel{d}{=} \mathrm{PWIT}(\mu),$$

whenever $T$ has the same distribution as the PWIT we have just constructed. Shortly, we will be more precise about the metric space in which one understands this distributional equality to take place.

3.2 Components of the PWIT

If $G$ is any graph with a real number associated to each edge of $G$, then $G$ is called a weighted graph, and the numbers on the edges are called the edge lengths. Given such a graph, we let $G(s)$ denote the graph that one obtains when all of the edges of length $s$ or greater are deleted, and if $G$ is a rooted graph we also let $G_r(s)$ denote the component of $G(s)$ that contains the root. If $T$ is a PWIT with intensity $\mu$ and root $r$, then we may again view $T_r(s)$ as a rooted graph with root


[Figure 1: one view of a PWIT $T$. Each vertex of $T$ has countably many children, and each edge of $T$ is assigned a length; the lengths from a vertex to its children are determined by an independent Poisson process, and each subtree is itself a PWIT.]

Figure 1: The PWIT is arguably the most fundamental limit object in the theory of randomly weighted graphs. It is the local weak limit of many different sequences, and it offers a unified approach to limit theorems for matching, spanning trees, and many other problems of combinatorial optimization.

$r$, and this graph turns out to be an old friend. It is nothing more than a Poisson Galton–Watson tree.
More precisely, if PGW($s$) denotes the distribution of the random tree determined by a Galton–Watson branching process with a single progenitor and an offspring distribution that is Poisson with mean $s$, then we have

$$T \stackrel{d}{=} \mathrm{PWIT}(\mu) \implies T_r(s) \stackrel{d}{=} \mathrm{PGW}(s\mu).$$

Many pleasing computations may be based on this simple observation.
In particular, we will need a qualitative understanding of the size of $T_r(s)$ when $\mu = 1$, but everything we need has been known for 120 years or more. Specifically, the probability $p = p(s)$ that a PGW($s$) branching process is finite (the so-called extinction probability) is one if $0 \le s \le 1$, and for $s > 1$ the value of $p$ is given by the unique root in $(0, 1)$ of the equation
$$p = \exp(-s(1 - p)).$$
Lagrange–Bürmann inversion provides an explicit formula

$$p(s) = \frac{1}{s} \sum_{k=1}^{\infty} \frac{k^{k-1}}{k!}\, (s e^{-s})^k \quad \text{for } s > 1,$$

but, despite its charm, this sum does not always provide the best way to understand $p(s)$, or the complementary probability $q(s) = 1 - p(s)$. Here we will rely more directly on the fact that $q(s)$ is the unique strictly positive solution of

$$1 - q(s) = \exp(-s\, q(s)) \quad \text{when } s > 1, \tag{17}$$

and the obvious inverse relationship

$$s(q) = -\frac{\log(1 - q)}{q} \tag{18}$$


[Figure 2: either $c(G, v_1; s)$ is finite, or $c(G, v_2; s)$ is finite, or both are finite.]

Figure 2: The edge $e = (v_1, v_2)$ of $G$ is in the minimal spanning forest if and only if at least one of the trees $c(G, v_1; s)$ and $c(G, v_2; s)$ is finite when $s = \xi_e$.

that gives us the value of $s$ for which we have probability $0 < q < 1$ that the total PGW($s$) population is infinite.
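Although (17) has no closed-form solution, $q(s)$ is easy to compute numerically, and the inverse relationship (18) then provides a built-in test. The sketch below (our own, using plain bisection) computes $q(s)$ and confirms that $s(q(s))$ returns $s$.

```python
import math

def q(s, tol=1e-12):
    """Survival probability of a PGW(s) tree: the unique root in (0, 1) of
    1 - q = exp(-s q) when s > 1, and 0 when s <= 1."""
    if s <= 1.0:
        return 0.0
    f = lambda t: 1.0 - t - math.exp(-s * t)
    lo, hi = tol, 1.0 - tol          # f(lo) > 0 and f(hi) < 0 whenever s > 1
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# The inverse relationship s(q) = -log(1 - q)/q recovers s from q(s).
for s in (1.5, 2.0, 3.0, 5.0):
    qs = q(s)
    assert abs(-math.log(1.0 - qs) / qs - s) < 1e-6
```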

3.3 Minimal Spanning Forests

The minimal spanning forest of an infinite graph $G$ that has all distinct edge lengths is the subgraph MSF($G$) of $G$ with the same vertex set as $G$ and with an edge set that contains each edge $e = (v_1, v_2)$ of $G$ for which
(1) $c(G, v_1; s)$ and $c(G, v_2; s)$ are disjoint, and
(2) $c(G, v_1; s)$ and $c(G, v_2; s)$ are not both infinite
when $s$ is taken to be the length of the edge $e = (v_1, v_2) \in G$. Here $c(G, v; s)$ denotes the connected component containing $v$ after all edges of length $s$ or greater have been deleted. An illustration of this definition is given in Figure 2 from Aldous and Steele (2002).
The real utility of this definition can only be brought out by the PWIT Limit Theorem (Theorem 4.2), but a good exercise with the definition is first to show that each component of MSF($G$) must be infinite and then to argue that MSF($G$) is indeed free of cycles.

3.4 Zeta Meets a PWIT

Let $T$ be a PWIT with intensity $\mu = 1$, and let $r$ denote its root. If MSF($T$) is the minimal spanning forest of $T$, then by a natural extension of our earlier notation we denote the sum of the lengths of the edges incident to the root by

$$Y(T) = \sum_{e : r \in e} \xi_e \, \mathbb{1}(e \in \mathrm{MSF}(T)),$$

and denote the associated power sum by

$$Y_{\alpha}(T) = \sum_{e : r \in e} \xi_e^{\alpha} \, \mathbb{1}(e \in \mathrm{MSF}(T)).$$

The next lemma exploits the method of Lemma 4 of Aldous and Steele (2002) to obtain a slightly more general result. Although the innovation is minor, there do seem to be genuine benefits to having the parameter a at one's disposal. At a minimum, the joint presence of the gamma and zeta functions is amusing.

Lemma 3.1 The sum of the $\alpha$th powers of the lengths of the edges of the PWIT that are incident to the root has expectation

$$E[Y_{\alpha}(T)] = 2\,\Gamma(1+\alpha)\,\zeta(2+\alpha) \quad \text{for } \alpha \in (-1, \infty), \tag{19}$$

and by analytic continuation the same formula holds for all complex $\alpha$ for which the left-hand side is well defined; consequently, one has the Mellin integral representation

$$E[Y_{\alpha}(T)] = \frac{2}{1+\alpha} \int_0^{\infty} \frac{x^{\alpha+1}}{e^x - 1}\, dx \quad \text{for all } \operatorname{Re} \alpha > 0. \tag{20}$$
Proof. If one conditions a Poisson process $\mathcal{P}$ on $[0, \infty)$ to have a point at $s$, then $\mathcal{P} \setminus \{s\}$ is again a Poisson process, so, if we condition on the event that there is an edge $e$ of length $s$ incident to the root, then the subtrees obtained by cutting that edge are again independent PGW($s$) trees.
Now, since the probability that at least one of these is finite is equal to $1 - q^2(s)$, we see that this is also the probability that the edge $e$ is in the minimal spanning forest of $T$, and we have
$$E[Y_{\alpha}(T)] = \int_0^{\infty} s^{\alpha}\, \bigl(1 - q^2(s)\bigr)\, ds. \tag{21}$$

To compute the integral, we apply integration by parts, the implicit formula (17) for $q(s)$, and the fact that $q(s)$ vanishes on $[0, 1]$ to find
$$(1+\alpha)\, E[Y_{\alpha}(T)] = 2 \int_0^{\infty} s^{1+\alpha}\, q(s)\, q'(s)\, ds = 2 \int_1^{\infty} s^{1+\alpha}\, q(s)\, q'(s)\, ds$$
$$= 2 \int_0^1 \bigl(-\log(1-q)\bigr)^{1+\alpha}\, \frac{dq}{q}.$$
A good table would now suffice, but it is as easy to substitute $u = -\log(1-q)$ to find
$$(1+\alpha)\, E[Y_{\alpha}(T)] = 2 \int_0^{\infty} \frac{u^{1+\alpha}\, e^{-u}}{1 - e^{-u}}\, du = 2 \sum_{k=1}^{\infty} \int_0^{\infty} u^{1+\alpha}\, e^{-ku}\, du$$
$$= 2 \sum_{k=1}^{\infty} k^{-(2+\alpha)}\, \Gamma(2+\alpha) = 2\,\Gamma(\alpha+2)\,\zeta(\alpha+2).$$

Since one has $(1+\alpha)\,\Gamma(1+\alpha) = \Gamma(2+\alpha)$, the proof of formula (19) is complete.
Finally, the analytic continuation of the identity (19) follows from the general principles of function theory, and the validity of the Mellin integral representation (20) is embedded in our calculations. Alternatively, one can note that the representation (20) follows from formula (19) and the well known formula

$$\Gamma(z)\,\zeta(z) = \int_0^{\infty} \frac{x^{z-1}}{e^x - 1}\, dx \quad \text{for } \operatorname{Re} z > 1,$$

which one can prove by expanding $(e^x - 1)^{-1}$ as a geometric series. $\square$


4 Local Weak Convergence Theory

We now need to recall (and to modestly extend) some basic facts from the theory of local weak convergence. The main results in this section are the PWIT Convergence Theorem (Theorem 4.2) and the MST Convergence Theorem (Theorem 4.4). The first of these is implicit in Aldous (1992) and Aldous (2001), and, although Theorem 4.2 is nominally more general than the results that were needed in Aldous (1992) and Aldous (2001), no essentially new ideas are needed. Finally, the MST Convergence Theorem is a direct import from Aldous and Steele (2002).
The real benefit of the present development of the PWIT Limit Theorem is that it is reasonably self-contained. Thus, with very little overhead, one gains direct access to the single most important fact about the PWIT.

4.1 A Poisson Convergence Lemma

We begin with a lemma that is surely part of classic folklore, but the snappy proof via Rényi's characterization of the Poisson process appears to be new. At a minimum, this proof draws the straightest possible line between the hypotheses on $F$ and the required Poisson limit.

Lemma 4.1 Let $F$ denote a distribution function such that
$$F(0) = 0 \quad \text{and} \quad F(x) = \mu x + o(x) \quad \text{as } x \to 0. \tag{22}$$

If the random variables of the triangular array $\{\xi_{i,n} : 1 \le i \le d_n\}$ are independent within each row and if one has

$$P(\xi_{i,n} \le x) = F(x/d_n) \quad \text{for all } 1 \le i \le d_n, \quad \text{where } d_n \to \infty \text{ as } n \to \infty,$$
then one has the weak convergence

$$S_n = \{\xi_{i,n} : 1 \le i \le d_n\} \stackrel{d}{\longrightarrow} \mathcal{P}(\mu) \quad \text{as } n \to \infty$$
in the sense of point processes.

Proof. By Rényi's characterization of the Poisson process (cf. Rényi (1967) or Kingman (1993), pp. 34–37), it suffices to show that for each union of disjoint intervals $A = (a_1, b_1] \cup (a_2, b_2] \cup \cdots \cup (a_k, b_k]$ one has

$$P(|S_n \cap A| = 0) \to e^{-\mu\lambda(A)}, \tag{23}$$

where $\lambda(A)$ denotes the Lebesgue measure of $A$. By our hypothesis on $F$ and the independence of the $\{\xi_{i,n} : 1 \le i \le d_n\}$, we have

$$P(|S_n \cap A| = 0) = \Bigl\{1 - \sum_{i=1}^{k} \bigl[F(b_i/d_n) - F(a_i/d_n)\bigr]\Bigr\}^{d_n} = \bigl(1 - \mu\lambda(A)/d_n + o(\lambda(A)/d_n)\bigr)^{d_n},$$
so the limit (23) follows instantly. $\square$

There is a sense in which Rényi's criterion is modestly magical; it provides us with independence of a different sort than we assume at the beginning. Also,
one should note that the only sly aspect of Rényi's Theorem is the requirement that one deal with all $A$ that can be written as finite unions of disjoint intervals; in fact, Moran (1967) shows by example that one cannot get by with less. Finally, there is one small technical point; we have used Rényi's characterization of the Poisson process to provide a convergence criterion for a sequence of processes. Naturally, one only needs to apply the usual subsequence argument to pass from the characterization to the convergence criterion.
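The convergence criterion (23) is also easy to watch in a small simulation. In the sketch below (our own illustration; the choices of $F$, $d_n$, the test set $A$, and the trial count are ours), $F$ is the uniform distribution on $[0, 1]$, so $\mu = 1$ and $\xi_{i,n} = d_n U_i$, and the empirical probability that $S_n$ misses $A$ is compared with $e^{-\lambda(A)}$.

```python
import math
import random

# Renyi's criterion in miniature: with F uniform on [0, 1] (so mu = 1) and
# P(xi <= x) = F(x / d_n), one can take xi = d_n * U with U uniform on [0, 1].
# For A = (0.2, 0.7] U (1.0, 1.5] we have lambda(A) = 1.0, so
# P(|S_n intersect A| = 0) should be close to exp(-1).
random.seed(11)
d_n, trials = 400, 20000

def in_A(x):
    return 0.2 < x <= 0.7 or 1.0 < x <= 1.5

empty = sum(
    1 for _ in range(trials)
    if not any(in_A(d_n * random.random()) for _ in range(d_n))
)
p_empty = empty / trials
assert abs(p_empty - math.exp(-1.0)) < 0.03      # lambda(A) = 1.0
```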

4.2 Local Weak Convergence Defined
We now need to extend the classical notion of weak convergence for point processes to a larger domain that is more directly connected with the convergence of weighted graphs. The treatment given here follows the exposition of Aldous and Steele (2002) which was designed in part to systematize the basic constructions used in Aldous (1992) and Aldous (2001).
To begin, we consider a graph $G$ with a vertex set $v(G)$ that may be finite or countable. We further suppose there is a function $\ell$ from the edge set $e(G)$ to $(0, \infty]$, and we call $\ell(e)$ the length of the edge $e$. We then use $\ell$ to define a metric on $v(G)$ by taking the distance from $u$ to $v$ as the infimum over all paths between $u$ and $v$ of the sum of the lengths of the edges in the path. Naturally, the distance from any vertex $v$ to itself is taken to be zero.
Now, if $G$ is a connected graph with a finite or countably infinite vertex set and if $\ell$ is an edge length function that makes $G$ locally finite in the sense that for each vertex $v$ and each real $\rho < \infty$ the number of vertices within distance $\rho$ from $v$ is finite, then we call $G$ a geometric graph. Also, when there is a distinguished vertex $v$, we say that $G$ is a rooted geometric graph with root $v$, and, to save space, we denote the set of geometric graphs by $\mathcal{G}$ and the set of rooted geometric graphs by $\mathcal{G}_{\star}$.
The key issue is to say what one means for a sequence $\{G_n\}$ of elements of $\mathcal{G}_{\star}$ to converge to a $G$ in $\mathcal{G}_{\star}$. The driving idea is that for large $n$, the rooted geometric graph $G_n$ should look very much like $G$ in a neighborhood of the root of $G$ that is as large as we like.
Formally, we take $\rho > 0$ and let $N_{\rho}(G)$ denote the graph whose vertex set $V_{\rho}(G)$ is the set of vertices of $G$ that are at a distance of at most $\rho$ from the root of $G$ and whose edge set consists of just those edges of $G$ that have both vertices in $V_{\rho}(G)$. One again views $N_{\rho}(G)$ as an element of $\mathcal{G}_{\star}$ with edge length function and root given by those of $G$. Also, $\rho > 0$ is called a continuity point of $G$ if no vertex of $G$ is exactly at a distance $\rho$ from the root of $G$.
Now, at last, we say that $G_n$ converges to $G_\infty$ in $\mathcal{G}_\star$ provided that for each continuity point $\rho$ of $G_\infty$ there is an $n_0 = n_0(\rho, G_\infty)$ such that for all $n \ge n_0$ there exists a graph isomorphism$^2$ $\gamma_{n,\rho}$ from the rooted geometric graph $N_\rho(G_\infty)$ to the rooted geometric graph $N_\rho(G_n)$ such that for each edge $e$ of $N_\rho(G_\infty)$ the length of $\gamma_{n,\rho}(e)$ converges to the length of $e$ as $n \to \infty$.
This definition determines a topology that makes $\mathcal{G}_\star$ into a complete separable metric space and, as a consequence, it gives us access to the usual tools of weak convergence theory. Here, if $\{X_n\}$ is a sequence of $\mathcal{G}_\star$-valued random variables and $X$ is a $\mathcal{G}_\star$-valued random variable, we write

$$X_n \xrightarrow{\;d\;} X \quad \text{to mean that} \quad E[f(X_n)] \to E[f(X)]$$

$^2$Graphs $G$ and $G'$ are isomorphic provided that there is a bijection $\phi : v(G) \to v(G')$ such that $(\phi(u), \phi(v)) \in e(G')$ if and only if $(u, v) \in e(G)$.


for each bounded continuous function $f : \mathcal{G}_\star \to \mathbb{R}$. This is just plain vanilla weak convergence of $\mathcal{G}_\star$-valued random variables, but to emphasize the special attention that is paid to the neighborhood of the root we also say that we have the local weak convergence of $X_n$ to $X$.
From examples one finds that local weak convergence is a perfectly natural notion, despite the fact that it takes a while to make precise. In fact, the only subtle feature of local weak convergence is the way in which it forces one to focus so myopically on the neighborhoods of the root.

4.3 The PWIT Limit Theorem
We now have the background in place to prove the theorem that makes us interested in the PWIT; it shows that the PWIT arises as the limit of a very natural sequence of geometric graphs. As noted earlier, this particular version of the PWIT limit theorem is intended to make the PWIT limit ideas from Aldous (1992) and Aldous (2001) more explicit, more accessible, and modestly more general.

Theorem 4.2 (The PWIT Limit Theorem) Let $G_n$, $n = 1, 2, \ldots$, denote a sequence of graphs such that the vertex set $v(G_n)$ has cardinality $n$ for each $n = 1, 2, \ldots$, and such that the maximum and minimum degrees satisfy the degree conditions

$$\Delta(G_n) \sim d_n \quad \text{and} \quad \delta(G_n) \sim d_n \quad \text{as } d_n \to \infty. \qquad (24)$$

Also, let $F$ denote a distribution function that satisfies the conditions (22), and associate to each $e \in e(G_n)$ an independent edge length $\xi_e$ with distribution

$$P(\xi_e \le x) = F(x/d_n) \quad \text{for all } x \ge 0.$$

Next, independently choose an element of $v(G_n)$ according to the uniform distribution, and let $G_n$ denote the rooted geometric graph produced by this construction. One then has
$$G_n \xrightarrow{\;d\;} \mathrm{PWIT}(\mu).$$
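One way to see what the theorem asserts is a small simulation. Under the uniform model with $F(x) = x$ on $[0,1]$ and $d_n = n$, the edge lengths at the root of $K_n$ are distributed as $n \cdot \mathrm{Uniform}(0,1)$, and the PWIT limit predicts that the number of root edges of length at most $t$ approaches $t$, the mean count of a rate-one Poisson process on $[0, t]$. The sketch below is our own illustration, with names of our choosing:

```python
import random

def root_edge_lengths(n, t, trials=2000):
    """For K_n with i.i.d. edge lengths distributed as n*Uniform(0,1)
    (so P(xi <= x) = F(x/d_n) with F uniform and d_n = n), return the
    average number of edges at a fixed root vertex with length at most t.
    The PWIT limit predicts this tends to t as n grows."""
    total = 0
    for _ in range(trials):
        # the n - 1 edge lengths incident to the root
        lengths = [n * random.random() for _ in range(n - 1)]
        total += sum(1 for x in lengths if x <= t)
    return total / trials

random.seed(7)
print(root_edge_lengths(200, 3.0))  # close to 3.0
```

The exact mean here is $(n-1)t/n$, which already makes the rate-one Poisson prediction plausible; the full theorem of course says much more about the joint law of the nearby edges.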

As one often does in the theory of weak convergence, we prove this limit theorem by passage to an equivalent characterization theorem. Specifically, one first argues (in a step that we leave as an exercise) that the sequence of $\mathcal{G}_\star$-valued random variables $\{G_n\}$ is tight. Then we consider an arbitrary subsequence, say $\{n_k : k = 1, 2, \ldots\}$, and we note by tightness that there must exist a further subsequence $\{m_k : k = 1, 2, \ldots\}$ and a $\mathcal{G}_\star$-valued random variable $G$ such that

$$G_{m_k} \xrightarrow{\;d\;} G \quad \text{as } k \to \infty.$$

Next, we observe by Skorohod's theorem (cf. Dudley (1989), pp. 325-327) that one can assume without loss of generality that

G,, ~> G almost surely as n ~ oo,

and now all we have to do is to prove that G is actually a PWIT.
From the definition of the topology of local weak convergence, we know automatically that $G$ is connected, so $G$ will be a tree provided that we show that it has no cycles. This will follow from the next lemma.


For the statement of the lemma, we note that a path-plus-cycle is a graph that can be written as a path plus one additional edge that makes a cycle by joining two vertices of the path. Also, to anticipate the application of the lemma, one should recall that the root of $G_n$ is chosen uniformly at random from the vertex set $v(G_n)$, so the lemma immediately implies that with probability one the limit graph $G$ has no cycles in any $\rho$-neighborhood of its root.

Lemma 4.3 Let $S(n, \rho)$ denote the set of all vertices $v \in v(G_n)$ for which there exists a path-plus-cycle $H = H(v) \subseteq G_n$ such that

$$v \in H \quad \text{and} \quad \sum_{e \in e(H)} \xi_e \le \rho.$$

One then has
$$|S(n, \rho)|/n \xrightarrow{\;p\;} 0 \quad \text{as } n \to \infty.$$

Proof. We first note that the number of path-plus-cycle subgraphs of $G_n$ with $k$ vertices cannot be larger than

n  A(G~)k1 (k) < 'nk 2 A(Gn )kl.
2 2

Also, by our hypothesis on $F$, we know there is an $x_0$ such that $F(x) \le 2\mu x$ for all $0 \le x \le x_0$, and from this bound, integration by parts, and induction one finds that a corresponding bound for the $k$-fold convolution is given by

$$F^{(k)}(x) \le \frac{(2\mu x)^k}{k!} \quad \text{for all } 0 \le x \le x_0.$$

Thus, for any $k$ edges $e_1, e_2, \ldots, e_k$ of $G_n$ we find from the distributional assumption $P(\xi_e \le x) = F(x/d_n)$ that

$$P(\xi_{e_1} + \cdots + \xi_{e_k} \le \rho) \le \frac{(2\mu)^k}{d_n^k}\, \frac{\rho^k}{k!} \quad \text{for all } 0 \le \rho/d_n \le x_0.$$

The expected number of vertices of $G_n$ that are contained in a path-plus-cycle subgraph of $G_n$ with $k$ vertices and total length bounded by $\rho$ is therefore no larger than

$$\frac{1}{2}\, n k^2 \Delta(G_n)^{k-1}\, \frac{(2\mu)^k \rho^k}{d_n^k\, k!} \quad \text{provided that } 0 \le \rho \le x_0 d_n.$$
Now, since $\Delta(G_n) \sim d_n$, we may choose a constant $C = C(\mu, \rho)$ such that this bound is not larger than $n C^k / (d_n k!)$, and, thus, one finds

$$E\big[\,|S(n, \rho)|\,\big] \le e^C n / d_n \quad \text{for all } n \text{ such that } d_n \ge \rho / x_0.$$

Since we assume $d_n \to \infty$, this bound is more than one needs to complete the proof of the lemma.
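The $k$-fold convolution bound used above is easy to sanity-check numerically. For $F$ uniform on $[0,1]$ one has $\mu = F'(0) = 1$, and the exact value $F^{(k)}(x) = x^k/k!$ for $x \le 1$ sits a factor of $2^k$ below the bound $(2\mu x)^k/k!$. The sketch below (an illustration of the inequality, not the paper's argument) confirms it by Monte Carlo:

```python
import math
import random

# Monte Carlo check of the convolution bound F^(k)(x) <= (2*mu*x)^k / k!
# for F uniform on [0,1], where mu = F'(0) = 1.  Here F^(k)(x) is the
# probability that a sum of k independent F-distributed lengths is <= x.
def convolution_tail(k, x, trials=100000):
    hits = sum(sum(random.random() for _ in range(k)) <= x
               for _ in range(trials))
    return hits / trials

random.seed(1)
mu = 1.0
for k in (2, 3, 4):
    x = 0.5
    bound = (2 * mu * x) ** k / math.factorial(k)
    est = convolution_tail(k, x)
    # the exact value is x^k / k!, a factor 2^k below the bound
    assert est <= bound
    print(k, est, bound)
```

The same kind of check works for any $F$ with a bounded density near zero, which is all the hypothesis on $F$ demands.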

Now that we know $G$ is a tree, the proof of Theorem 4.2 will be complete provided that we confirm that the edge lengths from each vertex to its children are given by the realization of an independent Poisson process. For the root of $G$ this


is quite easy. When we look at the edges incident to the root of $G_n$ for large $n$, we see by Lemma 4.1 that the lengths of these edges are approximately the points of a Poisson process, and consequently the lengths of the edges incident to the root of $G$ must exactly follow a Poisson process.
Now consider a fixed $\rho$ and an $n$ so large that the probability that $G_n$ contains a cycle in the $\rho$-neighborhood of the root of $G_n$ is small. We know that the distances to the children of the root approximately follow the initial segment of a Poisson process, and now we consider the second generation. Let $c$ be a fixed child of the root $r$, and consider the set $S$ of edges incident to $c$. The set of edge lengths $\{\xi_e : e \in S \text{ and } e \ne (r, c)\}$ again satisfies the assumptions of Lemma 4.1, so the distribution of the lengths of the edges to the descendants of $c$ will again follow a Poisson process as closely as we like. This argument shows that the first two generations of $G$ are consistent with the construction of the PWIT. There is no change to the argument as one moves from the $k$th to the $(k+1)$st generation, so one finds that $G$ is indeed a PWIT, and the proof of Theorem 4.2 is complete.

4.4 Convergence of MSTs

We now need a general result from Aldous and Steele (2002) that tells us that the local weak convergence of a sequence of randomly rooted graphs automatically gives us the local weak convergence of their associated MSTs.

Theorem 4.4 (MST Convergence Theorem) Let $G_\infty$ denote a $\mathcal{G}_\star$-valued random variable such that with probability one $G_\infty$ has infinitely many vertices and no two of the edges of $G_\infty$ have the same length. Further, let $\{G_n : n = 1, 2, \ldots\}$ denote a sequence of $\mathcal{G}_\star$-valued random variables such that for each $n$ the distribution of $G_n$ is given by the standard construction and such that for each $n$ the vertex set of $G_n$ has cardinality $n$ with probability one. If

$$G_n \xrightarrow{\;d\;} G_\infty \quad \text{as } n \to \infty, \qquad (25)$$
then one has the joint weak convergence in $\mathcal{G}_\star \times \mathcal{G}_\star$

$$(G_n, \mathrm{MST}(G_n)) \xrightarrow{\;d\;} (G_\infty, \mathrm{MSF}(G_\infty)). \qquad (26)$$

Further, if $N_n$ denotes the degree of the root of $\mathrm{MST}(G_n)$ and $N$ denotes the degree of the root of $\mathrm{MSF}(G_\infty)$, then

$$N_n \xrightarrow{\;d\;} N \quad \text{and} \quad E[N_n] \to E[N] = 2, \qquad (27)$$

and, if $L_n$ denotes the sum of the lengths of the edges incident to the root of $\mathrm{MST}(G_n)$ and $L$ denotes the corresponding quantity for $\mathrm{MSF}(G_\infty)$, then

$$L_n \xrightarrow{\;d\;} L. \qquad (28)$$
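The statement $E[N_n] \to E[N] = 2$ in (27) has a simple finite-$n$ shadow: any spanning tree on $n$ vertices has $n - 1$ edges, so the degree of a uniformly chosen root has mean exactly $2(n-1)/n$. The sketch below is our own illustration (the helper name `mst_prim` is ours); it builds the MST of $K_n$ under the uniform model by Prim's algorithm and checks the average root degree:

```python
import random

def mst_prim(n, weight):
    """Prim's algorithm on K_n with edge weights weight[(i, j)], i < j.
    Returns the MST as a list of edges (parent, child)."""
    w = lambda i, j: weight[(min(i, j), max(i, j))]
    in_tree = {0}
    edges = []
    best = {v: (w(0, v), 0) for v in range(1, n)}  # cheapest link into tree
    while len(in_tree) < n:
        v = min((u for u in range(n) if u not in in_tree),
                key=lambda u: best[u][0])
        edges.append((best[v][1], v))
        in_tree.add(v)
        for u in range(n):
            if u not in in_tree and w(v, u) < best[u][0]:
                best[u] = (w(v, u), v)
    return edges

random.seed(3)
n = 50
weight = {(i, j): random.random() for i in range(n) for j in range(i + 1, n)}
mst = mst_prim(n, weight)
deg = [0] * n
for u, v in mst:
    deg[u] += 1
    deg[v] += 1
# Any spanning tree has n - 1 edges, so the average degree is 2(n-1)/n -> 2.
print(sum(deg) / n)  # 1.96 exactly for n = 50
```

The honest content of (27) is of course stronger: it gives the convergence of the full degree distribution of the root, not merely its mean.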

4.5 Closing the Loops

Theorem 1.2 is now a remarkably easy corollary of the PWIT Limit Theorem, the MST Convergence Theorem, and the exact PWIT calculation developed in Section 3. The first step is simply to make the link between $L^\alpha_{\mathrm{MST}}(G_n)$ and $Y_\alpha(G_n)$ more explicit.


Here it is useful to let $E_R[f(G_n)]$ denote the conditional expectation of $f(G_n)$ given $\{\xi_e : e \in e(G_n)\}$; in other words, we just average $f(G_n)$ over the possible values of the random root $R$. We now just compute

L'MST(C.) = dn'Y7(dn~,)c'E(e E MST(GJ)
e

= dne' (dnG)aR (e E MST(Gn)
2
v e

1 ' IV(Gn)l ER [1:(d~e)~(e E MST(Gn) and R(Gn) e)]
= 2 dn c

1 d' IV(GJ1 ER[ya(GJ]. 2 n

Finally, if one takes expectations in this representation one finds the basic identity

$$E[L^\alpha_{\mathrm{MST}}(G_n)] = \frac{1}{2}\, d_n^{-\alpha}\, |v(G_n)|\, E[Y_\alpha(G_n)]. \qquad (29)$$
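The computation above rests on a purely deterministic double-counting identity: summing $(d_n\xi_e)^\alpha$ over the MST edges incident to each vertex counts every edge exactly twice. A quick numerical sketch (our own, run on a toy random tree rather than a true MST, since only the tree structure matters for the identity) verifies the realization-wise form of (29):

```python
import random

# Double counting behind (29): summing (d * xi_e)^alpha over the tree
# edges incident to each vertex counts every edge exactly twice, so
# averaging over a uniform root R and rescaling recovers the power sum.
random.seed(5)
n, d, alpha = 8, 7.0, 0.5
# Toy tree on n vertices: each vertex v >= 1 attaches to a random earlier
# vertex; each edge e carries a random length xi_e.
tree = [(random.randrange(v), v) for v in range(1, n)]
xi = {e: random.random() for e in tree}

lhs = sum(x ** alpha for x in xi.values())  # the power sum over the tree

# Y_alpha seen from each possible root v, then averaged over v.
y = [sum((d * xi[e]) ** alpha for e in tree if v in e) for v in range(n)]
rhs = 0.5 * d ** (-alpha) * n * (sum(y) / n)  # (1/2) d^{-a} |v| E_R[Y_a]

print(abs(lhs - rhs) < 1e-12)  # True: the two computations agree exactly
```

Since the identity is exact for every realization, taking expectations is all that separates it from (29).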

Now, by the PWIT limit theorem we already know that $G_n \xrightarrow{\;d\;} T$, and by the MST Convergence Theorem this automatically entails $\mathrm{MST}(G_n) \xrightarrow{\;d\;} \mathrm{MSF}(T)$.
From the defining topology of local weak convergence, we then have

$$\sum_{e \in e(G_n)} (d_n \xi_e)^\alpha\, \mathbb{1}(e \in \mathrm{MST}(G_n) \text{ and } R \in e) \xrightarrow{\;d\;} \sum_{e \in e(T)} \xi_e^\alpha\, \mathbb{1}(e \in \mathrm{MSF}(T) \text{ and } R \in e),$$

or, in other words,
$$Y_\alpha(G_n) \xrightarrow{\;d\;} Y_\alpha(T). \qquad (30)$$
Now, if $\{Y_\alpha(G_n) : n = 1, 2, \ldots\}$ is uniformly integrable, we can take expectations in the limit (30) and apply our earlier calculation of $E[Y_\alpha(T)]$ to find

$$E[Y_\alpha(G_n)] \to 2\,\Gamma(1 + \alpha)\,\zeta(2 + \alpha),$$
but by the introductory identity (29), this is equivalent to

$$E[L^\alpha_{\mathrm{MST}}(G_n)] \sim d_n^{-\alpha}\, |v(G_n)|\, \Gamma(1 + \alpha)\,\zeta(2 + \alpha), \qquad (31)$$

so the direct part of Theorem 1.2 is complete.
The converse now comes almost for free. One first notes that we may reverse the path from the limit (31) to the convergence of the expectations $E[Y_\alpha(G_n)]$, so when one pairs this fact with the limit (30), the loop is closed by applying the following simple lemma.

Lemma 4.5 If a sequence of nonnegative random variables $X_n$, $n = 1, 2, \ldots$, converges to $X$ in distribution, then one has

E[Xn]+ E[X] if and only if {Xn : n = 1, 2,...} is uniformly integrable.


Proof. If we assume that the sequence $\{X_n : n = 1, 2, \ldots\}$ is uniformly integrable, then convergence in probability implies that $X_n$ converges in $L^1$, and this certainly implies the convergence of the expectations. For the converse, we first note that by the Skorohod embedding theorem (Dudley (1989), pp. 325-327) there is no loss of generality if we assume that $X_n$ converges almost surely to $X$. In this case, the nonnegativity and the convergence of the expectations imply that $X_n$ converges to $X$ in $L^1$ by Scheffé's lemma (Williams (1991), p. 55). Since $L^1$ convergence is stronger than the uniform integrability of $\{X_n : n = 1, 2, \ldots\}$, the proof of the lemma is complete. $\Box$
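A standard example shows why uniform integrability cannot be dropped in Lemma 4.5: $X_n = n\,\mathbb{1}(U < 1/n)$ with $U$ uniform on $[0,1]$ converges to $0$ in distribution, yet $E[X_n] = 1$ for every $n$. The sketch below (our own illustration) checks both facts by simulation:

```python
import random

# A classical non-example for Lemma 4.5: X_n = n * 1(U < 1/n) converges
# to 0 in distribution, but E[X_n] = 1 for every n, so the expectations
# do not converge to E[0] = 0; the sequence is not uniformly integrable.
def sample_X(n):
    return float(n) if random.random() < 1.0 / n else 0.0

random.seed(11)
n, trials = 1000, 200000
mean = sum(sample_X(n) for _ in range(trials)) / trials
mass_at_zero = sum(sample_X(n) == 0.0 for _ in range(trials)) / trials
print(mean)          # near 1, even though X_n -> 0 in distribution
print(mass_at_zero)  # near 1 as well: almost all the mass sits at 0
```

Here all the expectation escapes through a vanishing event of size $1/n$, which is exactly the behavior that uniform integrability rules out.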

4.6 An Illustrative Example
Theorem 1.2 tells us that the limit behavior of $E[L^\alpha_{\mathrm{MST}}(G_n)]$ is determined once one shows the uniform integrability of the sequence $\{Y_\alpha(G_n)\}$. The systematic treatment of this question will be left for another time, but one should note that this reasonably crude and qualitative property of $\{Y_\alpha(G_n)\}$ often follows from known results. Nevertheless, there are certainly many situations where uniform integrability fails or where the proof of uniform integrability can be subtle.
The example we consider here is illustrated in Figure 3, and it has also been used for illustration in Beveridge, Frieze, and McDiarmid (1998) and Frieze, Ruszinkó, and Thoma (2000). If one takes $G_n = C(m_n, K_n^{-1})$, then one can see just from Frieze's $\zeta(3)$ theorem that under the model of uniformly distributed costs one has

$$E[L_{\mathrm{MST}}(G_n)] \sim m_n\big(\zeta(3) + \tfrac{1}{2}\big) \quad \text{as } n \to \infty$$

for any choice of the sequence $2 \le m_n < \infty$. On the other hand, uniform integrability of $\{Y(G_n) : n = 1, 2, \ldots\}$ would imply that

$$E[L_{\mathrm{MST}}(G_n)] \sim m_n\, \zeta(3),$$

so in this case we certainly know $\{Y(G_n) : n = 1, 2, \ldots\}$ is not uniformly integrable. Nevertheless, one might want to check this directly, and, in fact, a moment's thought about the impact of a random root is all one needs to show

$$\limsup_{n \to \infty}\, E\big[Y(G_n)\,\mathbb{1}(Y(G_n) \ge t)\big] \ge 1 \quad \text{for all } 0 < t < \infty.$$

The situation is more interesting when one takes $0 < \alpha < 1$, in which case one may now show that

$$\lim_{t \to \infty}\, \limsup_{n \to \infty}\, E\big[Y_\alpha(G_n)\,\mathbb{1}(Y_\alpha(G_n) \ge t)\big] = 0,$$

so the sequence $\{Y_\alpha(G_n)\}$ is uniformly integrable. As a consequence, one obtains a positive result for $C(m_n, K_n^{-1})$ that asserts

$$E\big[L^\alpha_{\mathrm{MST}}(C(m_n, K_n^{-1}))\big] \sim \Gamma(\alpha + 1)\,\zeta(\alpha + 2)\, m_n\, n^{1-\alpha},$$

a fact which is perhaps more amusing when made more concrete. If one takes $m_n = \lfloor n^{0.99} \rfloor$ and $\alpha = 0.99$, then $\Gamma(1.99) = 0.995\ldots$ and $\zeta(2.99) = 1.204\ldots$, so one finds
$$E\big[L^{0.99}_{\mathrm{MST}}(C(m_n, K_n^{-1}))\big] \sim c\, n \quad \text{where } c = \Gamma(1.99)\,\zeta(2.99) = 1.1990\ldots$$
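One can also probe the example numerically. The sketch below is our own construction of $C(m, K_n^{-1})$ (the vertex-labelling scheme and helper names are assumptions made for illustration): it chains $m$ copies of $K_n$ minus an edge into a cycle via the dotted edges, assigns independent uniform lengths, and reports the MST cost per block, which exceeds the $\zeta(3) \approx 1.202$ that uniform integrability would predict because of the forced dotted connecting edges:

```python
import random

# Build the edge list of C(m, K_n^{-1}): m copies of K_n with the edge
# (0, 1) deleted from each block, chained in a cycle by "dotted" edges.
# Vertex k of block b is labelled b*n + k (a labelling of our choosing).
def chained_graph_edges(m, n):
    edges = []
    for b in range(m):
        base = b * n
        for i in range(n):
            for j in range(i + 1, n):
                if (i, j) != (0, 1):          # the deleted edge
                    edges.append((base + i, base + j))
        # dotted edge from this block's vertex 1 to the next block's vertex 0
        edges.append((base + 1, ((b + 1) % m) * n))
    return edges

def mst_length(num_vertices, weighted_edges):
    """Kruskal's algorithm with union-find; returns the total MST length."""
    parent = list(range(num_vertices))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    total = 0.0
    for w, u, v in sorted(weighted_edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            total += w
    return total

random.seed(2)
m, n = 40, 30
edges = chained_graph_edges(m, n)
weighted = [(random.random(), u, v) for u, v in edges]
print(mst_length(m * n, weighted) / m)  # per-block MST cost, above zeta(3)
```

Per block, the cost splits into the block's own MST (near the Frieze limit) plus the share of the $m - 1$ retained dotted edges, which is the extra mass that defeats uniform integrability in the $\alpha = 1$ case.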


$C(m, K_n^{-1})$ has $mn$ vertices and $mn(n-1)/2$ edges. $C(m, K_n^{-1})$ is regular with degree $n - 1$; a dotted edge is as honest as any other.

Figure 3: The graph $C(6, K_4^{-1})$ is built out of 6 copies of $K_4$ that have had one edge removed. These altered graphs, called $K_4^{-1}$'s, are then chained together in a cycle. The result is a regular graph with degree 3 and $6 \cdot 4 = 24$ vertices; in grocer's terms, one has a cubic graph with two dozen vertices.

5 Concluding Observations

As noted earlier, this is a report on work-in-progress, and there are many loose ends that time and diligence may suffice to resolve. Perhaps the most compelling questions that have been left open concern the monotonicity and concavity of $E[L_{\mathrm{MST}}(K_n)]$ under the uniform model. Next on the list would be the possible analog of Parisi's conjecture and an exploration of the relationship of $E[L_{\mathrm{MST}}(K_n)]$ to $S_n = 1 + 1/2^3 + \cdots + 1/(n-1)^3$ under the exponential model.
More generally, the exact formula (2) for $E[L_{\mathrm{MST}}(G)]$ provides one with considerable motivation to work out detailed representations of the Tutte polynomials for those graphs that are of most interest in probability theory. One also suspects that interesting consequences may flow from the interpretation of formula (2) in the light of general results for the Tutte polynomial. Specifically, one might speculate that results like the Negami Splitting Formula (Negami (1987)) could lead to interesting inferences. Finally, the appearance of the logarithmic derivative of the Tutte polynomial in formula (2) suggests that this rational function may have informative properties beyond those it inherits from the Tutte polynomial.
The questions that are left open from the second part of the report are less well formed. Certainly, the question of the uniform integrability of $\{Y_\alpha(G_n)\}$ deserves more systematic thought. Right now the easiest paths to uniform integrability freeload on the efforts of the more direct approaches to $E[L_{\mathrm{MST}}(G_n)]$, especially the recent arguments of Frieze, Ruszinkó, and Thoma (2000) that exploit the lovely bound of Karger (1999) on the number of approximately minimal cuts. Nevertheless, as the easy example of Subsection 4.6 suggests, the sequence $\{Y_\alpha(G_n)\}$ does have an independent character. One suspects that in time the direct investigation of its uniform integrability will lead to arguments that do not poach on other approaches.
Finally, the second part of the report suggests several overarching questions from the theory of local weak convergence. One major line of investigation that



surely deserves a sustained effort is the extension of the MST Convergence Theorem. There are many other classes of subgraphs from combinatorial optimization for which one expects an analogous result, and the class of confirmed examples is growing. Nevertheless, the final form of this theory is nowhere in sight.

ACKNOWLEDGEMENTS: It is a pleasure to thank David Aldous for continued coaching in the theory of local weak convergence and to thank Ira Gessel for providing an electronic extension to $n = 15$ of the table of Tutte polynomials for $K_n$ given in Gessel and Sagan (1996).

References

[1] D.J. Aldous (1992) Asymptotics in the random assignment problem, Probab. Th. Rel. Fields 93, 507-534.

[2] D.J. Aldous (2001) The $\zeta(2)$ limit in the random assignment problem, Random Structures Algorithms 18, 381-418.

[3] D.J. Aldous and J.M. Steele (2002) The objective method and the theory of local weak convergence, in Discrete and Combinatorial Probability (ed. H. Kesten), Springer-Verlag, New York. [in press]

[4] N. Alon, A. Frieze, and D. Welsh (1994) Polynomial time randomized approximation schemes for the Tutte polynomial of dense graphs, pp. 24-35 in Proceedings of the 35th Annual Symposium on the Foundations of Computer Science (S. Goldwasser, ed.), IEEE Computer Society Press.

[5] F. Avram and D. Bertsimas (1992) The minimum spanning tree constant in geometric probability and under the independent model: a unified approach, Annals of Applied Probability 2, 113-130.

[6] A. Beveridge, A. Frieze, and C. McDiarmid (1998) Minimum length spanning trees in regular graphs, Combinatorica 18, 311-333.

[7] R.M. Dudley (1989) Real Analysis and Probability, Wadsworth Publishing, Pacific Grove, CA.

[8] A.M. Frieze (1985) On the value of a random minimum spanning tree problem, Discrete Appl. Math. 10, 47-56.

[9] A.M. Frieze and C.J.H. McDiarmid (1989) On random minimum length spanning trees, Combinatorica 9, 363-374.

[10] A.M. Frieze, M. Ruszinkó, and L. Thoma (2000) A note on random minimum length spanning trees, Electronic Journal of Combinatorics 7, Research Paper 5 (5 pp.).

[11] I.M. Gessel and B.E. Sagan (1996) The Tutte polynomial of a graph, depth-first search, and simplicial complex partitions, Electronic Journal of Combinatorics 3, Research Paper 9 (36 pp.).

[12] T.E. Harris (1989) The Theory of Branching Processes, Dover Publications, New York.

[13] S. Janson (1995) The minimal spanning tree in a complete graph and a functional limit theorem for trees in a random graph, Random Structures and Algorithms 7, 337-355.

[14] D. Karger (1999) A randomized fully polynomial time approximation scheme for the all-terminal network reliability problem, SIAM Journal on Computing 29, 492-514.

[15] J.F.C. Kingman (1993) Poisson Processes, Oxford University Press, New York.

[16] P.A.P. Moran (1967) A non-Markovian quasi-Poisson process, Studia Scientiarum Mathematicarum Hungarica 2, 425-429.

[17] S. Negami (1987) Polynomial invariants of graphs, Transactions of the American Mathematical Society 299, 601-622.

[18] G. Parisi (1998) A conjecture on random bipartite matching, ArXiv Cond-mat 9801176.

[19] M. Penrose (1998) Random minimum spanning tree and percolation on the $n$-cube, Random Structures and Algorithms 12, 63-82.

[20] A. Rényi (1967) Remarks on the Poisson process, Studia Scientiarum Mathematicarum Hungarica 2, 119-123.

[21] J.M. Steele (1987) On Frieze's $\zeta(3)$ limit for lengths of minimal spanning trees, Discrete Applied Mathematics 18, 99-103.

[22] J.M. Steele (1997) Probability Theory and Combinatorial Optimization, NSF-CBMS Volume 69, Society for Industrial and Applied Mathematics, Philadelphia.

[23] D. Welsh (1999) The Tutte polynomial, Random Structures and Algorithms 15, 210-228.

[24] D. Williams (1991) Probability with Martingales, Cambridge University Press, Cambridge.

J. Michael Steele
Department of Statistics, Wharton School, University of Pennsylvania, Philadelphia, PA 19104
steele@wharton.upenn.edu

