WorstCase Greedy Matchings in the Unit dCube

Timothy Law Snyder

Department of Computer Science, Georgetown University, Washington,

DC 20057

J. Michael Steele

Program in Statistics and Operations Research, Princeton University,

Princeton, New Jersey 08540

The worstcase behavior of greedy matchings of n points in the unit dcube, where d ~~t 2, is analyzed. The weighting function is taken to be the a'th power of Euclidean distance, where 0 < a < d. It is proved that the asymptotic growth rate of the weight of such a greedy matching is exactly pn (d)l, where P is a positive constant that depends on the parameters a and d. Included in the analysis is a minimax theorem equating the worstcase behaviors of matchings resulting from greedy algorithms that, when ordering edges for the greedy process, break ties in different ways.

1. INTRODUCTION

The purpose of this paper is to determine the asymptotic behavior of the total weight of a worstcase Euclidean matching obtained by the greedy algorithm. Before reviewing the results that motivate this investigation, we will first make our problem precise and state the details of our main results.

A matching of the graph G = (V, E) is a subset M of the edges of G such that a vertex of V is incident with at most one edge of M. Since all our matchings will be subsets of the complete graph on V, = {Xl, X2, . . . 5 X0, a set of n points in the unit dcube, we will refer to a. matching as a matching of the point set Vn.

A simple heuristic for computing an approximate minimal matching of a given set of points is the greedy algorithm, which iteratively forms a matching M of the point set V, by initializing M as the empty set and considering the edges of the complete graph on V, in order of nondecreasing weight. At each iteration, a candidate edge e is placed into M if neither endpoint of e is incident with an edge of M. A matching formed using the greedy heuristic is called a greedy matching. In cases where there exist more than one choice of orderings of candidate edges in nondecreasing order, we will often find it beneficial to

NETWORKS, Vol. 20 (1990) 779800

(D 1990 John Wiley & Sons, Inc. CCC 00283045190/060779022$04.00

1

1

780 SNYDER AND STEELE

consider the specific ordering of candidate edges to be independent of the greedy process, in which case we will refer to applying the greedy algorithm to a given candidate edge sequence. Although the greedy heuristic can also be applied to form an approximate maximumweight matching in an obvious way, we shall always use the term greedy in the context of minimal matchings.

Let e = {xi, xj} denote an edge of the complete graph on V,. We will use 1 e to denote the usual Euclidean length J xi xj 1, and we will be concerned with the edge weighting function w defined by w(e) = 1 e 1, where 0 < a < d. Powerweighted edges are edges whose length is weighted according to the weighting function w. Our study of powerweighted edges is motivated by simulations in which squared Euclidean edge weights are used in lieu of Euclidean lengths. Such weights are often used to avoid the extra computational overhead of squareroot calculations. Although the case a = 1 is still of primary concern, the study of the more general case 0 < a < d has independent interest, and, in addition, this increased generality helps to illustrate some key features of our method.

In particular, one should note that the edge weighting function w typically fails to provide a metric on V, since the triangle inequality can fail to hold for all a =k 1. The features of w that turn out to be essential are that w is a homogeneous, monotone function of Euclidean length. Because of the key role played by rescaling and approximate selfsimilarity, homogeneity is critical.

Our fundamental object of interest is

pg (n) = sup max 1 1 e I': G is a greedy matching of Vn (1. 1)

V'C[O'lld G LEEG

IV~l=n

We think of pg(n) as the maximum possible total weight of a greedy matching of n points in the unit dcube. For E> 0, we will call a greedy matching having total weight at least pg(n) E an eworst case greedy matching with powerweighted edges. We will subsequently refer to an Eworstcase greedy matching simply as a worstcase greedy matching, and the set of points producing such a matching will be called a worstcase point configuration.

The internal maximum in the definition of p,(n) is a consequence of the fact that if there exist edges of the complete graph on V, that are of identical length, then there can exist more than one greedy matching of Vn.

Our main asymptotic result for pg is summarized in the following theorem.

Theorem 1. There exists a positive constant 8, depending on the dimension d 2 and the edgeweighting function w(e) = 1 e 1% where 0 < a < d, such that as n

p.(n) ,6,n (da)ld. (1.2)

This result provides the first determination of the exact asymptotic rate of

i i

i

i

WORSTCASE GREEDY MATCHINGS IN UNIT dCUBE 781

growth of the function p, and it also represents the first investigation of greedy matchings under powerweighted edges.

Some analogous results for the minimal spanning tree and for the optimal traveling salesman tour of n points in the dcube are given in Steele and Snyder [121, but the lack of minimality of a greedy matching and the fact that there can exist more than one greedy matching of a given set of points force one to tackle a number of new issues in the analysis of the greedy heuristic.

An issue that must be addressed is that of tied edges, or edges of identical Euelidean length in the complete graph on V,. Different lists of the edges of the complete graph on Vn arranged in order of nondecreasing Euclidean length can produce greedy matchings of significantly different weights, so we obtain a family of greedy matchings of Vn that depend on the resolution (or ordering) of the ties.

The definition of p, in (1.1) considers the worstcase weight of a greedy matching formed by an algorithm that breaks ties in the worst possible way, forming a maximumweight greedy matching of V,. We can also define the worstcase performance of an optimal greedy algorithm in the sense that the algorithm resolves ties so as to attain the cheapestpossible greedy matching:

A9 (n) = sup min 1 1 1 e I': G is a greedy matching of V,, (1.3)

VnC[0,11 d G ,EEG

IV"I=n

Fortunately, the resolution of ties turns out to be of no consequence for worstcase point configurations; this is indicated by the following result.

Theorem 2. For all n :2t 1, one has that

(n) = p. (n). (1.4)

All told, it may be somewhat surprising that the two quantities of Theorem 2 are equal. In addition to telling us that one has more than the asymptotic equivalence fig(n)pg(n) as n,,, the result tells us that the worstcase weighted matchings of all greedy algorithms are equivalent for all n ~: 1.

Previous work on greedy matchings has focused on bounding the quantity p,(n). For example, ~vis ~ll used a hexagonal lattice sphere packing to show that in d = 2 we have that p,(n) ~t 0.8474Vn, and by developing upper bounds on the minimum distances between points of the unit square, Avis [1] showed that p,(n) :5 1.074Vn_ + 0 (log n). It is essential to note, however, that these big0 and £1 bounds on pg(n) do not address the issue of convergence of pg(n)l n(da)ld to a constant.

The only known exact asymptotic results for greedy matchings is the probabilistic result of Avi~ et al. [31. If we let Gn be a greedy matching of

d

{Xl,X2,. .. Xn}C[O, 11 ' where the Xi are independent, identically distributed

random vectors and where d ~ 2, then Avis et al. established the following:

i

i

i 1

782 SNYDER AND STEELE

Theorem. With probability one as n one has

G, C (d)n (dl)ld fRd [f(X)Ild 1,1d dx. (1.5)

Here, c(d) is a constant depending on the dimension d and f is the density of the absolutely continuous part of the distribution of the Xi, where 1 :5 i 25 n. We note that the rate of growth of Gn in this theorem is similar in form to the rate of growth of pg in our Theorem 1, and, indeed, the existence of this probabilistic result motivates our investigation of the corresponding worstcase problem.

The survey of matching heuristics given by Avis [21 goes further into the history and motivation of our problem than we can here, but we should mention some salient aspects of earlier work. Euclidean greedy matchings have been used in applications such as plotting a graph using a mechanical pen plotter [2,8,101 and as a heuristic in Christofides' algorithm for finding an approximate traveling salesman tour of n points whose edges satisfy the triangle inequality [41. For graphs with large numbers of vertices such as those from applications in VLSI, the wellknown 0(n') minimal matching algorithm of Edmonds [51 and even the 0(n 3 ) refinement of Gabow [61 and Lawler [9] are too slow for large numbers of points, making faster heuristics desirable.

In what follows, we will at all times focus on worstcase results for points in [0, 11', without any probabilistic assumptions. In the next section, we spell out the precise properties that are needed by pg in order to establish the asymptotics reflected in Theorem 1. The heart of the matter is to show that pg satisfies the approximate recursion relation (2.1(b)) of Lemma 2.1. The first step of this verification is the construction of a point set for which a greedy matching nearly obtains the worstcase weight; this is accomplished in Section 3, and it leads to a preliminary recursion that works together with the geometric inequalities of Sections 4 and 5 to verify the hypotheses of Lemma 2.1. Section 4 bounds the maximum number of relatively long edges of a greedy matching in the unit dcube, and Section 5 bounds some incremental rates of growth of pg. These ingredients are brought together in Section 6 to prove Theorem 1.

The proof of Theorem 2 is handled in Sections 79, and Section 10 offers concluding remarks, including some comments on open problems.

2. TARGET RECURSIONS

Before attacking the analysis of pg, we need a couple of general lemmas that are not directly related to matchings. Although the first lemma looks technical, it tells us just which properties of p. must be established. The first such target recursion, relation (2.1(a)), expresses a slow incremental rate of growth condition, whereas the second, (2.1(b)), guides the majority of the subsequent work.

Lermna2A. If p(l) = 0, 0< a

i

1

i

i

WORSTCASE GREEDY MATCHINGS IN UNIT dCUBE 783

(a) p(n + 1) s p(n) + cn"d and

(b) Md~p(k) Md~k(da)ld r.(k) p(mlk),

where r. (k) > 0 as k then as n > ,

p(n) _j8n (da)ld

for a constant P 0.

Proof. From the hypothesis (2.1a), we first note that for 1:5i

j1

PG) P(i) 1 {p(k + 1) p(k)}

k=i

f1

:5c 1 k`d (2.2)

k=i

' 1

:5 c f Xald dx

:5C1 d {j(da)ldi(da)ld},

da

where the constant c' depends on a. Letting i = 1 and j n in Inequality

(2.2) and using the fact that p(l) = 0 shows us that p(n) satisfies

p(n):Sc'(d/(d a))n (d)1d, so if we define

T(k) = p(k) (2.3)

k(d.)ld' we see that T(k) is bounded. This allows us to set

y = lim sup T(k) < (2.4)

k

Dividing Inequality (2.1(b)) by Md ak(da)ld gives us the simplified relation

*(k) r,,,(k) 5 T(Md k) (2.5)

for all m ~t 1 and k ~~t 1. Now, given any positive real number E, we can choose a k, such that e :5 T(kj and r,, (k,) :5 E, from which (2.5) becomes

2E:5 *(Md kj (2.6)

for all m 1.

784 SNYDER AND STEELE

We now consider the intervalsj,,:5 n:5j,,,1, wherej, = Md k,,. To bound the absolute difference IIP(n) *(jj, we first use Inequality (2.2) to check how much p(n) can differ from p(j,): sup 1 p(n) p(j~)i 1 p(jn,l) p(j.)1

j:5 n:sj~+1

cl d k (da)ld {(m + 1)d da}. (2.7)

da ~

Using the mean value theorem, we then bound (M + 1)da M1a, obtaining two cases that depend on the size of the parameter a, namely,

(M + 1)da M d (d a)(m + 1)dla, if a:5 d 1; (2.8)

(d a) Md1a, if a > d 1.

We can then express the bound (2.7) in terms of to get:

1*(n) *(j.)1 I p(n) p(jn)l

j(d.)ld

M

c'd 1_ (M+, da if a:s d 1;

M M (2.9)

c'd 1 , if a > d 1

M

:5 2de'd 1

M

for j.:5 n j~+ 1, where we have bounded (M + 1)d~1Md~ by 2 d. Hence,

2 dC 'd 1 *(n) T(j~) (2.10)

M

for j,:5 n:5j,,,+1.

Returning to (2.6), we now see that

2dcV 1 2E T(n) (2.11)

M

for any jm::5 n j, + 1. Thus lim infk T(k) ~t y 2E, and since e > 0 is arbitrary, we can conclude that

y:5 lim inf T(n), (2.12)

n_.

which implies that T(n) > y as n and completes the lemma with P

WORSTCASE GREEDY MATCHINGS IN UNIT dCUBE 785

Our next lemma tells us that from any n points in [0, ill we can always find a pair of points that are relatively close together. Even though this is a wellknown fact, we present a brief proof here for completeness.

Lemma 2.2. There is a constant Cd such that for any {xl, x2.... y Xn } C [0, 11d, where n ~t 2, one has

1 xi xi Cdn 11d (2.13)

for some xi and xj, where 1 :::: i < j n.

Proof. We proceed using a simple packing argument. Cover each point xi with a dball of radius r and center xi. Such a ball has a volume of wdrd, where wd is the volume of the unit dball. Suppose that all these balls were nonintersecting. Then, each ball would cover at least 2 d £odrd of [0, 11d (even if the ball was centered at a corner of the hypercube). The n balls would cover at least 2 d 0)drdn. Since [0, 11d has unit volume, we have 2 d £Odrdn 1. Hence, Xi xj 1:5 4(od 11d n "d for some xi and xj, where 1 ::~ i < j:5 n. This proves the lemma with Cd = 4wd 11d.

One can improve the constant of this lemma, but the simply derived constant Cd is sufficient for our purposes.

In the next section we partition the dcube to construct a point set that attains a nearly worstcase greedy matching. This will give us an initial recursion that is a first stab at the second target recursion of Lemma 2. 1.

3. CONSTRUCTION OF A NEARLY WORSTCASE GREEDY MATCHING

The asymptotic analysis of p. will be obtained by giving a recursive construction that demonstrates that pg satisfies the second target recursion of Lemma 2. 1. Although the construction itself is easy to lay out, the verification that the construction suffices will require a rather detailed understanding of the combinatorial geometry associated with Euclidean greedy matchings.

We begin the recursive construction by dividing the dcube Q = [0, 11d into m d equally sized cells Qj, where 1:5 i:5 Md and each cell has side length 11m. The union of the boundaries of the Qi will be denoted by H; in other words,

d

H= U mi= 1 a Qi. Also, for & satisfying 0 < 8 < 11m, let H& denote the set of points of [0, 11d that are within 8/2 of H; in other words, H' is the grating H fattened to a width of 8. From this, we can define a subcell & of each Qi by &

Qi H8.

Now, by the definition of pg (k), for all E> 0 there is a set of k points S= {Xl, X2, . . . , Xk} in [0, 1]d for which the greedy algorithm yields a matching of total weight at least pg(k) e. By scaling and translation, we can therefore produce for each 1 :5 i:5 Md a set Si C QP of k points such that the greedy algorithm applied to Si can produce a matching of weight at least (m' 8)(pg(k) E), since each side of & has width (m` 8). To do so, the

i

i

i i i

i 1 1

786 SNYDER AND STEELE

greedy algorithm must break ties to form an ordering that yields a maximumweight greedy matching on the point set Si. Let Mi denote this greedy matching. Next, we will construct a specific greedy matching M of the mlk points

U_d

i=l Si that is closely related to the matchings Mi of the individual Si. The proof that pg satisfies the second target recursion of Lemma 2.1 will come from building a relationship between the total weight of M and the total weight of the collection of the Mj, where 1:5 i:5 Md.

We form M by applying the greedy heuristic to a special ordering 0 of the edges of the complete graph K on the m'k points in [0, 11'. To begin the definition of ~, we first place the edges of K in order of nondecreasing Euelidean length, and we resolve any ties that may occur by the following sequential process.

The first step in the tie resolution process focuses on the edges that lie entirely within subcells; in other words, all edges e such that e is contained entirely in the union Ui=d1T. For any set of edges of K that have the same length A, i.e., for

& = {e E K: 1 e A, e C for some 1:5 i:s m'},

the edges of Ekare first ordered according to the index of the subcells in which they are contained. Thus, the edges of Q11 appears first, the edges of Q21appear second, and so on, concluding with the edges of Q'd.

M

The ordering o, of the edges of K will be completely determined once we specify an ordering of E, n {e: e C Q.~} for each 1:5 i 5 m' and for each A such that A= {1 e 1 : e is an edge of K}. To do this, we recall that each matching Mi of Si C & is precisely the greedy matching corresponding to a given ordering o,i of {e: e C Q.~}. Since elements of E, n {e: e C QP} can appear in the same order in u as they do in aj without violating the nondecreasing ordering of 0, we choose for each i and each A the ordering of Ekn {e: e C QP specified by o.i to complete our specification of a.

The main consequence of this construction is that if e and e' both are edges

U_d of the complete graph K on i= 1 Si, and if e C & and e' C Qi@ for some 1 j:5 Md, then e precedes e' in the ordering a if and only if e precedes e' in the ordering oi. In addition, if e C & and e' C Qj' for i ~ j, then if 1 e 1 < 1 e'i, we have e preceding e' in the ordering a. Finally, if e C Q,~ and e' C Qj3 for i :k j and if 1 e 1 = 1 e'i, then e precedes e' if and only if i < j.

We now consider the process of simultaneously forming the matchings Mi and M using the orderings ui and o, respectively, where 1 s i:5 m'. Consider first a smallest edge e of o, and note that if 1 e < 8, then e must belong to M and also to some Mi. Proceeding successively, we note that any edge e that satisfies 1 e 1 < 8 and is chosen for some Mi must also be chosen for M, by the construction of the ordering a. Consequently, we arrive at the basic inclusion

Umd UMd

i=,MiCMU {e:eEE i.1Mi and jel ~t 8}, (3.1)

which tells us that the associated edge weights must satisfy

WORSTCASE GREEDY MATCHINGS IN UNIT dCUBE 787

m d(M1 3)~(pg(k)E):5 1 lei+ 1 lela:5p,(Mdk)+ 1 lei, (3.2)

eE=M eEF cEEF

where F = U_d {e: e EE Mi and 1 e 8}. The second, inequality in (3.2) comes from the fact that p'(Mdk) is an upper bound on any greedy matching Of Md k points in [0, 11d.

To turn the key recursion (3.2) into the inequality that can serve to justify the second target recursion of Lemma 2. 1, we need to find an upper bound on the total weight that a greedy matching can assign to its long edges, such as the members of E This is the task of the next section.

4. A BOUND ON LONG GREEDY EDGES

In this section, lemmas leading to a bound on the number of relatively long edges of a greedy matching in [0, 11d are developed. The first lemma shows that there is an a priori bound on the total weight of any subset of k edges of a greedy matching.

It will be convenient henceforth to use L.(E) to denote the sum of the weights 1 e 1' for all e E E, where E is any subset of the edges of the complete graph on V,.

Lemma 4.1. There is a constant C3 such that any subset E of the edges of a greedy matching Of {Xl, X2,. . , X0 C [0, 1]d satisfies

L.(E):s c31E1 (da)ld. (4.1)

Proof. Let s = In/21. We label the edges el, e2, . e, of a greedy matching M by the order in which they are chosen by the greedy algorithm that forms the matching. If we let k = 1 E 1 and eil , eh, ... 5 eik be the edges of E, then by the'relabeling of the ei we have that

k

L~(E) 1 ei, 1:5 lei,il (4.2)

j=1

For 0 25 i:5 s 1, before ei, 1 is chosen by the greedy algorithm for inclusion

into M, there are exactly n 2i points Of {Xl, X2, X0 yet to be matched.

By Lemma 2.2, we thus have

S1

iei+l la:sca (n2i) ald

2

isk

k

:s c2 (2i) ld

k

:5 c2a2 ald ( 1 + fl X ald dx) (4.3)

(da)ld

:5 C3k

i

i i

i i

788 SNYDER AND STEELE

where C3 = dl(d a) 21dC.

2.

Although we will not make use of the following fact, one should note that Lemma 4.1 cannot be significantly improved. In particular, if n points of [0, 11d are laid out in a regular hexagonal lattice, then any set of k edges of any matching will have weight on the order of kn ald' which is fl(n (d)1d) when k is of order n.

Our next lemma bounds the number of edges of Euclidean length x or greater in a greedy matching of n points of the cube. We first define v,(x, n) to be the largest j such that there exists some point set Yn = {Xl, X2.... ~ Xn} having a greedy matching with j edges greater than or equal to x in Euclidean length, i.e.,

vg(x, n) = max maxl{e E M: 1 e 1:~: x and M is a greedy matching of Vjj. V~C[O,lld M

IV.1~n (4.4)

The internal maximum, of course, is necessitated by the possibility of more than one greedy matching of a given point set V,.

An intriguing fact is that P,(x, n) can be bound independent of n, the number of points.

Lemma 4.2. There is a constant C4 such that, for all x > 0,

vg(x, n) :5 C4X d. (4.5)

Proof. Let M be a greedy matching Of {Xl, X2.... ~ Xn}, and set Om(x) =

l{e EE M: 1 e 1 ~t x}l. Label the edges of M having length x or greater in the order of their selection by the greedy algorithm as el, e2, . eom() 7 i.e. ,

el 1 :S 1 e2 1 :5 ... :S 1 e,5m(.) 1.

By the nature of the greedy algorithm, for any l:s i:s Om(x), we can choose an endpoint p of ei so that a sphere of radius x centered at p contains no members of the set of endpoints of the ep where i + 1 .s j:5 (pm(x). Therefore, if we consider spheres of radius x/2 for all ei, where 1:5 i:s om(x), then none of the spheres can intersect.

If o)d is the volume of the unit sphere in [0, 1]d, then we have (Pm(x) spheres, each of volume cod(x12)d. Consider the sphere associated with the edge ei, and let Ai be the portion of sphere that intersects with [0, 11d. If g(Ai) is the volume of Aj, then 10m(X) ii(Ai) ~ OM(X)Wd2d(X12)d, where the quantity on the right assumes that each Ai is centered at the corner of the dcube. This yields OM(X) Wd2d(X12)d 1, which proves the lemma with C4 = 4dl£od.

Using this bound on the maximum number of long edges along with the bound of the previous lemma on the total weight of any subset of k edges in a greedy matching, we are now in a position to exploit the relationship between M and the Mi given by Inequality (3.2). To bound the total weight of the edges

WORSTCASE GREEDY MATCHINGS IN UNIT dCUBE 789

in the set F= U" {e: e E Mi and 1 e 1 ~t 8}, we note that Mi is a matching of k

i~'

points in a cube of side length m` 8. By scaling, we see from (4.1) that

1 lel~C3(M'6).1{e:eEMiandlel :::m. 6} 1 (d a)ld. (4.6)

eE=Mi

lel:~8

Now, if we let A,(x, n) be the maximum number of edges of length at least x in a greedy matching of n points in [0, t11, then we have the identity

f~,(tx, n) (x, n). (4.7)

Letting t = m 8 and x = 81t, we see from (4.7) and the bound of Lemma 4.2

that

1{e: e E= Mi and 1 e 1 ~: 8}1 :s Ag (8, n)C4 8d(M1 _ 8) d'

so (4.6) becomes

1 lela:5C3C~ a)ld(M1 8)da8ad. (4.9)

eeMi

lel~8

Using (4.9) to bound 71,1 F 1 e 1, we can now simplify our key recursion (3.2) to obtain

m d(M1 8)a(p.(k) E) C3C5~ a)ldMd(M1 8)d8d pg(Md k), (4. 10)

and, since 8 and E are positive, we conclude that

m d(M1 8)apg(k) C3C5~ a)1d8ad pg(Md k). (4.11)

In Section 6 we will show that 8 can be chosen so that the relation (4.11) satisfies our second target recursion. The next section deals with the incremental rate of growth of pg in order to satisfy our first target recursion.

5. LEMMAS ON GREEDY INCREMENTAL GROWTH

The first of the two target recursions required by Lemma 2.1 is a bound on the incremental rate of growth of the function pg. The next lemma establishes an incremental rate of growth "by two's," while allowing for a nonmonotonicity of pg.

Lemma5A. There is a constant cs such thatfor all 0

pg(n + 1) t5 pg(n 1) + csn ald, for all n ~t 1; (5.1)

790 SMER AND STEELE

and

pg(n + 1) :s pg(n), for all n:=t 2, where n is even. (5.2)

Proof. Let S be an arrangement of n + 1 points in [0, 11d such that a greedy matching M,, of S has weight La(Mn,1) zi p, (n + 1) e, where E > 0. Let {xi, xj} be a shortest edge of the complete graph on S. By Lemma 2.2 and the nature of the greedy heuristic, {xi, xj} EE Mn + 1 and Ixi xj 1' 5 c21 (n + .1d an~ald.

c2

To prove Inequality (5.1), delete both xi and xj and obtain the matching M, which is Mn+l with the edge {xi, xj} removed; we then have that

p. (n + 1) E L a (M) + 1 xi xj 1 a:5 La (M) + c2 n ald. (5.3)

In addition, since {xi, xj} is an edge of minimum length, M is a greedy matching of {X,, X2, . . , Xi 1, Xi+ 1, . . . , Xj 1, xj+ 1, . . . , x,}; hence, we have that La(M) :5 pg (n 1). Inequality (5.3) then becomes p, (n + 1) 6:5 p. (n 1) + c2`n ld . This is true for any E > 0, proving (5. 1) with c5 = c2a.

To prove (5.2), we first suppose that n is even and again let S be an arrangement of n + 1 points with a greedy matching Mn,l of weight La(M,1) k pg(n + 1) E, where E > 0. Since n + 1 is odd, then remains exactly one exposed point x* of S following the formation of M,,. We delete this exposed point and form a greedy matching Mn on the remaining points of S such that Mn satisfies La(Mn) = max{La(M): M is a greedy matching of S {x*}}.

Now, since there exists an ordering o of the edges of the complete graph on S {x*} such that the greedy algorithm pplied to u produces M,,+, (this can be accomplished by deleting the edges incident with x* from the ordering used to form Mn+l on S), we see that La(M,_J:5 La(MJ. Consequently, we have

pg(n + 1) E La(M,1) :5 L~(Mn) pg(n). (5.4)

Since (5.4) is true for all e > 0, the proof is complete. a

We remark that the case distinctions of Lemma 5.1 are unavoidable since p,(n) is not a monotone sequence.

Because Lemma 5.1 only gives us an incremental rate of growth of the function p. "by two's," we will later see that we can apply Lemma 2.1 only to greedy matchings with powerweighted edges of sets of points of even cardinality. This problem is resolved by the following lemma, which allows us to compare the weight of a greedy matching of a set of points with the weight of a greedy matching of the same set of points augmented by an additional point.

LemmaS.2. Let M,,+, and M,, be greedy matchings using the weighting function w (e) = 1 e 1' Of Sn + 1 = {Xl, X2.... ~ Xn+l} and S. = {Xl, X2, XJ, respectively, where xi E= [0, 11d for 1 :5 i:5 n + 1. Then,

1 L. (M,, + 1) La (Mj 1:5 d~n. (5.5)

Proof. Order and label the S = (n, 1) edges of the complete graph On Sn11

2

so that 1e11:5 1e21 ':5 lesi.

WORSTCASE GREEDY MATCHINGS IN UNIT dCUBE 791

We can iteratively and simultaneously form both M,, and M, by considering each edge ei, where i = 1, 2,. . . , s, for inclusion in both M,,, and M,,. We begin by initializing M,, and M, as empty sets. If both endpoints of ei are exposed in M,,, then ei is added to Mj; in addition, if both endpoints of ei are exposed in M, and neither endpoint is the point x,+,, then ei is added to M..

On completion of this algorithm, we claim that exactly one of the following properties is true for each ei:

(1) ei EE M,, and ei E= M,;

(2) ei E~ M,, and ei E~ M~; or

(3) ei belongs to a monotone alternating path of M, and Mn originating at X,,+i, where a monotone alternating path P of Mn and M,, is a path of length at least one whose edges are monotonically nondecreasing in length and alternate from membership in one matching to membership in the other as each edge of the path is traversed. The originating edge of P with x,+, as an endpoint is an edge of minimum length in the path. (For a more formal definition, see Snyder [111)

We prove the claim by induction on i. For i = 1, if Xn11 is not an endpoint of el, then Property (1) is trivially satisfied. If xl is an endpoint of el, then el is itself an alternating monotone path originating at Xn+l and Property (3) is true, so we assume the claim is true for all ej, where j < i, and consider the edge ei.

If neither (1) nor (2) is satisfied, then we have two possibilities. In the first case, suppose ei E= M, and ei T: Mn + 1. Edge ei can fail to be in Mn + 1 only if one of its endpoints is the endpoint of some edge ej E Mn + 1. By the nature of the greedy algorithm, j < i and 1 ej 1 :s 1 ei 1, and by the inductive hypothesis, ej is an edge of a monotone alternating path of M,, and Mn+l originating at x,,+,. Hence, ei satisfies Property (3).

The second case occurs when ei E~ M, and ei E= M,,. One way this can happen is when x,+, is an endpoint of ei, in which case (3) is trivially satisfied. The other way in which this can occur is when x,,+, is not an endpoint of ei. In this case, (3) is satisfied using the same argument we used in case one, proving the claim.

To complete the proof, we note that the only edges contributing to IL, X

(Mn+l) L.(Mn)l are those satisfying Property (3). Since by definition the monotone alternating path P originating at Xn+l must contain at least one edge of M,,+,, there can be only one such path by the definition of a matching, and since P is monotone,

I L.(M,,+1) L.(Mn)l lei~ 1 jeja

eE=P eEEP

eem~+i eE=M~

:5max{iel : eE=P}5d"2, (5.6)

completing the proof. E

792 SNYDER AND STEELE

A lemma that is closely related to Lemma 5.2 with a= 1 is given in [3], where it is used in a probabilistic context.

6. GREEDY ASYMPTOTICS

We now use the combinatorial and geometric lemmas thus far established to show that p, satisfies the target recursions of Lemma 2.1. We first note that since 8 < 1/m, we have that 8~d < M a 8d. Inserting this bound into the key relationship (4.11) and expanding the quantity (m' 3)~ gives us

m da p, + O(Md+2

,(k) M"'~8ap,

~(k) 32) pg (k)

da '8d p, (mdk). (6.1)

C3C4 m

We then make the choice

8 = 1 a 11(d+ 1Wa d)ld(d+l) (6.2)

m

to show that there exists a constant C6 such that

mdap.(k) C6Mdaa dl(d+ 1) k(d )1(d + 1) < pg (Mdk) (6.3)

for all m and k greater than or equal to one. Since 0 < a < d, we see that our choice of 8 satisfies the requirement that 0 < 8 < 1/m. Moreover, the recursion (6.3) satisfies the second target recursion of Lemma 2.1 with the remainder function r.(k) = C6 adl(d+ 1)k(ad)ld(d+l), which goes to zero as k > since a is fixed and a < d. This completes our main task of satisfying the principal requirement of Lemma 2.1.

All that remains is to establish the incremental rate of growth condition, target recursion (2.1(a)). Since Lemma 5.1 gives an incremental rate of growth "by two's" for p, we define the function ip, where tP(n) = p,(2n), and show first that ep satisfies (2.1(a)) in addition to the main target recursion (2.1(b)). We note that qi is just P. restricted to the even integers. Exchanging 2k for k in Inequality (6.3) yields, in terms of qj,

md.tp(k) C6 Md aadl(d+i) (2k) (da)l(d+1).::Z lp(Md k), (6.4)

so ip satisfies the second target recursion with the *slightly modified remainder

(da)l(d+ dl(d+ ad)ld(d+l)

function r,(k) = 2 1)C6a 1W We next write the incremen

tal growth inequality of Lemma 5.1 as

p. (2n + 2) :5 pg (2n) + cs (2n + 1) &d, (6.6)

or in terms of 0,

tp(n + l):5 #P(n) + 2Idc5nald. (6.7)

WORSTCASE GREEDY MATCHINGS IN UNIT dCUBE 793

Using Lemma 2.1, this proves that

tP(n) ~gn (d .)ld (6.8)

as n for some constant ~g = ~g (d, a).

Finally, to prove Theorem 1 for the function pg, all we need to do is apply Lemma 5.2. Let {x,, x, . . . Y X2n } C [0, 11d be a set of points with a greedy matching that achieves a total weight greater than or equal to p,(2n) E, where E > 0. If we add an additional point and form a greedy matching M of the 2n + 1 points, then from Lemma 5.2 and the definition of p,(2n + 1), we have that

pg(2n) E d'12 L~(M) pg(2n + 1). (6.9)

Also, since E> 0 and since 2n is even, Inequality (5.2) of Lemma 5.1 tells us that

pg(2n) d&2 pg(2n + l) :5 pg(2n). (6.10)

Applying the relation (6.8) to pg(2n) and letting n go to infinity proves that

pg(n) P9n (d a)ld as n > , concluding the proof of Theorem 1. E

A final issue that needs to be resolved is a comparison of the quantities fig(n) and pg(n). By using a special tiebreaking perturbation of point sets of [0, 11d, the next three sections prove Theorem 2, which tells us that, remarkably, the sequences fig(n) and pg(n) are identical.

7. BLOCKS AND LEGAL SHUFFLINGS OF EDGES

One obvious benefit of our analysis of pg in the preceding sections is that it gives us the exact asymptotic weight of the worst possible greedy matching of n points in the unit dcube. If we recall the definition

pg(n) = sup max{L,(G): G is a greedy matching of VJ, (7.1)

V~C[O,lld G 11V.17.

we see that pg(n) represents a supremum over all possible point sets of a greedy matching of n points that is formed by using a "worst possible" list of candidate edges for processing by the greedy algorithm. In other words, pg can be associated with an algorithm that always yields a maximumweight greedy matching of a given set of points.

In practice, this may not be the case since ties among lengths of edges in the candidate list are usually resolved in an arbitrary or random manner. Let us suppose, then, that we instead use an optimal greedy algorithm, i.e., one that resolves ties such that we always obtain a minimumweight greedy matching. A natural question to ask is, What is the worstcase performance of this algo

794 SNYDER AND STEELE

rithm over all nsets? The definition of fl,(n) embodies this scenario by replacing the inner maximum in the definition of pg(n) with a minimum.

The next lemma will be useful in resolving this issue. It guarantees that we can perturb any point set having a greedy matching M to obtain a point set with a greedy matching that is virtually identical to M yet exhibits no tied edges, as is expressed by the properties (i), (ii), and (iii) below. Despite the intuitive simplicity of these properties, the proof of the lemma is much more delicate than we would expect at the outset.

Lemma 7.1. For any E>0 and any greedy matching M of S=

{Xl, X2, . . . , X~} C: [0, 11', there exists a point set S' = {xl, x2 x'} C [0, 11d

with an ordering o' of the edges of the complete graph on S' such that, if the

greedy algorithm applied to o,'forms the matching M', then

(i) The edge {x ~,, x ~} E1 M' if and only if the edge {xi, xj} E: M; (ii) L~(M') = minG{L. (G): G is a greedy matching of S% and

(iii) 0:5 L~(M) L~(M') < e.

Proof. Let K= (S, E) be the complete graph on the point set S, and let Om be an ordering of the edges of E that gives rise to the greedy matching M. There is a k such that we can partition o into k contiguous sublists or blocks o,l, o2, . . . , ok such that, for any edges e E oi and e' EE oj, where 1 :5 i:5 k and 1 t~ j:5 k,

I e I < I e'i, if i

jel = le'i, if i = j; and (7.2)

jel > le'l, if i>j.

Thus, we have a block for each distinct Euclidean edge length of the edge set

E, and, within any block, all edge lengths are equal. In each block, we mark

the edges that belong to the greedy matching M and say that the edges not

belonging to M are unmarked. Next, for each i such that 1 s i:~ k, we label the

marked edges of the block oi in the order in which they appear in the block as

eil , ei2 , ... , ei.(,,), where m ((Ti) is defined to be the number of marked edges

block 1 block 2 block k

1C111. 1e321 ... 1 cl,.(") 1 ... 1 ... 1C21 IIC221 ... 1C2,.(,)1 ... 1 ... _1"_'1Ckli ... ICk21 ... ICk.(,,)

FIG. 1. The ordering 47 and its marked edges.

WORSTCASE GREEDY MATCHINGS IN UNIT dCUBE 795

block 1 block 2 block k

1C331 ei21 lel.(.,)1 11 e21 1 e22 J 1e2,.(.,)1 1 ICkliek21' 1 ekm(&7i) 17, .1

U2 ak

FIG. 2. The ordering o,'; marked edges of o, appear first in each block.

of oi; in other words,

m (oi) e: e EE ai and e E M}1. (7.3)

Figure 1 illustrates the ordering a and its marked edges.

We now investigate ways of reordering the edges of o to form a new ordering o' without affecting the matching obtained when processing these lists of candidate edges with the greedy algorithm. First, define omi' to be the ordering ei, , ei2 , . . ei.(,) followed by the set of unmarked edges of uj, where the latter set is arranged in an arbitrary order as shown in Figure 2.

Now, all the marked edges that preceed a given unmarked edge e of 0, still precede e in the new ordering u' = ul, o2 . . . . . uk, and since the relative orders of the marked edges in o, and o' are identical, the matching obtained by applying the greedy algorithm to o' will contain only marked edges. This means that the greedy algorithm applied to o' will still form the matching M. This fact remains true for any permutation of the set of unmarked edges of each oi, where 1 :5 i:5 k. In addition, it remains true even if we reassign the weights of the marked edges in such a way that the order of a' is retained without violating the nondecreasing order of the edge weights, This fact will prove to be useful in the next section, in which we investigate a perturbation that satisfies the three properties of Lemma 7.1.

8. A PERTURBATION THAT BREAKS GREEDY TIES

We now set out to perturb the point set S to form the point set S'

{xl, x2 x,'}, where, for 1:5 i 25 n, the point x ~, EE S' is the point xi EE S after

the perturbation. For any edge e = {xi, xj} of u, we define e' = We also

let the gap Ajj, l be the absolute difference between the Euclidean edge lengths

corresponding to the two adjacent blocks iTi and uj, 1, i. e., if e EE oi and j EE oi, 1,

then for 1:5 i s k 1,

Ajj+1 e > 0, (8.1)

and, if EE o1, we define

A0,1 e > 0. (8.2)

796 SNYDER AND STEELE

We then define A to be the minimum gap by setting

A = min Ajj, l. (8.3)

O~i~sk 1

Note that, by the definition of AO,1, the minimum gap A is no larger than a minimumlength edge of o.

We are now ready to describe the perturbation. Consider the block ai of the order o,'. For each marked edge eij, where 1 j:s m(oi), for notational convenience we relabel the endpoints of eij so that eij = {yj, zj}. We then perturb the points zj according to the following rule:

For all j satisfying 1 :5 j:s m (oi), move zj toward yj a distance of

1 j ) 8( E), (8.4)

8ii = ( m(oi) + 1

where the parameter 8(E) satisfies 0 < S(E) < A and is to be chosen later.

We can note that our entire perturbed point set S' = {xl', x2 x,'} is con

tained in the unit dcube since the perturbation is accomplished by moving

points a strictly positive distance along the marked edges of the complete graph

on S, each of which is contained entirely in [0, 11d. We will momentarily see

that no point moves too far.

We now can check that performing this perturbation scheme for all i, where 1:5 i:5 k, will guarantee properties (i)(iii) of our lemma. First, to prove properties (i) and (ii), we note that as j ranges from 1 to m(oi), the quantity 1 jl(m(oi) + 1) ranges from m(omi)l(m(ai) + 1) to 1/(m(oi) + 1); hence, it is always strictly positive and less than one. Since we multiply this quantity by 8(E), which is less than the minimum gap A, we are assured that (1) for any edge eij E uj, the Euclidean length 1 e~,. 1 remains greater than the length, after the perturbation, of any edge belonging to the preceding block 0j1; and (2) since A is at least as small as an edge of minimum length, the distance moved by the point zj never exceeds the length of the edge ieiji. This verifies that S, C [0, 11d . Furthermore, since the reduction in edge length accomplished by the perturbation decreases as j increases, we see that we have retained the order of u~,; in other words, if we label the edges of the ordering 0,' as el. e2, ... ~ eu 9 where U = (n), then ordering the edges of the complete graph

2

on the perturbed point set S' can produce the list el', e2 e U'.

Consider now the formation of the matching M' by applying the greedy algorithm to the perturbed set Y. As the algorithm proceeds, it first considers the marked candidate edges el'l, e12, . . . , el'm(al) of block one. We claim that M' will contain only the marked edges of block one. To prove this claim, observe that changing the length of the edge elj by perturbing one of its endpoints can also change the length of only the remaining n 2 candidate edges that are incident with the perturbed point. This is of no consequence,

WORSTCASE GREEDY MATCHINGS IN UNIT dCUBE 797

however, since the order of marked edges has been retained by the perturbation and since the inclusion of the edge el'j into the matching M' forces all other edges incident with the endpoints of el'j to be ineligible for inclusion in M' for the remainder of the algorithm. Using an identical argument inductively on the k blocks proves that the edge e' is in the matching M' if and only if c EE M, which is Property (i).

In the preceding section, we noted that the ordering o,' produced the matching M regardless of the order of the set of unmarked edges of the o~,, where 1 :5 is k. Since our perturbation retains the order of marked edges of G', we see that this property is preserved. In other words, arbitrarily reordering the unmarked edges of each di after carrying out the perturbation still produces the matching M'. Since ordering the edges of the complete graph on the perturbed point set S' can produce a list with ties occurring only among unmarked edges (we have perturbed the length of each marked edge so that no ties amongst the marked edges remain), we see that the total weight of any greedy matching of S' equals L.(M'). Property (ii) is a consequence of this fact.

To show that Property (iii) is satisfied, we represent the difference in total weights of M and M' by

k

L~(M) L~(M') eij eL (8,5)

The first sum is over blocks, and the second is over marked edges of the i'th block. That the perturbation parameter 8(E) can be chosen so that Property (iii) holds is immediate from this representation because, as 5(E) >0, we have that 1 e',.,. 1 , 1 eij 1. For an explicit choice Of S(E) for a given E that satisfies Property (iii) of Lemma 7.1, one can consult Snyder [111.

This concludes the proof of Lemma 7.1, which makes brief the proof of Theorem 2 in the next section.

9. A WORSTCASE EQUIVALENCE OF GREEDY MATCHINGS

We now commence with the proof of Theorem 2, which is made simple by Lemma 7. 1. We first define S = {X 1, X2, . . . , x,} to be a set of n points in [0, l]' having a greedy matching M of total weight L. (M) ~t p. (n) E, for E > 0. Let S' be a perturbation of S, and let a' and M' be an ordering of the edges of the complete graph on S' and the greedy matching obtained by using this ordering, respectively, such that properties (i)(iii) of Lemma 7.1 are satisfied.

For convenience, we recall the definition of fig(n):

fig(n) = sup min{L,(g): G is a greedy matching of Vn}. (9.1)

V,C:[0,lld G

IV~l=n

Now, by Property (ii) of Lemma 7.1, by S' we have exhibited a set of n points

798 SMER AND STEELE

in [0, l]' whose minimumweight greedy matching is of total weight L.(M'), so, by the definition of fi,,(n), we have that L.(M):5 P,(n). Since M and M' satisfy Property (iii) of Lemma 7.1, we are guaranteed that

0:s L~(M) L.(M):5 pg(n) E L~(M') < E; (9.2)

hence, we obtain

pg(n) 2E < fig(n) (9.3)

for all n 1. Since this inequality is true for all E> 0 and since fi. (n) is majorized by p, (n), the two sequences must be equal for all n ~t 1, and the theorem is proved.

10. CONCLUSIONS

The asymptotic result given in Theorem 1 combined with the result of Theorem 2 raises some interesting issues concerning the effectiveness of the greedy heuristic for matchings. First recall the bound of Reingold and Tarjan [101, namely, if L(GM) and L(M*) are the weights of a greedy matching and a minimal matching of n points in the plane, respectively, where n is even, then

L (GM) 4 n10921.5 (10.1)

L(M*) 3

In view of (10.1) it is natural to suspect that the greedy algorithm is grossly inefficient. Perhaps we now have reasons to reconsider.

We note first that in addition to establishing the bound of Inequality (10.1), Reingold and Tarjan [101 constructed a collinear configuration of points that attained the worstcase ratio expressed by the righthand side of (10.1). This example point configuration also gives us the ratio of Inequality (10.1) if we pit the worstcase greedy matching algorithm that we have associated with pg against the optimal greedy algorithm that we have associated with fig. Theorem 2 tells us that the discrepancy expressed by respectively exchanging p,,(n) and fig(n) for L (GM) and L (M*) in (10.1) does not exist in the worst case.

We also note that the following theorem of Snyder [111, which is the analog of Theorem 1 for minimal matchings, holds for the sequence

p,,,(n.)= sup min 1 jej':Mis a matching of V,, (10.2)

v~c[O,il' M LEEM

IV~l=n

Theorem. there exists a constant pm depending on the dimension d ~~ 2 and the edgeweighting function w (e) = 1 e 1% where 0 < a < d, such that as n

(da)ld

pm(n) Pmn (10.3)

WORSTCASE GREEDY MATCHINGS IN UNIT dCUBE 799

Using Theorem 1 and this theorem along with bounds on the weights of worstcase minimal and greedy matchings in Avis [11, Supowit et al. [13], and Iri et al. [71, one can observe that the ratio of worstcase greedy to worstcase minimal matchings is a constant as n>, and this constant is at most two Snyder [111.

Hence, in the worst case, in which our matchings are of sufficiently great weight, the greedy algorithm turns out to be equivalent to any minimal matching algorithm, up to a constant factor. This is a reflection of the fact that for Euclidean matchings in bounded regions such as the unit cube, the example of Reingold and Tarjan [101 has a greedy matching that is bounded in weight and a minimal matching whose weight goes to zero as n > . In other words, the example of Reingold and Tarjan is not necessarily one for which the greedy algorithm performs poorly, but it is one for which the weight of a minimal matching is unusually small.

For this reason, Avis [11 has argued that a more reasonable measure of relative performance of matching algorithms in the context of applications such as mechanical plotters is the absolute difference L(GM) L(M*), since this quantity is proportional to the wasted movement of the pen while it is in the "up" position.

Our worstcase scenarios and the example of Reingold and Tarjan [101 represent different ends of the spectrum in terms of the total weight of the heuristic matching obtained using the greedy algorithm. Because of the similarity of our worstcase growth rate to the probabilistic growth rate of Avis et al. [31, it appears that, except for under unusual circumstances, we can expect the greedy algorithm to perform relatively well. It is an open problem to determine exactly what these circumstances are.

Another open problem that has received a great deal of attention is the determination of constants like fig and pm. Although our methods do not lead us down this path, we do note that Theorem 2 gives us a concise necessary condition for a worstcase point configuration for the optimal greedy algorithm: If any set of n points attains the worstcase weight of a greedy matching using an optimal greedy algorithm, then it must produce greedy matchings of identical weight, regardless of how ties are broken. This easily investigated condition may be of assistance in determining worstcase arrangements of points.

ACKNOWLEDGMENT

The exposition in this paper has profited from the assistance of an unusually careful referee. The proof of Lemma 4.2 is based on a suggestion of the referee, shortening a proof that appeared in an earlier version.

This work was supported in part by grants NSF DMS8812868, NSA M13A 90489H2034, AFOSR890301, DAAL0389G0092, and Georgetown University 1989 Summer Academic Grant.

800 SNYDER AND STEELE

REFERENCES

[11 D. Avis, Worst case bounds for the Euclidean matching problem. Comput. Math. Appl. 7 (1981) 251257.

[2] D. Avis, A survey of heuristics for the weighted matching problem. Networks 13 (1983) 475493.

[31 D. Avis, B. Davis, and J. M. Steele, Probabilistic analysis of a greedy heuristic for Euclidean matching. Probability Eng. Informational Sci. 2 (1988) 143156.

[41 N. Christofides, Worst case analysis of a new heuristic for the traveling salesman problem. Technical Report, G.S.I.A., CarnegieMellon University, Pittsburgh, PA (1976).

[5] J. Edmonds, Paths, trees, and flowers. Can. J. Math. 17 (1965) 449467.

[61 E. Gabow, An efficient implementation of Edmonds' algorithm for maximum matching on graphs. J. Assoc. Comp. Mach. 23, (1976) ~2l234.

[7] M. Iri, K. Murota, and S. Matsui, Heuristics for planar minimumweight perfect matchings. Networks 13 (1983) 6792.

[81 M. Iri and A. Taguchi, The determination of the pen movement of an XYplotter and its computational complexity (in Japanese). Proceedings of the Spring Conference of the O.R. Society of Japan, P8 (1980) 204205.

[9] E. L. Lawler, Combinatorial Optimization: Networks and Matroids, Holt, Rinchart, and Winston, New York (1976).

1101 E. M. Reingold and R. E. Tarjan, On a greedy heuristic for complete matching. SIAM J. Comput. 10 (1981) 676681.

[11] T. L. Snyder, Asymptotic worst case lengths in some problems from classical computational geometry and combinatorial optimization. Ph.D. Thesis, Princeton University, Prineeton, NJ (1987).

[121 J. M. Steele and T. L. Snyder, Worst case growth rates of some classical problems of combinatorial optimization. SIAM J. Comput. 18 (1989) 278287.

[131 K. J. Supowit, E. M. Reingold, and D. A. Plaisted, The traveling salesman problem and minimum matchings in the unit square. SIAM J. Comput. 12 (1983) 144156.

Received May 1988

Accepted December 1989

1