Operations Research Letters 8 (1989) 137142 June 1989

NorthHolland

COST OF SEQUENTIAL CONNECTION FOR POINTS IN SPACE

J. Michael STEELE*

Princefon University, Program in Statistics and Operations Research, School of Engineering and Applied Science,

Prineeton, NJ 08544, USA

Received July 1988

Revised December 1988

A bound is given for the cost of the spanning tree produced by the sequential minimal insertion procedure as applied to n points in the unit dcube. The technique developed is reasonably general and can be applied to several other problems of computational geometry, including the nearest neighbor heuristic for the traveling salesman problem. Attention is also given to bounding the sum of the powers of the edge lengths of sequentially constructed trees and paths. Examples illustrate that the bounds obtained are of best possible order as a function of the number of points.

spanning tree * minimal spanning tree * nearest neighbor heuristic * sequential insertion * traveling salesman problem

1. Intiroduction has been expended to determine good values of

the constants ad and ad'. For example, Verbluns

This article develops a technique to provide ky (1951) showed % <, (2.8)1/2, and Few (1955)

sequentially constrained analogues to some classi improved this to % < 21/2 . The first bounds on ad

cal inequalities of combinatorial optimization in for general d are given by Few (1955), and these

Euclidean space. In particular, the technique pro have been improved by Moran (1984), Goldstein

vides sequential analogues to the following: and Reingold (1988), and Goddyn (1988). If one

(1) If S Xl, X2 Xn } C: 10, Ild and 1 xi does not press for the best values of a and a'

d d5

xj 1, denotes the Euclidean distance between x, bounds (1.1) and (1.2) are both easy consequences

and xj, then TSP(S), the minimum cost of a path of the pigeonhole bound;

through the points of S, satisfies There is a constant #d > 0 such that for any

1)1d Melement set

TSP(S) 1< adn(d (1.1)

where ad is a constant that depends only on the S Xl, X2 Xm } C: [0, 11 d'

dimension d. the function defined by

(2) If MST(S) denotes the cost of the minimal

Euclidean length spanning tree of S = {xl, 9. (xl, X2,, XJ = Min 1 Xj Xk

X2,. X0 C [0, 11d, then satisfies

(d 1)1d

MST(S) < a'n (1.2)

d 11d

where the (smaller) constant a' again depends 9. ( Xl, X2, X. ) 1< Pdm . (1.3)

only on the dimension d. d The problem that motivates this article is the

Inequalities (1.1) and (1.2) have been the focus derivation of an analogue to (1.2) for the heuristic

of many investigations and considerable ingenuity construction of a spanning tree of S= { xl,

X2., XJ that is based on minimal sequential

insertions. It turns out that one has the precise

analogue of (1.2), and, moreover, an a priori in

Research supported in part by National Science Foundation equality analogous to (1.3) is valid in an average

Grant no. DMS88128668. sense. Before developing that a priori inequality

01676377/89/$3.50 0 1989, Elsevier Science Publishers B.V. (NorthHolland)

137

Volume 8, Number 3 OPERATIONS RESEARCH LETTERS June 1989

(which is somewhat technical in appearance), we Saxe (1980). The final section points out some

examine some of its consequences. open problems.

Given a fixed sequence S = (x,, X,__ XJ of

points in R d, then we can build a spanning tree

for S by sequentially joining x, to the tree formed 2. Sequential insertion trees

by {x,, x2_. , x,,} for 2 < i ~< n. If r, =

r(xi; Xl, X2 xi 1) denotes the shortest dis The following lemma provides the basis for

tance between xi and a point of {x,, X2, many inequalities involving ri. The key point is

i.e.

that the righthand side of (2.1) does not depend

r, = r(Xi; X,, X2, Xi1) = min 1 Xi Xj on n and the lefthand side is almost the distribu

jA_

then r, is the minimal cost of joining xi to a vertex given to obtaining a simple bound than to obtain

of a spanning tree of (Xl, X2,.x,,}. Natu ing the sharpest possible constant. Although, the

rally, r(xi; Xl, X2., Xi1) is at least as large as constant is irrelevant for most applications, it

gi(Xl, X2, Xi) for each 1 ~< i < n, and by easy would be of interest to know if the righthand side

examples one can show r can be much greater of (2.1) could be replaced by c d for some c not

than g. The cost of the minimal insertion tree depending on d.

constructed by sequentially connecting x, to the

tree already constructed on {xl, X2, .' Xii} is Lenuna 1. If {xl, X2,. ,XJ C [0,1]d and

given by the sum of the r,, 1 < i < n, and will be r,= min Jxjxl forl

are all at least as large as g, it turns out that a

bound is available for the sum of the ri that is not then for any 0 < x < oo we have

substantially larger than the bound available for d/2

MST(S). Specifically, there is a constant, yd, de r," <, 8dd (2.1)

pending only on the dimension d such that for all x~ri<2x

n > 1, we have

Proof. Let C = (i: x < ri < 2x} and for each i E= C

MIT(S) r(xi; Xl, X2,. Xi1) let Bi be a ball of radius 4r, with center x,. We

1 < i

ydn i

Each of the inequalities (1.1), (1.2), and (1.5) rj <, 2x gives us

says that the average cost of each edge is of order 1 xi xi ZW, + ri) < X. (2.2)

n 11d. The twist in (1.5) is that one maintains the

same order of magnitude of average cost even But, by definition of rj we have 1 x, xj 1 >, rj for

when one is required to insert the xi into the tree all i

The next section develops a general method for 1 x, x, 1 >, x. Since 1 xi xj x contradicts

obtaining bounds in'sequential problems, and, in (2.2), we have Bi n B, = 0.

particular, a proof is given of (1.5). The third Now, since all of the balls Bi are contained in a

section modifies the basic method to provide sphere with radius M1/', the fact that they are

bounds for the nearest neighbor heuristic for the disjoint tells us the sum of their volumes is

traveling salesman problem. In Section 4, the sum bounded by the volume of the sphere of radius

of the dth powers of the edges of the minimal 2d 1/2 . Thus, if 4Wd denotes the volume of the unit

d

insertion tree are contrasted with those of the ball in R we have the bound

minimal spanning tree and some surprising dif ddd12

ferences emerge. The fifth section draws out the Y1 Wd rid4 1< Wd 2

connections and distinctions between the present ies

technique and a related method of Bentley and from which (2.1) follows. E]

138

Volume 8, Number 3 OPERATIONS RESEARCH LETTERS June 1989

Information is easily extracted from (2.1) if one Proof. Setting IP(x) = 1 1x and [a, 81 n Ild,

is guided by the theory of linear programming dll'l in (2.3) gives us

where duality tells us that essentially all inequali gddl/2 1/2) 1 11d

ties that can be inferred from (2.2) are obtained by ri' < 2 (log(d logGn W

taking linear combinations of (2.2) with positive n11d < r, (2.6)

coefficients. This principle is made explicit by

inequality (2.3) and its applications to the proof of and the small values of ri d are trivial to bound

(2.5) and (1.5). d

The restriction in Lemma 2 to nonincreasing ri (2.7)

functions is convenient, but inessential. Similar ri < n 1d

inequalities can be derived for t that satisfy less so (2.5) follows from (2.6). E]

taxing regularity assumptions.

Inequality (2.5) can be used with H61der's in

Lemma 2. f is a positive and nonincreasing func equality to provide a bound on the basic sum of

tion on the interval (0, d 1/21, then for any 0 < a < (1.5), Er,, but in this instance the bound one

1/2

d obtains is not the best possible. Specifically,

r,"' t(r,) < 2 . 81d1/2 P t(x) dx. applying H61der's inequality with p = d and

q = dl(d 1) to E"=, ri 1, we find

(2.3) n )11d.

E r, <

cild (dl)ld

d n (log n (2.8)

Proof. By (2.1) we have

ddd12, This bound falls short of the desired inequality

rdI(x < ri < 2x) < 8 (1.5) by a factor of (log n )11d; so to prove (1.5),

is we use (2.3) directly. Letting P(x) = 11x d and

where I(x < r, < 2x) = I('r, < x < ri) is the indi 11d 1/21

1 (a, p)= [n , d in (2.3), we bound the sum

cator function. If we multiply by *(x) and in of the larger r,,

tegrate over 1.1a P], we find

rd 4dd12 Ild ri < 2 .8 ddd12 (d 1)

f dx < 8 f ~k(x) dx. r,;" ',

a ~~ r,:Q 0 r,12 a/2 ., (2.4) . {2 d 'n (d 1)1d d (d 1 1)/2 (2.9)

Since * is nonincreasing, the integrand on the To handle the smaller ri we again rely on the

lefthand side of (2.4) is bounded from below by trivial bound

11d= (dl)ld

t(rj), so r,.<_n.n n (2.10)

l^r,)r,12 < f ri *(x) dx, rjn~lld

r,12 and from (2.9) and (2.10) we conclude that (1.5) is

and (2.3) follows from (2.4). E] proved with yd = 1 + 2 4ddd/2 1(d 1).

Our first application of (2.3) is to bound the

sum of the dth powers of the r,. In Section 3 it 3. Sharpness of bounds for AW's and MSrs

will be proved that this bound cannot be essen

tially sharpened, even though we show that the The presence of the logarithmic factor in (2.5)

minimal spanning tree satisfies an essentially and the fact that it could be removed from (2.8)

stronger bound. might lead one to question if (2.5) is sharp. Suspi

cions may be further raised by noting that Gilbert

Lemma 3. There is a constant Cd depending only on and Pollak (1968) showed for d = 2 that E 1 e j2 is

d such that for all n >, 2, uniformly bounded for any minimal spanning tree

n of { xl, x2, , x n }c: [0,1]`. It will be proved

r id < Cd log n. (2.5) shortly that (2.5) is nevertheless sharp, but first it

is instructive to look more deeply at the inequali

139

Volume 8, Number 3 OPERATIONS RESEARCH LETTERS June 1989

ties that hold for sum of powers of edge lengths of Pythagorean Lemma. Given n points in a right

minimal spanning trees. triangle, there is a path through the points such that

The result of Gilbert and Pollak can be ex the sum of the squares of the edges on the path is

tended to arbitrary dimensions by using some bounded by the square of the hypotenuse.

recent results from the theory of spacefiffing

curves. In fact, the proof one obtains for general This lemma can be viewed as an adjunct to the

d > 2 is substantially simpler than the barehanded Pythagorean theorem, and, when n = 3 and the

proof given by Gilbert and Pollak in d = 2. points are the vertices of the triangle, the lemma

We will make use of a theorem of Milne (1980) even reduces to that result. For the details of the

that says there is a function f from [0,11 onto proof of the lemma, one can consult Newman

[0,11d which is Lipschitzian of order 11d, i.e. (1982, p.8); but, if we gloss over some small

points, it is easy to sketch the idea. The first trick

I f(X)_f(y) I

for a constant c > 0. To apply this to the bound the hypotenuse can be joined by a path with the

ing of edge weights of a minimal spanning tree, we required bound. Next, drop a perpendicular from

suppose V,= {Xl, X21 , Xn) C: [0,11d and let T the rightangled vertex to the hypotenuse, and add

be a minimal spanning tree of V Since f is the point that was at the right angle to each of the

su~ective we can choose {yl, y2, , yJ in [0,11 sets of points in the two smaller triangles created

such that f(y) xi. Next, we order the yi to by the bisection. One then applies the induction

given Y(I) < Y(2) * * * < y(,,) and define a new hypothesis and the Pythagorean theorem to com

suboptimal spanning tree T. on V,, by choosing plete the proof.

the n 1 edges (f(y(i)), f(y(i,,))), 1 < i < n. By The fact inequality (2.5) of Lemma 3 cannot be

the optimality of T and the Lipschitz property of essentially improved is a consequence of the fol

f, we find lowing lernma that also shows that (1.3) is essen

tially sharp. As before, Wd denotes the volume of

1 e Id'< E leld the ball of unit radius in R d.

eeT e c= T.

11d

n1 Lemma 4. For each d >, 1, the constant d W,7

1 f (Y(i)) Ay(i+ 1) d has the property that for any { xl, X2 Xn } C

[0,11d there is an Xn11 E= [0,1]d such that 1 x,

n1 X.+i 1 >, 8dn 11d for all 1 < i < n.

d _y(i+l) 1

c Y(i) Cd. (3.2) Proof. If B, is a ball of radius 8dn11d and Center

X,, then by setting A = U,n_ 1Bi we find the com

In comparison with the complexity of the proof plementary set A' has measure at least 1. We can

of Gilbert and Pollak (1968) in d = 2, this proof of thus take Xn I l to be any point in Ac. 13

the uniform boundedness of E 1 e 1 d is almost ef

fortless. All of the geometry of [0,11d has been Returning to (2.5) we can use Lemma 4 sequen

compressed into Ahe existence of the Lipschitz tially to construct {xi, 1 < i < oo} so that for all

spacefilling curve. The use of spacefilling curves to i >, 2,

provide heuristics for a variety of optimization r, > 11d

problems is surveyed in Bartholdi and Platzinan ' 8d(, _ 1) (3.3)

(1988). 1d From (3.3) we see that the order of the bounds

Before leaving the problem of bounding E j e (1.5), (2.1), and (2.5) cannot be improved.

for minimal spanning trees we should note a fact

that makes it possible to give an easy, new proof

of the Gilbert and Pollak result in d = 2 that does 4. Nearest neighbor traveling salesman path

not require the relatively sophisticated machinery

of spacefWing curves. The key observation is given Consider the path through {Xl, X2, . Xn } C

in the following. [0,1]d that is obtained by starting at xl and subse

140

Volume 8, Number 3 OPERATIONS RESEARCH LETTERS June 1989

quently going to the nearest unvisited point. If we The constants in (4.5) and (4.6) are exactly the

let xj,, Xj2' ' ',,j, denote the ordering of the points same as those in (2.5) and (2.11).

on the path, then the n 1 values defined by

7, nfint IXi,_Xk I: Xk(5~= tXjl, X,, Xj}, 5. BentleySaxe method

< i < n, (4.1)

are the lengths of the edges in the path. Naturally, The method developed in the preceding section

we also have has a close relation to a method developed earlier

in Bentley and Saxe (1980). Two immediate (but

Fi xj, xj,,, 1, 1 < i < n. (4.2) inessential) distinctions between the BentleySaxe

Although the F, are computationally and con method and that developed here are: (1) the Bent

ceptually more difficult than the ri of Section 2, leySaxe method was developed only in d = 2;

we shall find inequalities analogous to (1.5), (2.1), and (2) the BentleySaxe method focused on the

(2.3), (2.5) and (2.7). Since each of these inequali overall cost of heuristic tours and paths rather

ties are consequences of (2.1), we only need to than the cost of sequential connection.

prove the analogue of that one bound. Even in A more important distinction is that Bentley

this step, one sees close parallels to the proof of and Saxe base their bounds is the notion of a

compatible kcircling. If Cl, C2_., C,,, are m disks

Lemma 1. in R 2 , k>,2, and cent(C) denotes the center of

Lemma 5. If S= {Xl, X21 X0 C [0,11d and the disk C, then we say { Ci: 1 <, i < m} is a com

Fi, 1 < i < n, are the edge lengths of the nearest patible kcircling provided

neighbor path, then for any 0 < x < oc we have cent(Ci) E= [0,112, (5.1a)

pd~

, < ddd12.

57 8 (4.3) aCi n [0,112 * 0, (5.1b)

x ~Qri<2x and

Proof. Let C = {i: x < Fi < 2x} and for each i EE C for any ll

the j, are defined as in (4.1). As before, we claim cent(c,,) e nk IC. fails.

that B., n B, 0 for all s < t and we argue by The main lemma of Bentley and Saxe says that for

contradiction. If BnB,00, then any k >, 2 and any kcompatible circling

xi, xi, + P,) < X. (44) {Cl, C,

...... C.} satisfying radius (C,) >, a, for all

Since s < t, we see j, tt {jj, j2, , j, }, so (4.4) 1 < i ~<, m, one has

and (4.1) imply F, < x. But P, < x contradicts the m + 2).

assumption that s . E= C, so we conclude B., n B, = 0 radius(C) < 4(k 1)(1 )/(1Ta (5.2)

for all s, t contained in C. The remainder of the 1

proof is exactly as in Lemma 1. 13 Bentley and Saxe further show that if one places

a circle 6, of radius F, about xj, where

Xil' Xj2~ ~ Xiis the nearest neighbor tour of

It remains only to summarize some conse {X,, X21 1 Xj C [0,1]2 described in Section 4,

quences of (4.3). then S = { e,: 1 < i < n } is a 3compatible cir

cling. Since a subset of a compatible circling is

Corollary. For all n >, 1, compatible, Bentley and Saxe apply (5.2) to ap

n propriately chosen subsets of S to obtain

pd

i < Cd log n, (4.5) Fi < 2~2n/1T + 0(log n) (5.3)

1

and and

n

(d 1)1d p2,

< 1,in (4.6) i < 8 log n + 0(1), (5.4)

1.Qi

1

Volume 8, Number 3 OPERATIONS RESEARCH LETTERS June 1989

where the implied Constants of the 0terms of theory of subadditive Euclidean functionals. For a

(5.3) and (5.4) can be made explicit. survey that addresses both probabifistic and de

terministic issues, one can consult Steele (1988)

where other open problems are also given.

6. Concluding remarks and open problems

We know by (2.5) and (4.6) that if E is the set Note added in proof

of edges for the minimal insertion tree or the

nearest neighbor tour we have that The Pythagorean Lemma of Section 3 is an

E le 1 d < C log n. (6.1) unpublished result of R. Gomory established in

eE=E 1966. It is cited by R. Adler in his comments on a

Moreover, by Lemina 4 one cannot essentially paper of S. Kakutani (see Collected Works of S.

improve (6.1) despite the fact that by the argu Kakutani, Vol. II, R.R. Kallman (ed.), Birkhauser,

ment of (3.1)(3.2) we have the stronger bound Boston, 1986, p. 444).

57 1 e 1 d < C (6.2) Also the argument in (3.2) is cited by Adler in

eEE d = 2 as due to Kakutani (1966, unpublished).

for the minimal spanning tree.

One intriguing question that is left open is

whether one has the analogue of (6.2) for the References

minimal travelling salesman problem. To see the

subtlety of this problem one first has to note that J.J. Bartholdi and L.K. Platzman (1988), "Heuristics based on

the argument in (3.1)(3.2) does not apply. The spacefilling curves for combinatorial problems in Euchdean

crucial feature of the MST is that the trees one space", Management Sc. 34, 291305.

1 d LL. Bentley and J.13. Saxe (1980), "An analysis of two heuris

obtains are the same for the weight function 1 e tics for the Euclidean traveling salesman problem% Pro

as for the original weights 1 e 1. This feature no ceedings of the Eighteenth Allerton Conference on Communi

longer holds for the travefing salesman problem so cation, Control, and Computing, 4149, University of Il

a new approach is required. One source of help linois, UrbanaChampaign, IL.

might be in Rosenkrantz, Steams and Lewis (1977), L. Few (1955), "The shortest path and the shortest road

through n points in a region", Mathematika 2, 141144.

where one finds several results that complement E.N. Gilbert and H.O. Pollak (1968), 11Steiner minimal trees",

the inequalities of this article. SIAM J. Appl, Math. 16,129.

To motivate the second problem, first define L. Goddyn (1988), Personal communication.

the sequence g(n) by A.S. Goldstein and E.M. Reingold (1988), "Improved bounds

on the traveling salesman problem in the unit cube", Tech

tt (n) = max {MIT(S): S= { xl, X2 P Xn nical Report, Department of Computer Science, University

d }. of Illinois, UrbanaChampaign.

(6.3) S.C. Milne (1980), " Peano curves and smoothness of functions%

(d 1)1d Adv. in Math. 35, 129157.

We have seen that g(n)n is bounded. Is it S. Moran (1984), " On the length of optimal TSP circuits in sets

true that g(n)n (dl)ld converges as n oo? The of bounded diameter", J. Combin. Theory B 37, 113141.

analogous question for minimal spanning trees D.J. Newman (1982), A Problem Seminar, SpringerVerlag,

and traveling salesman tours has been answered in New York.

D.J. Rosenkrantz, R.E. Stearns and P.M. Lewis (1977), "An

the affirmative by Steele and Snyder (1988), but analysis of several heuristics for the traveling salesman

the fact that g(n) is not determined by an opti problem% SIAMJ. Comput. 6,563581.

mality property makes the behavior of g(n) more J.M. Steele (1988), Probabilistic and worst case analyses of

subtle. classical problems of combinatorial optimization in

The inequalities addressed in this article all Euclidean space", Technical Report, Program in Statistics

and Operations Research, Princeton University.

concerned deterministic point sets, but they are LM. Steele and T.L. Snyder (1988), "Worst case growth rates

closely related in spirit (and analytical content) to of some classical problems of combinatorial optimization%

a number of probabilistic topics especially the to appear in SIAM J. Comput.

142