Please see PDF version

Operations Research Letters 8 (1989) 137142 June 1989
NorthHolland
COST OF SEQUENTIAL CONNECTION FOR POINTS IN SPACE
J. Michael STEELE*
Princefon University, Program in Statistics and Operations Research, School of Engineering and Applied Science,
Prineeton, NJ 08544, USA

Received July 1988
Revised December 1988

A bound is given for the cost of the spanning tree produced by the sequential minimal insertion procedure as applied to n points in the unit dcube. The technique developed is reasonably general and can be applied to several other problems of computational geometry, including the nearest neighbor heuristic for the traveling salesman problem. Attention is also given to bounding the sum of the powers of the edge lengths of sequentially constructed trees and paths. Examples illustrate that the bounds obtained are of best possible order as a function of the number of points.

spanning tree * minimal spanning tree * nearest neighbor heuristic * sequential insertion * traveling salesman problem

1. Intiroduction has been expended to determine good values of
the constants ad and ad'. For example, Verbluns
This article develops a technique to provide ky (1951) showed % <, (2.8)1/2, and Few (1955)
sequentially constrained analogues to some classi improved this to % < 21/2 . The first bounds on ad
cal inequalities of combinatorial optimization in for general d are given by Few (1955), and these
Euclidean space. In particular, the technique pro have been improved by Moran (1984), Goldstein
vides sequential analogues to the following: and Reingold (1988), and Goddyn (1988). If one
(1) If S Xl, X2 Xn } C: 10, Ild and 1 xi does not press for the best values of a and a'
d d5
xj 1, denotes the Euclidean distance between x, bounds (1.1) and (1.2) are both easy consequences
and xj, then TSP(S), the minimum cost of a path of the pigeonhole bound;
through the points of S, satisfies There is a constant #d > 0 such that for any
1)1d Melement set
TSP(S) 1< adn(d (1.1)
where ad is a constant that depends only on the S Xl, X2 Xm } C: [0, 11 d'
dimension d. the function defined by
(2) If MST(S) denotes the cost of the minimal
Euclidean length spanning tree of S = {xl, 9. (xl, X2,, XJ = Min 1 Xj Xk
X2,. X0 C [0, 11d, then satisfies
(d 1)1d
MST(S) < a'n (1.2)
d 11d

where the (smaller) constant a' again depends 9. ( Xl, X2, X. ) 1< Pdm . (1.3)
only on the dimension d. d The problem that motivates this article is the
Inequalities (1.1) and (1.2) have been the focus derivation of an analogue to (1.2) for the heuristic
of many investigations and considerable ingenuity construction of a spanning tree of S= { xl,
X2., XJ that is based on minimal sequential
insertions. It turns out that one has the precise
analogue of (1.2), and, moreover, an a priori in
Research supported in part by National Science Foundation equality analogous to (1.3) is valid in an average
Grant no. DMS88128668. sense. Before developing that a priori inequality

01676377/89/$3.50 0 1989, Elsevier Science Publishers B.V. (NorthHolland)

137

Volume 8, Number 3 OPERATIONS RESEARCH LETTERS June 1989

(which is somewhat technical in appearance), we Saxe (1980). The final section points out some
examine some of its consequences. open problems.
Given a fixed sequence S = (x,, X,__ XJ of
points in R d, then we can build a spanning tree
for S by sequentially joining x, to the tree formed 2. Sequential insertion trees
by {x,, x2_. , x,,} for 2 < i ~< n. If r, =
r(xi; Xl, X2 xi 1) denotes the shortest dis The following lemma provides the basis for
tance between xi and a point of {x,, X2, many inequalities involving ri. The key point is
i.e.
that the righthand side of (2.1) does not depend
r, = r(Xi; X,, X2, Xi1) = min 1 Xi Xj on n and the lefthand side is almost the distribu
jA_stant is given explicitly, more attention has been
then r, is the minimal cost of joining xi to a vertex given to obtaining a simple bound than to obtain
of a spanning tree of (Xl, X2,.x,,}. Natu ing the sharpest possible constant. Although, the
rally, r(xi; Xl, X2., Xi1) is at least as large as constant is irrelevant for most applications, it
gi(Xl, X2, Xi) for each 1 ~< i < n, and by easy would be of interest to know if the righthand side
examples one can show r can be much greater of (2.1) could be replaced by c d for some c not
than g. The cost of the minimal insertion tree depending on d.
constructed by sequentially connecting x, to the
tree already constructed on {xl, X2, .' Xii} is Lenuna 1. If {xl, X2,. ,XJ C [0,1]d and
given by the sum of the r,, 1 < i < n, and will be r,= min Jxjxl forldenoted by MIT(S). Despite the fact that the ri
are all at least as large as g, it turns out that a
bound is available for the sum of the ri that is not then for any 0 < x < oo we have
substantially larger than the bound available for d/2
MST(S). Specifically, there is a constant, yd, de r," <, 8dd (2.1)
pending only on the dimension d such that for all x~ri<2x
n > 1, we have
Proof. Let C = (i: x < ri < 2x} and for each i E= C
MIT(S) r(xi; Xl, X2,. Xi1) let Bi be a ball of radius 4r, with center x,. We
1 < i (d 1)1d will argue by contradiction that B, n B. 0 for all
ydn i
Each of the inequalities (1.1), (1.2), and (1.5) rj <, 2x gives us
says that the average cost of each edge is of order 1 xi xi ZW, + ri) < X. (2.2)
n 11d. The twist in (1.5) is that one maintains the
same order of magnitude of average cost even But, by definition of rj we have 1 x, xj 1 >, rj for
when one is required to insert the xi into the tree all i in a specified order. mands in (2.1) we have x <, rj, so we also see
The next section develops a general method for 1 x, x, 1 >, x. Since 1 xi xj x contradicts
obtaining bounds in'sequential problems, and, in (2.2), we have Bi n B, = 0.
particular, a proof is given of (1.5). The third Now, since all of the balls Bi are contained in a
section modifies the basic method to provide sphere with radius M1/', the fact that they are
bounds for the nearest neighbor heuristic for the disjoint tells us the sum of their volumes is
traveling salesman problem. In Section 4, the sum bounded by the volume of the sphere of radius
of the dth powers of the edges of the minimal 2d 1/2 . Thus, if 4Wd denotes the volume of the unit
d
insertion tree are contrasted with those of the ball in R we have the bound
minimal spanning tree and some surprising dif ddd12
ferences emerge. The fifth section draws out the Y1 Wd rid4 1< Wd 2
connections and distinctions between the present ies
technique and a related method of Bentley and from which (2.1) follows. E]

138

Volume 8, Number 3 OPERATIONS RESEARCH LETTERS June 1989

Information is easily extracted from (2.1) if one Proof. Setting IP(x) = 1 1x and [a, 81 n Ild,
is guided by the theory of linear programming dll'l in (2.3) gives us
where duality tells us that essentially all inequali gddl/2 1/2) 1 11d
ties that can be inferred from (2.2) are obtained by ri' < 2 (log(d logGn W
taking linear combinations of (2.2) with positive n11d < r, (2.6)
coefficients. This principle is made explicit by
inequality (2.3) and its applications to the proof of and the small values of ri d are trivial to bound
(2.5) and (1.5). d
The restriction in Lemma 2 to nonincreasing ri (2.7)
functions is convenient, but inessential. Similar ri < n 1d
inequalities can be derived for t that satisfy less so (2.5) follows from (2.6). E]
taxing regularity assumptions.
Inequality (2.5) can be used with H61der's in
Lemma 2. f is a positive and nonincreasing func equality to provide a bound on the basic sum of
tion on the interval (0, d 1/21, then for any 0 < a < (1.5), Er,, but in this instance the bound one
1/2
d obtains is not the best possible. Specifically,
r,"' t(r,) < 2 . 81d1/2 P t(x) dx. applying H61der's inequality with p = d and
q = dl(d 1) to E"=, ri 1, we find
(2.3) n )11d.
E r, <
cild (dl)ld
d n (log n (2.8)
Proof. By (2.1) we have
ddd12, This bound falls short of the desired inequality
rdI(x < ri < 2x) < 8 (1.5) by a factor of (log n )11d; so to prove (1.5),
is we use (2.3) directly. Letting P(x) = 11x d and
where I(x < r, < 2x) = I('r, < x < ri) is the indi 11d 1/21
1 (a, p)= [n , d in (2.3), we bound the sum
cator function. If we multiply by *(x) and in of the larger r,,
tegrate over 1.1a P], we find
rd 4dd12 Ild ri < 2 .8 ddd12 (d 1)
f dx < 8 f ~k(x) dx. r,;" ',
a ~~ r,:Q 0 r,12 a/2 ., (2.4) . {2 d 'n (d 1)1d d (d 1 1)/2 (2.9)

Since * is nonincreasing, the integrand on the To handle the smaller ri we again rely on the
lefthand side of (2.4) is bounded from below by trivial bound
11d= (dl)ld
t(rj), so r,.<_n.n n (2.10)
l^r,)r,12 < f ri *(x) dx, rjn~lld
r,12 and from (2.9) and (2.10) we conclude that (1.5) is

and (2.3) follows from (2.4). E] proved with yd = 1 + 2 4ddd/2 1(d 1).

Our first application of (2.3) is to bound the
sum of the dth powers of the r,. In Section 3 it 3. Sharpness of bounds for AW's and MSrs
will be proved that this bound cannot be essen
tially sharpened, even though we show that the The presence of the logarithmic factor in (2.5)
minimal spanning tree satisfies an essentially and the fact that it could be removed from (2.8)
stronger bound. might lead one to question if (2.5) is sharp. Suspi
cions may be further raised by noting that Gilbert
Lemma 3. There is a constant Cd depending only on and Pollak (1968) showed for d = 2 that E 1 e j2 is
d such that for all n >, 2, uniformly bounded for any minimal spanning tree
n of { xl, x2, , x n }c: [0,1]`. It will be proved
r id < Cd log n. (2.5) shortly that (2.5) is nevertheless sharp, but first it
is instructive to look more deeply at the inequali
139

Volume 8, Number 3 OPERATIONS RESEARCH LETTERS June 1989

ties that hold for sum of powers of edge lengths of Pythagorean Lemma. Given n points in a right
minimal spanning trees. triangle, there is a path through the points such that
The result of Gilbert and Pollak can be ex the sum of the squares of the edges on the path is
tended to arbitrary dimensions by using some bounded by the square of the hypotenuse.
recent results from the theory of spacefiffing
curves. In fact, the proof one obtains for general This lemma can be viewed as an adjunct to the
d > 2 is substantially simpler than the barehanded Pythagorean theorem, and, when n = 3 and the
proof given by Gilbert and Pollak in d = 2. points are the vertices of the triangle, the lemma
We will make use of a theorem of Milne (1980) even reduces to that result. For the details of the
that says there is a function f from [0,11 onto proof of the lemma, one can consult Newman
[0,11d which is Lipschitzian of order 11d, i.e. (1982, p.8); but, if we gloss over some small
points, it is easy to sketch the idea. The first trick
I f(X)_f(y) I sis: any set of n points including the end points of
for a constant c > 0. To apply this to the bound the hypotenuse can be joined by a path with the
ing of edge weights of a minimal spanning tree, we required bound. Next, drop a perpendicular from
suppose V,= {Xl, X21 , Xn) C: [0,11d and let T the rightangled vertex to the hypotenuse, and add
be a minimal spanning tree of V Since f is the point that was at the right angle to each of the
su~ective we can choose {yl, y2, , yJ in [0,11 sets of points in the two smaller triangles created
such that f(y) xi. Next, we order the yi to by the bisection. One then applies the induction
given Y(I) < Y(2) * * * < y(,,) and define a new hypothesis and the Pythagorean theorem to com
suboptimal spanning tree T. on V,, by choosing plete the proof.
the n 1 edges (f(y(i)), f(y(i,,))), 1 < i < n. By The fact inequality (2.5) of Lemma 3 cannot be
the optimality of T and the Lipschitz property of essentially improved is a consequence of the fol
f, we find lowing lernma that also shows that (1.3) is essen
tially sharp. As before, Wd denotes the volume of
1 e Id'< E leld the ball of unit radius in R d.
eeT e c= T.
11d
n1 Lemma 4. For each d >, 1, the constant d W,7
1 f (Y(i)) Ay(i+ 1) d has the property that for any { xl, X2 Xn } C
[0,11d there is an Xn11 E= [0,1]d such that 1 x,
n1 X.+i 1 >, 8dn 11d for all 1 < i < n.
d _y(i+l) 1
c Y(i) Cd. (3.2) Proof. If B, is a ball of radius 8dn11d and Center
X,, then by setting A = U,n_ 1Bi we find the com
In comparison with the complexity of the proof plementary set A' has measure at least 1. We can
of Gilbert and Pollak (1968) in d = 2, this proof of thus take Xn I l to be any point in Ac. 13
the uniform boundedness of E 1 e 1 d is almost ef
fortless. All of the geometry of [0,11d has been Returning to (2.5) we can use Lemma 4 sequen
compressed into Ahe existence of the Lipschitz tially to construct {xi, 1 < i < oo} so that for all
spacefilling curve. The use of spacefilling curves to i >, 2,
provide heuristics for a variety of optimization r, > 11d
problems is surveyed in Bartholdi and Platzinan ' 8d(, _ 1) (3.3)
(1988). 1d From (3.3) we see that the order of the bounds
Before leaving the problem of bounding E j e (1.5), (2.1), and (2.5) cannot be improved.
for minimal spanning trees we should note a fact
that makes it possible to give an easy, new proof
of the Gilbert and Pollak result in d = 2 that does 4. Nearest neighbor traveling salesman path
not require the relatively sophisticated machinery
of spacefWing curves. The key observation is given Consider the path through {Xl, X2, . Xn } C
in the following. [0,1]d that is obtained by starting at xl and subse

140

Volume 8, Number 3 OPERATIONS RESEARCH LETTERS June 1989

quently going to the nearest unvisited point. If we The constants in (4.5) and (4.6) are exactly the
let xj,, Xj2' ' ',,j, denote the ordering of the points same as those in (2.5) and (2.11).
on the path, then the n 1 values defined by
7, nfint IXi,_Xk I: Xk(5~= tXjl, X,, Xj}, 5. BentleySaxe method
< i < n, (4.1)
are the lengths of the edges in the path. Naturally, The method developed in the preceding section
we also have has a close relation to a method developed earlier
in Bentley and Saxe (1980). Two immediate (but
Fi xj, xj,,, 1, 1 < i < n. (4.2) inessential) distinctions between the BentleySaxe
Although the F, are computationally and con method and that developed here are: (1) the Bent
ceptually more difficult than the ri of Section 2, leySaxe method was developed only in d = 2;
we shall find inequalities analogous to (1.5), (2.1), and (2) the BentleySaxe method focused on the
(2.3), (2.5) and (2.7). Since each of these inequali overall cost of heuristic tours and paths rather
ties are consequences of (2.1), we only need to than the cost of sequential connection.
prove the analogue of that one bound. Even in A more important distinction is that Bentley
this step, one sees close parallels to the proof of and Saxe base their bounds is the notion of a
compatible kcircling. If Cl, C2_., C,,, are m disks
Lemma 1. in R 2 , k>,2, and cent(C) denotes the center of
Lemma 5. If S= {Xl, X21 X0 C [0,11d and the disk C, then we say { Ci: 1 <, i < m} is a com
Fi, 1 < i < n, are the edge lengths of the nearest patible kcircling provided
neighbor path, then for any 0 < x < oc we have cent(Ci) E= [0,112, (5.1a)
pd~
, < ddd12.
57 8 (4.3) aCi n [0,112 * 0, (5.1b)
x ~Qri<2x and

Proof. Let C = {i: x < Fi < 2x} and for each i EE C for any lllet B, be a ball of radius AF, with center x,, where such that the relationship
the j, are defined as in (4.1). As before, we claim cent(c,,) e nk IC. fails.
that B., n B, 0 for all s < t and we argue by The main lemma of Bentley and Saxe says that for
contradiction. If BnB,00, then any k >, 2 and any kcompatible circling
xi, xi, + P,) < X. (44) {Cl, C,
...... C.} satisfying radius (C,) >, a, for all

Since s < t, we see j, tt {jj, j2, , j, }, so (4.4) 1 < i ~<, m, one has
and (4.1) imply F, < x. But P, < x contradicts the m + 2).
assumption that s . E= C, so we conclude B., n B, = 0 radius(C) < 4(k 1)(1 )/(1Ta (5.2)
for all s, t contained in C. The remainder of the 1
proof is exactly as in Lemma 1. 13 Bentley and Saxe further show that if one places
a circle 6, of radius F, about xj, where
Xil' Xj2~ ~ Xiis the nearest neighbor tour of
It remains only to summarize some conse {X,, X21 1 Xj C [0,1]2 described in Section 4,
quences of (4.3). then S = { e,: 1 < i < n } is a 3compatible cir
cling. Since a subset of a compatible circling is
Corollary. For all n >, 1, compatible, Bentley and Saxe apply (5.2) to ap
n propriately chosen subsets of S to obtain
pd
i < Cd log n, (4.5) Fi < 2~2n/1T + 0(log n) (5.3)
1
and and
n
(d 1)1d p2,
< 1,in (4.6) i < 8 log n + 0(1), (5.4)
1.Qi 141

1

Volume 8, Number 3 OPERATIONS RESEARCH LETTERS June 1989

where the implied Constants of the 0terms of theory of subadditive Euclidean functionals. For a
(5.3) and (5.4) can be made explicit. survey that addresses both probabifistic and de
terministic issues, one can consult Steele (1988)
where other open problems are also given.
6. Concluding remarks and open problems
We know by (2.5) and (4.6) that if E is the set Note added in proof
of edges for the minimal insertion tree or the
nearest neighbor tour we have that The Pythagorean Lemma of Section 3 is an
E le 1 d < C log n. (6.1) unpublished result of R. Gomory established in
eE=E 1966. It is cited by R. Adler in his comments on a
Moreover, by Lemina 4 one cannot essentially paper of S. Kakutani (see Collected Works of S.
improve (6.1) despite the fact that by the argu Kakutani, Vol. II, R.R. Kallman (ed.), Birkhauser,
ment of (3.1)(3.2) we have the stronger bound Boston, 1986, p. 444).
57 1 e 1 d < C (6.2) Also the argument in (3.2) is cited by Adler in
eEE d = 2 as due to Kakutani (1966, unpublished).
for the minimal spanning tree.
One intriguing question that is left open is
whether one has the analogue of (6.2) for the References
minimal travelling salesman problem. To see the
subtlety of this problem one first has to note that J.J. Bartholdi and L.K. Platzman (1988), "Heuristics based on
the argument in (3.1)(3.2) does not apply. The spacefilling curves for combinatorial problems in Euchdean
crucial feature of the MST is that the trees one space", Management Sc. 34, 291305.
1 d LL. Bentley and J.13. Saxe (1980), "An analysis of two heuris
obtains are the same for the weight function 1 e tics for the Euclidean traveling salesman problem% Pro
as for the original weights 1 e 1. This feature no ceedings of the Eighteenth Allerton Conference on Communi
longer holds for the travefing salesman problem so cation, Control, and Computing, 4149, University of Il
a new approach is required. One source of help linois, UrbanaChampaign, IL.
might be in Rosenkrantz, Steams and Lewis (1977), L. Few (1955), "The shortest path and the shortest road
through n points in a region", Mathematika 2, 141144.
where one finds several results that complement E.N. Gilbert and H.O. Pollak (1968), 11Steiner minimal trees",
the inequalities of this article. SIAM J. Appl, Math. 16,129.
To motivate the second problem, first define L. Goddyn (1988), Personal communication.
the sequence g(n) by A.S. Goldstein and E.M. Reingold (1988), "Improved bounds
on the traveling salesman problem in the unit cube", Tech
tt (n) = max {MIT(S): S= { xl, X2 P Xn nical Report, Department of Computer Science, University
d }. of Illinois, UrbanaChampaign.
(6.3) S.C. Milne (1980), " Peano curves and smoothness of functions%
(d 1)1d Adv. in Math. 35, 129157.
We have seen that g(n)n is bounded. Is it S. Moran (1984), " On the length of optimal TSP circuits in sets
true that g(n)n (dl)ld converges as n oo? The of bounded diameter", J. Combin. Theory B 37, 113141.
analogous question for minimal spanning trees D.J. Newman (1982), A Problem Seminar, SpringerVerlag,
and traveling salesman tours has been answered in New York.
D.J. Rosenkrantz, R.E. Stearns and P.M. Lewis (1977), "An
the affirmative by Steele and Snyder (1988), but analysis of several heuristics for the traveling salesman
the fact that g(n) is not determined by an opti problem% SIAMJ. Comput. 6,563581.
mality property makes the behavior of g(n) more J.M. Steele (1988), Probabilistic and worst case analyses of
subtle. classical problems of combinatorial optimization in
The inequalities addressed in this article all Euclidean space", Technical Report, Program in Statistics
and Operations Research, Princeton University.
concerned deterministic point sets, but they are LM. Steele and T.L. Snyder (1988), "Worst case growth rates
closely related in spirit (and analytical content) to of some classical problems of combinatorial optimization%
a number of probabilistic topics especially the to appear in SIAM J. Comput.

142