A Pri*ori Inequalities for the Euelidean

Traveling Salesman

Timothy Law Snyderi

Department of Computer Science

Georgetown University

Washington, DC 20057

J. Michael Steele2

Department of Statistics

The Wharton School

University of Pennsylvania

Philadelphia, PA 19104

Abstract

It is proved that there are constants cl, C2, and C3 such that for any set S of n points in the unit square and for any minimumlength tour T of S (1) the sum of squares of the edge lengths of T is bounded by cl log n, (2) the sum of edge lengths of any subset E of T is bounded by C2JEIV2, and (3) the number of edges having length t or greater in T is at most C3/t2. The second and third bounds are independent of the number of points in S, as well as their locations. Extensions to dimensions d > 2 are also sketched.

The presence of the logarithmic term in (1) is engaging because such a term is not needed in the case of the minimum spanning tree and several analogous problems, and, furthermore, we know that there always exists some tour of S (which perhaps does, not have minimal length) for which the sum of squared edges is bounded independently of n.

Research supported in part by Georgetown University

1991 Summer Research Award.

2 Research supported in part by the following grants:

NSF DMS8812868, NSA MDA90489H2034, AFOSR

890301, and DAAL0389G0092.

Permission to copy without fee all or part of this material is granted

provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by

permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.

8th Annual Computational Geometry, 6/92, Berlin, Germany

@1992 ACM 897915186/92/0006/0344 $1.50

344

1

1. Introduction

The purpose of this paper is to provide a priori bounds on quantities related to the edge lengths of an optimal traveling salesman (minimumlength) tour through n points in the unit square. By a priori, we mean that the bounds are independent of the locations of the points. Studies of a priori bounds were initiated by Verblunsky (1951) and Few (1955). Few showed that for any set S of n points in the unit square, the length of an optimal traveling salesman tour of S is at most ,,f2n+ 1.75. Few's result led to a series of improvements, culminating in Karloff (1989), where it was shown that Few's constant could be reduced to less than vl~2. Our results continue in this tradition by giving a pri . ori . inequalities for three other quantities related to the edge lengths of an optimal traveling salesman tour.

The interest in and subtlety of our inequalities comes from the fact that, in contrast to the minimum spanning tree (MST) problem, optimal solutions to the traveling salesman problem (TSP) are not invariant under monotone transformations of the edge weights. Before giving further details on this connection and other related work, we state our main results. We let jel = ix ~ yl denote the Euclidean length of the edge

c = {x, y} with vertices x and y in R2 , and, in settings where the order of the edges of an optimal tour is not important, we represent a traveling salesman tour by the edge set {el,e2,...,e.}. In what follows, an "optimal" traveling salesman tour is a tour that is of minimum length when using Euclidean edge weights.

Our first theorem bounds the sum of squared edge lengths of any optimal traveling salesman tour.

Theorem 1. There exists a constant 0 < c, < oo such that ifT = {el, e2, ..., en} is an optimal travelingsalesman tour Of {Xl, X2,.. ., XJ C [0, 1]2 and if n > 2, then

n

E le,12 < c, log n.

i=1

(1.1)

Theorem 2 is a bound on the number of edges that are of length t or greater.

Theorem 2. There exists a constant 0 < c2 < oo such that, if v(n, t) is the number of ei E T such that lei 1 > t, then for all t > 0 and n > 1,

v(n,t) :5 c2/t2.

(1.2)

Corollary 3 gives a bound on the total length of any kedge subset of an optimal TSP tour.

Corollary 3. There exists a constant 0 < C3 < oo such that, if E = {ei, , ei2 , ' ~ ei,, } ~: T, then

1: lei 1 :5 C3Vfk_. iEE

(1.3)

It is interesting to compare these results to their minimum spanning tree analogues. Steele and Snyder (1989) proved MST analogues to (1.2) and (1.3), but these proofs were predicated on a solution to the MST problem via a greedy algorithm, hence were not applicable to the TSP. The best TSP analogue to (1.3) was thus vTr,,(n, t) :5 CTSPY1n/t, for some constant CTSP The bounds (1.2) and (1.3), however, are independent of n, the number of points, as well as the locations of the points. For this reason, we say that the significantly improved inequalities (1.2) and (1.3) are fully a priori.

345

Inequalities like (1.1) are important in simulations and investigations in which the square root computations required for Euclidean lengths are deemed to be too expensive (e.f the discussion in Steele (1990)). It was observed in Steele (1990) by an application of the spacefilling curve heuristic that one could obtain a result like (1.1) for the MST, but without the logarithmic factor. Though this result makes the logarithmic term of (1.1) seem disappointing, the present bound can still be of service in many applications where a fully a priori inequality would be used.

A further subtlety in (1.1) is that we can easily show there always exists some (not necessarily optimal) tour T' Of S = {Xl, X2, .. . , Xn } C [0, 1]2 for which E,eT' 1e12 < clog n, where c is a constant. This can be shown by observing via a pigeonhole argument that there exist xi and xj E 5, where i 0 j, such that Ixixj 1

1/2

is at most a constant times n ' then proceeding by an obvious induction. In fact, it has been known for some time that one can do considerably better; there is a constant c' and a tour 7' of S such that, for all n > 2, ZeET' 1e12 < c'. This can be obtained via the spacefilling heuristic as noted in the discussion of the MST, or it can be obtained by appropriately generalizing the Pythagorean Theorem (Neumann (1982)).

The sticky issue for the TSP is that, though there is some tour T' that makes EeET' 1e12 particularly small, there is no compelling reason to believe that a traveling salesman tour T that minimizes ZIET lel will do nearly so well. Because of the matroidal properties of the MST, such issues do not arise in its analysis. Analyzing the optimal TSP is much more difficult.

At present, we do not know of a way to remove the logarithmic factor in (1.1), nor do we have a lower bound that proves the necessity of the logarithmic factor. In the final section, we will comment further on this as well as problems concerning points in [0, 1]d for dimension d > 2. In Section 2, we prove two technical results that are applied in Section 3 to prove our main results.

This extended abstract is an abridged version of

the full paper. Some details have been removed and the original numberings of equations and figures have been retained.

2. Edge Lemmas

The second lemma of this section explicates a property of nonintersecting edges in an optimal TSP tour and will be useful in the next section, where we prove our main results. Our first lemma gives a simple geometrical bound concerning quadrilaterals that assists the proof of the second lemma. In the statement of Lemma 1, the term "diagonal" is used to denote a segment connecting nonadjacent vertices of a quadrilateraI, regardless of whether the quadrilateral is convex.

Lemma 1. Let L, and L2 be two nonin tersecting line

segments satisfying r < 1 Li 1 :5 flr, where 0 > 1. Sup

pose the midpoints of L, and L2 are separated by dis

tance A. If the endpoints of L, and L2 are joined to

form a quadrilateral Q with sides Ll, L2, 51, and S2,

then 1Si 1 < !(p 1) r + 3A for i = 1, 2. Moreover, the

2

lengths of the diagonals of Q are bounded by Or + A.

Proof.

The proof of Lemma 1, which relies on relatively simple geometric observations, is omitted for the extended abstract. It can be provided on request.

Lemma 2. Let {cl, e2, .. ., e,,} denote the edges of an optimal traveling salesman tour of 1xl, x2,..., x,,} C W. For each ei satisfying r < lei I :5,3r, let Di denote the disk of radius aleil centered at the midpoint of ei, where a = 1/22 and 3/2. If Di, Di., and Di, are three disks such that no pair of the edges ei, ei, and ei, has a vertex in common, then, for all r > 0, the intersection Di. n Di,., n Di, is empty.

Proof.

Without loss, we let ij = j for j = 1, 2,3. We show that if D,, D2, and D3 have a point in common, then it is possible to construct a shorter tour through {X,, X2? ... 9 Xn}. We can assume that e l, with midpoint

m,, is oriented along the x axis, the midpoint m2 of e2 lies above cl, and the midpoint M3 of e3 lies above e2. We can visualize the ei as illustrated in Figure 2, and note that there are two distinct cases that need to be considered.

. ...... . .

Figure 2. Three nonintersecting lines of a TSP tour

C1

ancl their A. Here, cc= 1/2.

Since the endpoints of the ei, {a 1 , a2 , a3 , bl 7 b2 j b3 } 3 are distinct and are on the tour, there is a pair ai, bj with i :A j such that ai and bj are joined by a path that contains none of the edges cl, e2, and C3. The two cases depend on whether ii ji = 1 or li ji = 2, c.f., the more schematic Figures 3a and 3b.

In the case of iiji = 1, we may assume that i = 3 and j = 2, as shown in Figure 3a. We form a new path by deleting e2 and C3 and adding the edges (a2, a3) and (b2, b3). By Lemma 1, we can estimate the net change in the path length as

ja2 ~ a31 + P2 b31 1e21 1e31

2 (fl 1)r + 6aflrl 2r

2

(fl 3 + 12a,3)r

since 1M2 M31 = A < 2aflr, for D2 n D3 z~ 0.

346

1

For the case of iiji = 2, i = 1 and j = 3, as shown in Figure 3b. We now get a shorter tour by deleting all the ei and adding the edges (b2, 1,3), (a3, bl), and (a,, a2). The net change in weight is

A2 = ib3 b21 + lbl a31 + 1a2 all jell 1e21 ie3J

:5 2 [ 1 (fl 1)r + 6agrl + flr + 2a,6r 3r

= (2,6 + 14afl 4)r,

where 11,3 b2] and Ja2 a, 1, are estimated as before and we estimate jb, a31 using the diagonal bound of

Lemma 1 along with the fact that JM1 M31 <

2a,3r, since D, n D3 V 0.

Figure 3a. Rebuilding the a2 to b., palli when

1 i j 1 = 1 in Ixinnia 2. The curved arc is a path,

ibe X'cd edges have been removed. and the

clashed edges have ken added.

a3 b3

Figure 3b. Rebuilding the tour when 1 i j i = 2

in Let rnia 2. llic curved arcs are paths, the X'cd

edges iavc been removed. and the dashed edcs

have been added.

The choices,3 = 3/2 and a = 1/22 are good enough to guarantee that Al < 0 and 1A2 < 0.

1:1

3. A Priori EdgeLength Bounds

It is now easy to prove our main results. We label the edges of an optimal tour T Of fXl, X2,.. ., XJ C [0, 1]2 in order as el,C2,...,e,,, and we assume without loss that n is even. We first construct disks Di of radius a lei 1 and center at the midpoint of ei for each 1 < i < n, where a = 1/22. Let Oi(.) denote the indicator function of Di, i.e., for all x E R2, 0, (X) = 1 if x E Di; otherwise 9Pi(x) = 0. Let A be the set of all odd i such that

347

r < lei 1 :5 Or, and let B be the corresponding set of even i. We claim the following:

1: + 1: 10j(x) :5 40(x), (3.1)

i C A iEB

where 0(.) is the indicator function of the square

[1,2]2.

To prove the claim, we note that since we have assumed n is even, pairs of edges with index belonging to A do not share an endpoint; the same is true for pairs in B. For P = 3/2 and a = 1/22, Lemma 2 tells us that no three disks of A intersect and that no three disks of B intersect. Hence, the point x E 1R2 can belong to at most two disks associated with A and two disks associated with B. Furthermore, since any disk with center in [0,1]2 and radius bounded by aflr is contained in [a,3r, 1 + aflr]' C [ 1, 2]2, we need only concern ourselves with x E [ 1, 2]2 . This proves the claim.

If we now integrate (3.1) over x, we obtain a basic bound on a subset of the squared edge lengths of an optimal TSP tour:

E lei I' < c,

r

where c = 36a2 7171. Finally,

n

1: leil' < 1 +

i=1 n112

m

+ E E leil,'

k=l P'Inlll

(3.3)

where m is the least integer k such that 0kn~112 > V(2. It suffices to take m = F1093/2(Vr2n)l 1 so applying (3.2). to (3.3) yields the bound

n

E lei 12 < c, log n,

i=1

where c, is constant as required by Theorem 1.

(3.4)

Returning now to (3.1) and again integrating, we see that since ici I > r for all i E A and i E B,

(JAI + J81) 7ra 2 r2 < 36. (3.5)

1

where 10(.) is the indicator function of the square 1, 212.

To prove the claim, we note that since we have assumed n is even, pairs of edges with index belonging to A do not share an endpoint; the same is true for pairs in B. For fl = 3/2 and a = 1/22, Lemma 2 tells us that no three disks of A intersect and that no three disks of B intersect. Hence, the point x E E2 can belong to at most two disks associated with A and two disks associated with B. Furthermore, since any disk with center in [0, 1]2 and radius bounded by aflr is contained in [a,3r, 1 + apr12 C [ 1, 2f, we need only concern ourselves with x E [ 1, 2]2 . This proves the claim.

If we now integrate (3. 1) over x, we obtain a basic bound on a subset of the squared edge lengths of an optimal TSP tour:

E lei12 < C,

r:Sieil

where c= 36a271. Finally,

lei 12 < + jejJ2

m

+ E

2

Y~

j lei 1

k=l 011n'/'

(3.2)

(3.3)

where m is the least integer k such that pk n 112 > ,,/2. It suffices to take m = P093/2W2n)l 1 so applying (3.2) to (3.3) yields the bound

n

E lei 12 < c, log n,

i=1

(3.4)

where c, is constant as required by Theorem 1.

Returning now to (3.1) and again integrating, we see that since lei 1 > r for all i E A and i E B,

(JAI + JB1) ira'r2 < 36. (3.5)

But, JAI+ JB1 = 1{i: r< leil so

j { i : r < iei 1 :S Pr } 1 < cr 2. (3.6)

348

If we now write v(n, t) as

&,(n, t) = 1 { i : iei 1 ~: t } 1

Mt1 (3.7)

< i : Oki < ieii :5 ok+lt

k=0

where mt = minj {flit > ~/2}, we can use (3.6) to bound v(n, t) as

m' 1

ii(n,t) < c E (,6kt)2

k=0

00

< ct 2 1: 02k (3.8)

k=0

c t2

1 02 1

which is Theorem 2, with C2 = C02/(01 _ l).

Corollary 3 now results from (3.8) by first noting that n v(n, x) is the number of edges in E of length x or less, then writing

1: leil = 1: leil + 1: ieil

iEB eiCE eiEE

ledkt leii

c2 + j v'2 x d(n v(n, x)).

t 2 t

(3.9)

Integrating the right most term of (3.9) by parts then applying (3.8), we obtain

V12

ft x d(n v(n, x)) = Y12(n v(n, ,/2_))

112

+ t(v(n, t) n) n(t V'2_) + it v (n, x) dx

:5 IF v(n, x) dx

C2/t.

(3.10) Inserting (3.10) into (3.9) and setting t = IEI 1/2 yields Corollary 3, with C3 = C2.

5. Concluding Remarks

This paper investigates features of an optimal TSP tour that can be explicated without any knowledge of the locations of the points, and, in some cases, even without

knowledge of the number of points. It is surprising that relatively tight bounds can be obtained under these conditions. Two of our main results bring bounds to the TSP that have been known for some time for the minimum spanning tree, but have been elusive for the TSP due to its NPcompleteness. Our results also create a new open problem: Can the logarithmic term in Theorem 1 be removed?

The preceding arguments can be generalized to d > 2, but, because of uncertainty concerning the logarithmic factor of Theorem 1, it seems inappropriate to give more than a sketch of a generalization to higher dimensions. The key idea is that in Lemma 2, we showed that if three of the Di associated with disjoint edges had a point in common, then we could find three disjoint edges el, e2, and e3 that were close together and nearly parallel. The existence of these edges then permitted us to construct a shorter tour. We used the fact that the ei did not cross to help provide bounds on the cost of modified paths; of course, such arguments are not available when d > 2.

Still, if we consider the possibility that a large number N(d) of dspheres Di = D(mi, alei I) C Rd intersect and that the surface of any sphere in Ed can be covered with a finite number M(c) of spherical caps with polar angle c, we can again show that we either have a bound like (3.1), with 4 replaced by a constant depending on d and E, or else we will have three edges that are sufficiently parallel to permit an argument like that used to prove Lemma 2. In summary, one can prove

Theorem 3. There exist positive constants Cd and cd such that for any traveling salesman tour T of {Xl, X2,.. ., X.} C [0, 11d and for all n > 2,

1: leld < Cd log n,

eET

and

(4.1)

vd(t) e E T > t C/ /td. (4.2)

d

Just as in dimension two, there is a serious possibility that the logarithmic factor in (4.1) can be removed,

349

1

but such an improvement does not seem to be obtainable by the present method.

We note that the methods of this paper are by no means restricted to the TSP; it is likely that they can be used to yield a priori inequalities for other problems, as well.

References

Few, L. (1955). "The shortest path and the longest road through n points in a region," Mathematika 2, 141144.

Karloff, H.J. (1989). "How long can a Euclidean traveling salesman tour be?" SIAM J. Disc. Math. 2,9199.

Neumann, D.J. (1982). A Problem Seminar, SpringerVerlag, New York, NY.

Steele, J.M. (1990). "Probabilistic and worstcase analyses of classical problems of combinatorial optimization in Euclidean space," Mathematics of Operations Research 15, 749770.

Steele, J.M. and Snyder, T.L. (1989). "Worstcase growth rates of some problems from combinatorial optimization," SIAM J. Computing 18, 278287.

Verblunsky, S. (1951). "On the shortest path through a number of points," Proc. Amer. Math. Soc. 2, 904913.