ARRANGEMENT SEARCHING AND APPLICATIONS TO ROBUST REGRESSION

J. Michael Steele, Princeton University

1. Introduction

The purpose of this report is to show the applicability of some recent work in computer science to a problem of current interest in robust regression. In particular, point-line duality and line-sweep methods are applied to provide an algorithm for computing least median of squares regression lines. The algorithm obtained has a worst-case time complexity of O(n^2 log n) using only O(n) space. This report should be thought of as an introduction to the more detailed papers Steele and Steiger (1986) and Souvaine and Steele (1986).

First, there is a brief review of some notions of robust regression and the motivation for regression methods with high breakdown point. The least median of squares regression method is introduced, and its computational issues are reviewed. The third section discusses point-line duality and its application to the problem of least median of squares regression. The fourth section develops the key observations of the sweepline method and sketches an algorithm for least median of squares regression.

The final section provides information about a second algorithm which is detailed in Souvaine and Steele (1986). The second algorithm uses more sophisticated data structures and has worst-case time complexity of only O(n^2), but suffers from an O(n^2) space requirement. Additionally, the final section reviews some other recent work in computer science which can be expected to have an impact on work in computational statistics.

2. Notions of Robust Regression and Breakdown Point

The breakdown point of an estimator is (roughly) the smallest amount of contamination of the data which can cause the estimator to take on arbitrarily large values. The first extensive discussion of breakdown point is given in Donoho and Huber (1983), but they trace the concept to the thesis of F.R. Hampel (1968). The discussion of Donoho and Huber lends much credence to the notion that the crude notion of breakdown is a very useful measure of robustness, especially as applied in small sample contexts.
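The contrast underlying the breakdown idea can be made concrete with a small sketch (our illustration, not from the paper): a single wild observation can drive the sample mean arbitrarily far, while the sample median stays with the bulk of the data.

```python
# Illustrative sketch of breakdown behavior: one contaminated point
# ruins the mean but not the median.
def mean(xs):
    return sum(xs) / len(xs)

def median(xs):
    ys = sorted(xs)
    n = len(ys)
    mid = n // 2
    return ys[mid] if n % 2 == 1 else (ys[mid - 1] + ys[mid]) / 2

clean = [9.8, 10.1, 10.0, 9.9, 10.2]
contaminated = clean[:-1] + [1e9]   # replace one value by a wild observation

# The mean is dragged arbitrarily far by the single bad point...
print(mean(contaminated))           # roughly 2e8
# ...while the median stays near the bulk of the data.
print(median(contaminated))         # 10.0
```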

Information about the breakdown properties of many robust regression proposals is provided in the key paper by Rousseeuw (1984) on Least Median of Squares Regression (LMS regression), and LMS regression is found to compare favorably with all current proposals.

Formally, if we have data (x_i, y_i), 1 ≤ i ≤ n, and we wish to fit a line y = ax + b, we consider the set of residuals r_i = y_i - ax_i - b associated with a and b. We then choose as estimates the â and b̂ which minimize the feature of merit

f(a, b) = median_{1 ≤ i ≤ n} r_i^2.

One point to note is that if median is replaced by mean then the corresponding feature of merit would lead to a rephrasing of the procedure of ordinary least squares.
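The criterion can be illustrated with a minimal sketch (our code, not the paper's algorithm; for even n we take the usual midpoint convention for the median). Replacing the median by the mean recovers the ordinary least squares criterion, and the example shows why the median version resists a gross outlier.

```python
# Sketch of the LMS feature of merit f(a, b) = median of squared
# residuals, contrasted with the mean of squared residuals (OLS).
def squared_residuals(data, a, b):
    return [(y - a * x - b) ** 2 for x, y in data]

def median(xs):
    ys = sorted(xs)
    n = len(ys)
    return ys[n // 2] if n % 2 == 1 else (ys[n // 2 - 1] + ys[n // 2]) / 2

def f_lms(data, a, b):
    return median(squared_residuals(data, a, b))

def f_ols(data, a, b):
    r = squared_residuals(data, a, b)
    return sum(r) / len(r)

# Nine points on the line y = 2x + 1 plus one gross outlier.
data = [(x, 2 * x + 1) for x in range(9)] + [(4, 100.0)]
print(f_lms(data, 2, 1))   # 0.0: the median squared residual ignores the outlier
print(f_ols(data, 2, 1))   # large, dominated by the outlier
```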

The first considerations of the computation of LMS regression are in Rousseeuw (1984) and Leroy and Rousseeuw (1984), but the first formal study was given in Steele and Steiger (1986). For (x_i, y_i) in general position, it was proved that f(α, β) has at least cn^2 local minima. One consequence of this large number of local minima is that traditional local optimization methods will prove fruitless for determining â and b̂.

One must confront a combinatorial problem, and Steele and Steiger (1986) showed that there is a reduction to a tractable combinatorial procedure which yields an exact minimization of f via a finite search with worst-case time complexity O(n^3).

Souvaine and Steele (1986) give two faster algorithms for computing LMS regressions. The first of these depends upon the technique of point-line duality and the sweepline method. This algorithm was shown to have worst-case time complexity O(n^2 log n) with a space requirement of O(n). The second algorithm required more complex data structures and needed space of size O(n^2), but was able to provide worst-case time complexity of O(n^2). For the purpose of this exposition, perhaps the best objective is to review just the first of these two methods and suggest how such technology is used in LMS regression. Suggestions are also made that might be useful in related work.

3. Point-Line Duality and Its Application

The duality which will concern us here is given explicitly by the transformation T on points and lines which takes the point (a, b) to the line y = ax + b and which takes the line y = ax + b to the point (-a, b).

Under this transformation, the point P that is determined by two lines L_1 and L_2 is mapped to the line TP which is determined by the two points TL_1 and TL_2. Likewise, the line L determined by two points P_1 and P_2 is mapped to the point TL determined by the two lines TP_1 and TP_2. These relations lie at the base of duality, and they are shared by the classical transformation S of Poncelet which maps the point (a, b) to the line ax + by + 1 = 0 and maps the line ax + by + 1 = 0 to the point (a, b). It is interesting to note that S^2 = I but T^2 ≠ I, where I is the identity mapping. This small loss of mathematical symmetry is well compensated by other properties of T which will be identified shortly.

In statistical work the point-line duality T has been used most recently in the work of Johnstone and Velleman (1985), which gives an algorithm for a different robust regression method, the resistant line. In data analysis and in simple linear regression, duality has also been used by Emerson and Hoaglin (1983), Dolby (1960), and Daniels (1954).

The benefit of the transformation T is that, in addition to the basic duality properties, it has some important order invariance properties. The first of these ordering properties is that the transformation T preserves the relationship of "above" and "below" for points and lines. That is, if P is above L then TP is above TL. Secondly, the transformation T preserves the vertical distance between points and lines; TL is above TP by exactly the same distance that L is above P, etc. Understandably, this property is crucial in a problem where a median residual is to be minimized over a selection of lines. One simply has to have some form of isometric invariance in the duality if it is to prove at all useful.

The final important order invariance property of T is that if the slope of L is larger than the slope of L', then the x-coordinate of TL is smaller than the x-coordinate of TL'. This relation is trivial from the definition, but it is still very useful in practical applications.
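A small sketch may help make the invariance claims concrete. It uses the convention that T carries the point (a, b) to the line y = ax + b and the line y = ax + b to the point (-a, b); the sign on the slope is what makes T^2 ≠ I and turns slope order into reversed x-coordinate order. The helper names below are ours, not the paper's.

```python
# Sketch of the duality T and its order invariance properties.
# Lines are stored as (slope, intercept) pairs, points as (x, y).
def T_point(p):
    a, b = p
    return (a, b)            # the line y = ax + b

def T_line(l):
    a, b = l
    return (-a, b)           # the point (-a, b)

def line_above_point(line, point):
    """Signed vertical amount by which `line` passes above `point`."""
    a, b = line
    x, y = point
    return a * x + b - y

def point_above_line(point, line):
    return -line_above_point(line, point)

P = (1.5, 0.25)              # a data point
L = (2.0, 3.0)               # the line y = 2x + 3

# Vertical distance is preserved: the dual point TL sits above the
# dual line TP by exactly the amount L sits above P (here 5.75).
assert line_above_point(L, P) == point_above_line(T_line(L), T_point(P))

# Slope order reverses into x-coordinate order: the steeper line
# maps to the point further to the left.
L1, L2 = (5.0, 1.0), (2.0, 7.0)
assert T_line(L1)[0] < T_line(L2)[0]
print("invariance checks passed")
```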

In order to phrase LMS regression as a finite problem which can be precisely dualized, it is worthwhile to introduce some terminology relevant to its local structure. First, for any α and β the line l_{α,β} = {(x, y): y = αx + β} defines residuals r_i(α, β) which are functions of α and β. We typically write r_i(α, β) as r_i without fear of confusion. We say the line l_{α,β} bisects three distinct points (x_{i_j}, y_{i_j}), j = 1, 2, 3, if all of the r_{i_j} are of the same magnitude r but not all have the same sign. If x_{i_1} < x_{i_2} < x_{i_3} and r_{i_1} = -r_{i_2} = r_{i_3}, we say l_{α,β} equioscillates with respect to the points.

It was proved in Steele and Steiger (1986) that the LMS regression line must be an equioscillating line. Since there are at most (n choose 3) such lines, a naive algorithm would be to examine them all. Obviously, any algorithm faster than O(n^4) must work differently.

Given an equioscillating line l_{α,β}, there are two related lines with slope α: the line L_1 determined by the points P_1 = (x_{i_1}, y_{i_1}) and P_3 = (x_{i_3}, y_{i_3}), and the line L_2 which goes through the point P_2 = (x_{i_2}, y_{i_2}). A key property of the LMS regression line l_{α,β} is that the number K of points between L_1 and L_2 must satisfy

K = (n-4)/2 for n even, and K = (n-5)/2 for n odd,

provided n > 3 (cf. Steele and Steiger (1986), Main Lemma).

One simple method of determining the LMS regression line can now be given. For each triple of data points P_1, P_2, and P_3 which are ordered by x-coordinate, we first determine lines L_1 and L_2 as above. Next we find such a triple with exactly K data points between L_1 and L_2 and such that the vertical distance between L_1 and L_2 is minimized. The LMS regression line is the line which goes exactly between L_1 and L_2.
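The construction attached to a single x-ordered triple can be sketched directly (our helper names, general position assumed): L_1 joins the outer points, L_2 is the parallel line through the middle point, and the line midway between them equioscillates on the triple with residual magnitude equal to half the vertical gap.

```python
# Sketch of the per-triple construction: the candidate LMS line for
# an x-ordered triple is midway between L1 (through the outer points)
# and L2 (the parallel line through the middle point).
def candidate_line(p1, p2, p3):
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    assert x1 < x2 < x3                  # triple ordered by x-coordinate
    slope = (y3 - y1) / (x3 - x1)        # shared slope of L1 and L2
    b1 = y1 - slope * x1                 # intercept of L1 (through p1, p3)
    b2 = y2 - slope * x2                 # intercept of L2 (through p2)
    # Return the midline and the vertical gap between L1 and L2.
    return slope, (b1 + b2) / 2, abs(b1 - b2)

p1, p2, p3 = (0.0, 1.0), (1.0, 4.0), (2.0, 3.0)
a, b, gap = candidate_line(p1, p2, p3)
residuals = [y - (a * x + b) for x, y in (p1, p2, p3)]

# Equioscillation: equal magnitudes (gap / 2), with the middle sign
# opposite to the two outer signs.
print(residuals)    # → [-1.0, 1.0, -1.0]
```

A full naive search would evaluate this for every x-ordered triple, keep only the triples with exactly K points between L_1 and L_2, and take the one with the smallest gap.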

We can now see the rephrasing of our problem obtained by applying T to the data points and the lines they determine. The condition that there are K points between the lines L_1 and L_2 becomes the condition that there are K lines between the points TL_1 and TL_2. Further, TL_1 is the point determined by the two lines TP_1 and TP_3; TL_2 is the point on the line TP_2 which has the same x-coordinate as TL_1.

Our algorithmic problem in dual can now be expressed as follows:

Given n lines L_i, 1 ≤ i ≤ n, in general position in the plane with intersection points P_ij, 1 ≤ i < j ≤ n, find that line L* and intersection point P* such that among all line-point pairs (L, P) which have exactly K of the L_i cutting the vertical segment S joining L and P, the pair (L*, P*) has the smallest vertical distance.

The sweepline technique for solving this dualized problem will be given in the next section.

4. Sweepline Algorithm and Space Efficiency

The sweepline technique "sweeps" the plane by combinatorially moving a vertical "sweepline" from the left to the right. Souvaine and Steele (1986) provide what is probably the first application of this technique in the area of regression, but Shamos and Hoey (1976) and Bentley and Ottmann (1979) have shown that sweepline techniques are applicable to a variety of problems in computational geometry.

The sweepline algorithm requires two off-the-shelf data structures. The first of these (to be called LIST) will maintain our set of n lines in a geometrically meaningful order. LIST can be implemented as a simple linked list, but we will want to augment it with some additional pointers which will help LIST interact with the second data structure.

Our second structure will be used to store a certain subset of n-1 out of the set of n(n-1)/2 intersection points P_ij. The structure we create will permit us to access the "smallest" element in time O(1) and permits insertion or deletion in time O(log n). The ordering we will use to give meaning to "smallest" is that P_ij < P_kl whenever the x-coordinate of P_ij is smaller than that of P_kl. This second structure will be implemented as a heap and called HEAP. Both structures will be described as they are used, which will help make the algorithm transparent.

We begin by computing a value A such that all of the intersection points lie to the right of the line L_A: x = A. It would be trivial to do this task in time O(n^2) by computing the values P_ij successively and retaining only the leftmost. It is instructive (and illustrative of a basic technique) to see how this can be done in time O(n log n).

To find A fast, we first note that to the left of all of the intersection points the lines are themselves ordered by slope, i.e. the line with lowest slope is above all of the other lines, etc. This observation implies that if we first sort the lines L_i, 1 ≤ i ≤ n, by slope, then the leftmost intersection point P_ij must be an intersection of two lines which are adjacent in the list of lines sorted by slope. Thus, to find the leftmost intersection point we can successively compute the intersection points of adjacent elements of the slope-sorted list of lines, keep only the leftmost P_ij, and proceed down the list. This computation requires O(n log n) time because of the preliminary sorting of the lines by slope.


We did not have to dig out a clever way of executing this step, but doing so sets up a similar idea which will be crucial for our overall time complexity.

We next introduce another vertical line L which we call the sweepline, and we initialize L to L_A. The line L is not essential to the algorithm, but it helps articulate a program invariant which helps make verification of the algorithm clearer.

We initialize LIST by placing the L_i into the list in decreasing order of L_A-intercept. We will also augment the structure holding the L_i by storing along with each of the L_i a set of four pointers Up1, UpK, Down1, and DownK which will point to the lines which are 1 above L_i, K above L_i, and so forth in the ordering which is given by LIST. Since lines too near the top or bottom will not have lines satisfying the required conditions, we set the pointers to the null value to flag this situation.

A key geometric observation parallel to the idea used to compute L_A is that from the whole set of n(n-1)/2 intersection points P_ij, 1 ≤ i < j ≤ n, the one which is nearest to the sweepline L must be determined by a pair of lines {L_i, L_j} which are adjacent in an ordered list which was initialized by L_A-intercept. This observation permits us to keep a set of candidates for the "next point to explore" which is at most of cardinality n-1.

Formally, we pass through LIST and for each adjacent pair L_i and L_j in LIST we install P_ij in HEAP. We do this while respecting the ordering of the P_ij by x-coordinate, and we end up with a heap of size n-1 with the leftmost of the P_ij installed at the root. As an important adjunct to the heap building process we make a double set of pointers which associate the elements of HEAP and the elements of LIST. Specifically, for each adjacent pair L_i and L_j in LIST (and each corresponding intersection point P_ij in HEAP) we create pointers from L_i and L_j to P_ij and pointers from P_ij to L_i and L_j.

This step completes our preprocessing and initialization phase, so we should make a preliminary tally of resources consumed. First, because of the sorting, LIST requires time O(n log n) to build and space O(n) to store. Since HEAP is also of size O(n), it can be created in time O(n) and space O(n) using well-known methods.

The algorithm for computing the LMS regression line is now almost trivial using the data structures we have built. The picture to keep in mind is that of sweeping the vertical line L from L_A until it has passed through all of the points. What we will do is manage LIST and HEAP in such a way that (1) LIST always provides the ordering of the L_i according to decreasing order of L-intercept, (2) the pointers stored in LIST remain consistent with their initial definition, and (3) HEAP always stores as its minimal element the intersection point P_ij which has the smallest x-coordinate of any intersection point to the right of L.

5. Final Remarks

The sweepline method which was just discussed is easy to implement and has substantial generality. Still, it is worth understanding that there are more sophisticated methods for the systematic search of an arrangement, and there may be instances where the associated overheads (both computational and intellectual) are justified.

There is one particularly relevant data structure due to Chazelle (1984) called a Hammock which is used in Souvaine and Steele (1986) to provide an O(n^2) time, O(n^2) space algorithm for LMS regression. Moreover, there are data structures for searching higher dimensional arrangements due to Edelsbrunner et al. (1983) which should be very useful in the study of LMS methods for multivariate regression (at least as a theoretical tool).

Finally, the technique of line sweep has been extended recently by Edelsbrunner and Guibas (1986) to what they call "topological sweeping of an arrangement." This is an extremely promising technology which will be sure to have an impact on computational statistics.

References

Aho, A.V., Hopcroft, J.E., and Ullman, J.D. (1974). The Design and Analysis of Computer Algorithms, Addison-Wesley, Reading, MA.

Bentley, J.L. and Ottmann, T.A. (1979). "Algorithms for reporting and counting geometric intersections," IEEE Trans. Comp., C-28, 643-647.

Chazelle, B. (1984). "Intersecting is easier than sorting," Proc. 16th ACM STOC, 125-135.

Chazelle, B., Guibas, L.J., and Lee, D.T. (1983). "The power of geometric duality," Proc. 24th IEEE FOCS, 217-225.

Daniels, H.E. (1954). "A distribution-free test for regression parameters," Annals of Mathematical Statistics, 25, 499-513.

Dobkin, D.P. and Souvaine, D.L. (1986). "Computational Geometry: A User's Guide," Advances in Robotics I: Algorithmic and Geometric Aspects of Robotics, J.T. Schwartz and C.K. Yap (Eds.), Lawrence Erlbaum Associates, Hillsdale, NJ.

Dolby, J.L. (1960). "Graphical procedures for fitting the best line to a set of points," Technometrics, 2, 477-481.

Donoho, D.L. and Huber, P.J. (1983). "The notion of breakdown point," A Festschrift for Erich L. Lehmann, P.J. Bickel, K. Doksum, and J.L. Hodges (Eds.), Wadsworth, Belmont, CA, 157-184.

Edelsbrunner, H., O'Rourke, J., and Seidel, R. (1983). "Constructing arrangements of lines and hyperplanes with applications," Proc. 24th IEEE FOCS, 83-91.


Edelsbrunner, H. and Guibas, L.G. (1986). "Topologically sweeping an arrangement," Technical Report, Stanford University, Department of Computer Science.

Emerson, J.D. and Hoaglin, D.C. (1983). "Resistant lines for y versus x." In D.C. Hoaglin, F. Mosteller and J. Tukey (Eds.), Understanding Robust and Exploratory Data Analysis, John Wiley, New York.

Hampel, F.R. (1968). "Contributions to the Theory of Robust Estimation," Ph.D. Thesis, University of California, Berkeley.

Harary, F. (1972). Graph Theory, Addison-Wesley Publishing Company, Menlo Park, CA.

Johnstone, I.M. and Velleman, P.F. (1985). "The resistant line and related regression methods," J. Amer. Statist. Assoc., 80, 1041-1054.


Leroy, A. and Rousseeuw, P.J. (1984). "PROGRESS: A program for robust regression," Report 201, Centrum voor Statistiek en Operationeel Onderzoek, Univ. Brussels.

Rousseeuw, P.J. (1984). "Least median of squares regression," J. Amer. Statist. Assoc., 79, 871-880.

Shamos, M. and Hoey, D. (1976). "Geometric intersection problems," IEEE FOCS Conference, Houston, TX.

Souvaine, D.L. and Steele, J.M. (1986). "Time and space efficient algorithms for least median of squares regression," Technical Report, Program in Statistics and Operations Research, Princeton University.

Steele, J.M. and Steiger, W.L. (1986). "Algorithms and complexity for least median of squares regression," Discrete Applied Mathematics, 13, 509-517.


AN APPLICATION OF SYMBOLIC COMPUTATION TO A GIBBS MEASURE MODEL

J. Michael Steele†, Princeton University

ABSTRACT

Gibbs models provide a flexible alternative to traditional methods for specifying statistical models. Because of the availability of cheap computation, the Gibbs alternative is now substantially competitive with traditional methods in contexts where one can be satisfied with system simulations, but Gibbs models are at some disadvantage in theoretical contexts because their analytical properties are more difficult to investigate than those of models built from independent variables.

The first purpose of this talk is to present a specific Gibbs model concerning random trees and show how MACSYMA proved of value in examining its theoretical properties. A second purpose is to initiate a broader discussion of the role of MACSYMA in statistical (and other scientific) research.

1. Introduction to Gibbs Models

Most traditional models in statistics are based on functions of independent random variables, or more generally, on variables whose joint distribution is specified by means of a covariance structure. Although such models serve us well in many contexts, there are numerous problems where physical or other phenomena exhibit a qualitative dependency which is not easily expressed by such techniques. It is in such cases that the flexibility provided by the method of Gibbs models can be especially useful.

To illustrate the idea of a Gibbs measure in as concrete a way as possible we will concentrate on measures on finite sets S. In the description of Gibbs models it is useful to employ some evocative language. In particular, the set S will be thought of as the set of possible states of a physical system, and several notions of thermodynamics will guide our notation. The most essential ingredient of the method is the function f: S → R which we will call a potential function. The function f has several interpretations as either a measure of complexity or of likelihood, but the bottom line is that we use f to define a measure on S by the formula

μ_t(s) = e^{-f(s)/t} / Z(t).     (1.1)

Here, of course, the denominator Z(t) is chosen in just the way needed to make S have total mass one, i.e. μ_t(S) = 1 and Z(t) = Σ_{s∈S} e^{-f(s)/t}. The choice of t as the indexing parameter reflects the historical origins of μ_t, where t carries the interpretation of temperature. From the perspective of statistics we should note that the change of variable θ = 1/t brings (1.1) into the form of an exponential family with natural parameter θ. To benefit from the physical intuition behind (1.1), one should note that as t → ∞ the measure μ_t tends to the uniform distribution on S. This limit relation captures the notion that at high temperature any state of S is as likely as any other state. Conversely, as t → 0, the probability measure μ_t becomes increasingly concentrated on the set {s: f(s) = min f(s')}, the interpretation of which is that at low temperature a few especially low energy states become extremely likely.

Now, these stories from physics are all well and good, but how do they influence probability models which do not have direct physical interpretation? The main point of Steele (1987) is that combinatorial problems can often be associated with an intuitive Gibbs model with a direct combinatorial interpretation. In fact, the Gibbs paradigm does not need much pushing to be seen to be extremely useful in many situations which are quite removed from the usual problems of thermodynamics.

While the preceding remarks seem necessary to motivate our specific problem, the main purpose of the present discussion is to show how MACSYMA was useful in the theoretical exploration of a simple Gibbs model from combinatorics. A second purpose is to review for a statistical audience some of the literature on the use of symbolic computation in applied science.

2. Trees of Prüfer and Rényi

The specific Gibbs model which concerns us has its origin in the hunt for a tractable parametric family of random trees which can serve to model those trees which one finds in problems of computer science. To get some perspective we will recall some results concerning the uniform distribution on random trees.

One of the earliest results in the enumerative theory of combinatorics is the remarkable theorem of Cayley which says there are exactly n^{n-2} labeled trees on n vertices. When Rényi (1959) took up the modeling of random trees, it was natural to consider the uniform model where any one of Cayley's n^{n-2} labeled trees would have the same probability of being chosen. The object of main interest to Rényi was the number of leaves of a random tree, and to obtain a handle on this problem, Rényi relied on an elegant code for labeled trees due to Prüfer (1918).

Prüfer's simple code is directly applicable to many questions concerning trees. One builds a one-to-one correspondence between a labeled tree on n vertices P_1, P_2, ..., P_n and an (n-2)-tuple with elements from {1, 2, ..., n} via the following recipe:

First we find the leaf P_i with the highest value of the index i. We then delete P_i from the tree and record the index j of the unique vertex P_j to which P_i was adjacent. This process is repeated


on the pruned tree until n-2 vertices have been deleted (and two vertices are left remaining).

Such a sequence of recorded adjacency indices is now called the Prüfer code for the tree, and one can show without great difficulty that the coding provides a bona fide one-to-one correspondence. As a consequence of Prüfer's coding, the formula of Cayley for the number of trees on n vertices becomes obvious; our code has n-2 places and each place can be held by n values, so n^{n-2} is the cardinality of the set of labeled trees.

Although it cuts a good story short, it suffices to say that Rényi used the Prüfer code to show that T(n,k), the number of labeled trees with k leaves and n vertices, must satisfy the polynomial identity

Σ_{k=2}^{n-1} T(n,k) k! (x)_{n-k} = n! x^{n-2},     (2.1)

where (x)_j = x(x-1)...(x-j+1) denotes the falling factorial.

This elegant expression permits one to make many deductions concerning the number of leaves of a tree chosen at random from the set of all n^{n-2} labeled trees. In particular, if we let L_n denote the number of leaves of a random tree, then one can easily obtain an asymptotic formula for the expected value of L_n:

E L_n ~ n/e.     (2.2)

With more effort one can also obtain the variance relation:

Var L_n ~ n(e-2)/e^2.     (2.3)

Finally, Rényi obtained the central limit theorem for L_n by means of substituting x = 1 + it/√n into his key identity (2.1). Several technical problems needed to be resolved by Rényi as he moved from (2.1) to the asymptotics of E e^{itL_n}, but his analytical expertise prevailed.

Now the interesting point comes upon us. Rényi's detailed analysis pertains to a model which is almost certainly too limited to apply to any of the empirical situations of interest! For example, suppose that for your set of observed trees all of the trees have about n/5 leaves, instead of n/e leaves. The uniform model would be inapplicable, and it seems to offer no place to turn. The question becomes, how can we obtain a parametric family of trees with a natural parameter closely related to the number of leaves?

The answer is simple using a Gibbs model. We just let S be the set of all labeled trees on n vertices, and we define f(s) to be the number of leaves of s ∈ S. If we set t = 1/θ, we can write (1.1) in the form

μ_θ(s) = e^{-θ f(s) - ψ(θ)},     (2.4)

where ψ(θ) = log(Σ_s e^{-θ f(s)}) and -ψ'(θ) = E_θ L_n is the expected number of leaves of a random tree chosen according to the probability measure (2.4).

By varying our choice of θ we can make E_θ L_n range through the whole interval (2, n-1), and thus we have a family of measures on trees which is essentially indexed by the expected number of leaves.

This model has a number of benefits. First, it is an exponential model, so many of the nicest results of mathematical statistics apply to the estimation of θ. Also, it is a Gibbs model, so the method of Metropolis et al. can be used to generate random trees which satisfy the given law. Finally, as shown in Steele (1987), the random variables L_n are asymptotically normal.

The purpose of the next section is to illustrate how MACSYMA entered into the exploration of this model and how it helped lead to the resolution of the asymptotic normality of L_n under the Gibbs model.

3. Harper's Method and Computational Empiricism

The central limit theorem for L_n was obtained in Steele (1987) through the application of a technique which was introduced by Harper (1967) for the study of Stirling numbers. Harper's basic idea is so simple it can be discussed in a beginning probability course. Still, the method is quite powerful, and, even when it fails, it often brings a new analytical insight into the problem.

So, what is this nifty method? Suppose you wish to study a random variable Y with probability generating function h(x), and you consider the equation:

h(x) = p_0 + p_1 x + p_2 x^2 + ... + p_n x^n = 0.     (3.1)

Now, suppose for some reason you know that (3.1) has only real roots, r_i, 1 ≤ i ≤ n. Since the p_i are non-negative, the r_i are all non-positive, and h can then be factored into the form

h(x) = Π_{i=1}^{n} (x + |r_i|)/(1 + |r_i|) = Π_{i=1}^{n} (p_i x + q_i).     (3.2)

Now, with this before us virtually all probability questions about Y are easily resolved, since (3.2) says that Y is equal to the sum of n independent Bernoulli random variables.

Returning to the Gibbs model of random trees, Steele (1987) showed that only a little manipulating is required to establish that (1) Rényi's CLT, (2) a Berry-Esseen refinement of Rényi's CLT, and (3) the CLT for random trees under the general Gibbs model will all follow easily once one proves that the equation

Σ_{k=2}^{n-1} T(n,k) x^k = 0     (3.3)

has only real roots.

The handles we have on (3.3) are that T(n,k) can be expressed in terms of Stirling numbers of the second kind by

T(n,k) = s(n-2, n-k) n!/k!     (3.4)

and the Stirling numbers satisfy the classical recursion

s(n,k) = k s(n-1,k) + s(n-1,k-1).

Once we learn enough about the geometry of zeros of polynomials, it is not hard to show (3.3) has only real roots; but before even starting to learn what needs to be known, it seems to be almost a psychological requirement to find out whether Harper's method really has a chance. After all, it


is quite possible that (3.3) does have nontrivial complex roots.

Fortunately, one can easily investigate (3.3) using MACSYMA, and the following interactive code should illustrate the point. (The c-lines are entered commands and the d-lines are MACSYMA's replies. The other notations should either be intuitive, or at least easily understood after reading the companion article by Rand (1987).)

(c1) s[n,k] := s[n-1,k]*k + s[n-1,k-1];
(d1) s[n,k] := s[n-1,k] k + s[n-1,k-1]

(c2) for n:1 through 50 do (s[n,1]: 1);
(d2) done

(c3) for n:1 through 50 do (s[n,n]: 1);
(d3) done

(c5) t[n,k] := s[n-2,n-k]/factorial(k);
(d5) t[n,k] := s[n-2,n-k]/k!

{Technical Remark: One should note that the t[n,k] differ from the values T(n,k) of (3.3) only by a factor of n!, a factor not impacting the location of roots.}

(c6) p[n](x) := sum(x^j * t[n,j], j, 2, n-1);
(d6) p[n](x) := sum(x^j t[n,j], j, 2, n-1)

(c7) p[5](x);
(d7) x^4/24 + x^3/2 + x^2/2

{Technical Remark: This differs from the probability generating function for L_n by a factor of n!/n^{n-2}, which for n = 5 is a reassuring 24/25.}

(c8) allroots(%);
(d8) [x = 0.0, x = 0.0, x = -1.101020514433644, x = -10.89897948556636]

(c9) p[8](x);
(d9) x^7/5040 + 31x^6/720 + 3x^5/4 + 65x^4/24 + 5x^3/2 + x^2/2

Allroots again can do its job and find that the roots are, to more places than we care to record, {0, 0, -0.27, -0.99, -3.23, -14.20, -198.28}. In the same way, we find for n = 15 that the roots of p[15](x) are {0, 0, -0.065, ..., -56488.80}, and again we find that all of the roots are indeed real.

Before leaving this example, or embarking upon any summary observations about symbolic computation, some points should be made concerning zero locations and their role in Gibbs models. First, we should note that a proof of the fact that (3.3) has only real roots is given in Steele (1987), where a key lemma is provided by a result due to Pólya and Schur (1914) which exhibits a polynomial transformation which preserves the property of having only real roots. The interplay between Gibbs models and zero location is amusing in the combinatorial context, but there are also far more serious contributions with a similar flavor. Kac (1974) tells a charming story of how a result on zeros due to Pólya (1926) was instrumental in the work of C.N. Yang and T.D. Lee in the theory of phase transition of a lattice gas.

4. MACSYMA in Applications

The use of MACSYMA reported here has several properties which seem typical of cases where MACSYMA proves to be of real service in mathematical research. First, the assistance provided by MACSYMA came in the form of encouragement rather than proof. Second, MACSYMA became engaged with the problem through the exploration of a significant example, rather than by frontal assault on the general problem. Third, MACSYMA provided a convenient interactive environment for performing a calculation which could have been performed with far humbler tools.

There is no reason to apologize for any of these properties, and in fact, there is some power in embracing these features as the honest path of MACSYMA application. The problem considered here was amenable to a very straightforward application of MACSYMA as a tool of exploratory mathematics. Still, even in modest exploration, MACSYMA sometimes will buckle under the weight of a seemingly reasonable request. Fortunately, calculations in MACSYMA can be guided toward a problem along many different paths, and sometimes such guidance makes a big difference. For a study of such a case and its happy resolution, one should read the discussion provided in Campbell and Gardin (1982) concerning some symbolic determinants.

This report has emphasized the easy uses of MACSYMA, but MACSYMA can indeed support extensive development. The possibilities of such development are well brought out by Watanabe (1984), which reports on the successes and failures of a 1400 line MACSYMA ODE solver as applied to the 542 equations solved in Kamke (1959).

In addition to personal exploration, the best place to learn the capabilities of MACSYMA is Rand's Computer Algebra in Applied Mathematics: An Introduction to MACSYMA. The examples in that volume guide one through a useful exploration of MACSYMA's facilities, but its principal benefit is that the focus is kept on the use of MACSYMA in scientific work, in contrast to professional involvement with symbolic computation. There are numerous surveys of the application of MACSYMA in science and engineering, and one can get a good feeling for the scope of MACSYMA applications from Engeler and Mäder (1985), Fitch (1979), and Pavelle (1985).

Finally, some comment should be made which acknowledges that MACSYMA is but one of many symbolic computational environments. Loos (1982) mentions that there are at least 60 computer algebra systems. Besides MACSYMA, the best known are probably muMATH, REDUCE, and SCRATCHPAD, and each of these has its

239

i

i

i

1

own virtues. One possibly historic development in symbolic computation took place this month with the introduction of the HewlettPackard 28C, the first handheld calculator to do symbolic algebra and symbolic calculus. If history serves as a reliable guide,symbolic algebra will soon be as available as there are people with a desire to participate in its benefits. t Research supported in part by National Science Foundation Grant #DMS8414069.
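Readers without access to MACSYMA can replay the session numerically. The following is a minimal sketch in Python (not part of the original report), assuming, as in the definition at (c5), that s[n,k] denotes a Stirling number of the second kind; it rebuilds the coefficients t[n,k] = s[n-2,n-k]/k!, forms p_n(x), and locates its roots much as allroots does above.

```python
from math import factorial

import numpy as np

def stirling2(n, k):
    """Stirling number of the second kind S(n, k), via the standard
    recurrence S(n, k) = k*S(n-1, k) + S(n-1, k-1)."""
    if n == 0:
        return 1 if k == 0 else 0
    if k == 0:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

def p_coeffs(n):
    """Coefficients of p_n(x) = sum_{j=2}^{n-1} t[n,j] x^j with
    t[n,j] = S(n-2, n-j)/j!, listed highest power first, which is
    the convention numpy.roots expects."""
    t = {j: stirling2(n - 2, n - j) / factorial(j) for j in range(2, n)}
    return [t[j] for j in range(n - 1, 1, -1)] + [0.0, 0.0]

# n = 5 reproduces (d7): x^4/24 + x^3/2 + x^2/2 ...
assert p_coeffs(5) == [1 / 24, 1 / 2, 1 / 2, 0.0, 0.0]

# ... and its roots reproduce (d8): 0, 0, and -6 +/- 2*sqrt(6).
roots = sorted(np.roots(p_coeffs(5)).real)
print(roots)
```

As a check on the second Technical Remark, sum(p_coeffs(5)) = p_5(1) = 25/24, so multiplying by 5!/5^3 = 24/25 does recover a generating function that equals 1 at x = 1; p_coeffs(8) likewise matches (d9).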

References

Campbell, J.A. and Gardin, F. (1982). "Transformation of an intractable problem into a tractable problem: evaluation of a determinant in general variables," in Computer Algebra: EUROCAM '82 (G. Goos and J. Hartmanis, eds.), Lecture Notes in Computer Science 144, Springer-Verlag, New York.

Engeler, E. and Mäder, R. (1985). "Scientific computation: the integration of symbolic, numeric and graphic computation," in EUROCAL '85 (G. Goos and J. Hartmanis, eds.), Lecture Notes in Computer Science 203, Springer-Verlag, New York.

Fitch, J.P. (1979). "The application of symbolic algebra to physics: a case of creeping flow," in European Symposium on Symbolic and Algebraic Computation (G. Goos and J. Hartmanis, eds.), Lecture Notes in Computer Science 72, Springer-Verlag, New York.

Harper, L.H. (1967). "Stirling behavior is asymptotically normal," Ann. Math. Statist., 38, 410-414.

Kac, M. (1974). "Comments on 'Bemerkung über die Integraldarstellung der Riemannschen ζ-Funktion,'" pp. 424-426 in George Pólya: Collected Papers Vol. II, Location of Zeros (R.P. Boas, ed.), MIT Press, Cambridge, Massachusetts.

Kamke, E. (1959). Differentialgleichungen: Lösungsmethoden und Lösungen, Chelsea, New York.

Loos, R. (1982). "Introduction," in Computer Algebra: Symbolic and Algebraic Computation (B. Buchberger, G.E. Collins, and R. Loos, eds.), Computing Supplementum 4, Springer-Verlag.

Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., and Teller, E. (1953). "Equations of state calculations by fast computing machines," J. Chem. Physics, 21, 1087-1092.

Pavelle, R. (1985). "MACSYMA: capabilities and applications to problems in engineering and the sciences," in EUROCAL '85 (G. Goos and J. Hartmanis, eds.), Lecture Notes in Computer Science 203, Springer-Verlag, New York.

Pólya, G. and Szegő, G. (1976). Problems and Theorems in Analysis II, Springer-Verlag, New York, p. 45.

Pólya, G. (1926). "Bemerkung über die Integraldarstellung der Riemannschen ζ-Funktion," Acta Math., 48, 305-317 (also George Pólya: Collected Papers Vol. II, Location of Zeros, pp. 243-255, R.P. Boas, ed., MIT Press, Cambridge, Massachusetts).

Pólya, G. and Schur, I. (1914). "Über zwei Arten von Faktorenfolgen in der Theorie der algebraischen Gleichungen," J. Reine Angew. Math., 144, 89-113 (also George Pólya: Collected Papers Vol. II, Location of Zeros, pp. 100-124, R.P. Boas, ed., MIT Press, Cambridge, Massachusetts).

Prüfer, H. (1918). "Neuer Beweis eines Satzes über Permutationen," Archiv für Mathematik und Physik, 27, 142-144.

Rand, R.H. (1987). "Computer algebra applications using MACSYMA," in this volume.

Rand, R.H. (1984). Computer Algebra in Applied Mathematics: An Introduction to MACSYMA, Pitman Advanced Publishing Program, Boston.

Rényi, A. (1959). "Some remarks on the theory of trees," MTA Mat. Kut. Int. Közl., 4, 73-85.

Steele, J.M. (1987). "Gibbs measures on combinatorial objects and the central limit theorem for an exponential family of random trees," Probability in the Engineering and Information Sciences, 1, 47-59.

Stoutemyer, D.R. (1985). "A preview of the next IBM-PC version of muMATH," in EUROCAL '85 (G. Goos and J. Hartmanis, eds.), Lecture Notes in Computer Science 203, Springer-Verlag, New York.

Watanabe, S. (1984). "An experiment toward a general quadrature for second order linear ordinary differential equations by symbolic computation," in EUROSAM '84 (G. Goos and J. Hartmanis, eds.), Lecture Notes in Computer Science 174, Springer-Verlag, New York.