Please see PDF version

AN APPLICATION OF SYMBOLIC COMPUTATION TO A GIBBS MEASURE MODEL


J. Michael Steele, Princeton University

ABSTRACT

Gibbs models provide a flexible alternative intuition behind (1. 1), one should note that as r the
to traditional methods for specifying statistical measure g, tends to the uniform distribution on S. This
models. Because of the availability of cheap limit relation captures the notion that at high temperature
computation, the Gibbs alternative is now sub any state of S is as likely as any other state. Conversely, as
stantially competitive with traditional methods in r+ 0, the probability measure g, becomes increasingly
contexts where one can be satisfied with system concentrated on. the set {s: f (s) = minf (s')}, the interpre
simulations, but Gibbs models are at some disad
vantage in theoretical contexts because their tation of which is that at low temperature a few especially
analytical properties are more difficult to investi low energy states become extremely likely. '
gate than those of models built from independent Now, these stories from physics are all well and good
variables. but how do they influence probability models which do not
The first purpose of this talk is to present a have direct physical interpretation? The main point of
specific Gibbs model concerning random trees Steelle,(1987) is that combinatorial problems can often be
and show how MACSYMA proved of value in associated with an intuiti ve Gibbs model with a direct com
examining its theoretical properties. A second binatorial interpretation. In fact, the Gibbs paradigm does
purpose is to initiate a broader discussion of the not need much pushing to be seen to be extremely useful in
role of MACSYMA in statistical (and other many situations which are quite removed from the usual
scientific) research. problems of thermodynamics.
While the preceding remarks seem necessary to
motivate our specific problem, the main purpose of the
Introduction to Gibbs Models present discussion is to show how MACSYMA was useful
in the theoretical exploration of a simple Gibbs model from
Most traditional models in statistics are based on combinatorics. A second purpose is to review for a statisti
functions of independent random variables, or more gen cal audience some of the literature on the use of symbolic
erally, on variables whose joint distribution is specified by computation in applied science.
means of a covariance structure. Although such models
serve us well in many contexts, there are numerous prob
lems where physical or other phenomena exhibit a qualita
tive dependency which is not easily expressed by such 2. Trees of PrIlfer and R6nyi
techniques. It is in such cases that the flexibility provided The specific Gibbs model which concerns us has its
by the method of Gibbs models can be especially useful. origin in the hunt for a tractable parametric family of ran
To illustrate the idea of a Gibbs measure in as con dom trees which can serve to model those trees which one
crete a way as possible we will concentrate on measures on finds in problems of computer science. To get some per
finite sets S. In the description of Gibbs models it is useful spective we will recall some results concerning the uniform
to employ some evocative language. In particular, the set distribution on random trees.
S will be thought of as the set of possible states of a One of the earliest results in the enumerative theory of
physical system, and several notions of thermodynamics combinatorics is the remarkable theorem of Cayley which
will guide our notation. The most essential ingredient of says there are exactly n`2 labeled trees on n vertices.
the method is the function f : S+ E which we will call a When R6nyi (1959) took up the modeling of random trees,
potential function. The function f has several interpreta it was natural to consider the uniform model where any one
tions as either a measure of complexity or of likelihood, but of the Cayley's n`2 labeled trees would have the same
the bottom line is that we use f to define a measure on S probability of being chosen. The object of main interest to
by the formula R6nyi was the number of leaves of a random tree, and to
ef (sys obtain a handle on this problem, R6nyi relied on an elegant
MS) =  . (1. 1) code for labeled trees due to PrUfer (1918).
Here, of course, the denominator Z(r) is chosen in just the Pr5fer's simple code is directly applicable to many
way needed to make S have total mass one, i.e. g,(S) = 1 questions concerning bus. One builds a onetoone
and Z (t) c The choice of t as the indexing correspondence between a labeled tree on n vertices
P 1, P 21. P. and an (n  2)tuple with elements from
{ 1, 2, via the following recipe:
parameter reflects the historical origins of g, where t car
ries the interpretation of temperature. From the perspective First we find the leaf Pi with the highest value of
of statistics we should note that the change of variable the index i. We then delete Pi from the tree and
0 = lit brings (1.1) into the form of an exponential family record the index j of the unique vertex P, to
with natural parameter 0. To benefit from the physical which Pi was adjacent. Ibis process is repeated

237

1


on the pruned tree until n  2 vertices have been This model has a number of benefits. First, it is an
deleted (and two vertices are left remaining). exponential model so many of the nicest results of
Such sequence of recorded adjacency indices is now mathematical statistics apply to the estimation of 0. Also,
called the PrUer code for the tree, and one can show it is a Gibbs model so the method of Metropolis et.al. can
without great difficulty that the coding provides a bona fide be used to generate random trees which satisfy the given
onetoone correspondence. As a consequence of PrUer's law. Finally, as shown in Steele (1987), the random
coding, the formula of Cayley for the number of trees on n variables L. are asymptotically normal.
vertices becomes obvious; our code has n  2 places and The purpose of the next section is to illustrate how
each place can be held by n values, so n`2 is the cardinal MACS~ entered into the exploration of this model and
ity of the set of labeled trees. how it helped lead to the resolution of the asymptotic nor
Although it cuts a good story short, it suffices to say mality of L. under the Gibbs model.
that R6nyi used the Prilfer code to show that T(n,k), the
number of labeled trees with k leaves and n vertices, must 3. Harper's Method and Computational Empiricism
satisfy the polynomial identity The central limit theorem for L. was obtained in
Steele (1987) through the application of a technique which
T(n,k) was introduced by Harper (1967) for the study of Stirling
Xn_2 (2.1) numbers. Harper's basic idea is so simple it can be dis
cussed in a beginning probability course. Still, the method
k2 [nn is quite powerful, and, even when it fails, it often brings a
k new analytical insight into the problem.

This elegant expression permits one to make many So, what is this nifty method? Suppose you wish to
deductions concerning the number of leaves of a tree study a random variable Y with probability generating
chosen at random from the set of all n ` ~2 labeled trees. In function h (x), and you consider the equation:
particular, if we let L. denote the number of leaves of a h (x) = po +p x +p 2;x 2 +. . . +PJ', = 0. 3.1)
random tree then one can easily obtain an asymptotic for Now, suppose for some reason you know that (3. 1) has
mula for the expected 1 value for the number L,: only real roots, ri. 1 5 i 5 n. Since the pi are non
EL.  n le. (2.2) negative, the ri are all nonpositive, and h can then be fac
With more effort one can also obtain the variance relation: tored into the form
2
L~ ~ n (e  2)1e (2.3) h (x) = 5x + r,) 1 (1 + ri) = rnl(pi. + qi), (3.2)
i1 i.l

Finally, R6nyi obtained the central limit theorem for Now, with this before us virtually all probability questions
L,, by means of substituting X = 1 it l~n into his key about Y are easily resolved, since (3.2) says that Y is equal
identity (2.1). Several technical problems needed to be to the sum of n independent Bernoulli random variables.
resolved by R6nyi as he moved from (2. 1) to the asymptot
ilp. Returning to the Gibbs model of random trees, Steele
ics of Ee ' but his analytical expertise prevailed. (1987) showed that only a little manipulating is required to
Now the interesting point comes upon us. R6nyi's establish that (1) R6nyi's CLT (2) a BerryEssen
detailed analysis pertain to a model which is almost cer refinement of R6nyi's CLT and (3) the CLT for random
tainly too limited to apply to any of the empirical situations trees under the general Gibbs model will all follow easily
of interest! For example, suppose that for your set of once one proves that the equation
observed trees all of the trees have about n15 leaves, n1
instead of nle leaves. The uniform model would be inap Y_T(k,n)xk = 0 (3.3)
plicable, and it seems to offer no place to turn. The ques k2
tion becomes, how can we obtain a parametric family of has only real roots.
trees with a natural parameter closely related to the number
ofleaves? The handles we have on (3.3) are that T(n,k) can be
The answer is simple using a Gibbs model. We just expressed in terms of Stirling numbers of the second kind
let S be the set of all labeled tree on n vertices, and we by
define f (s ) to be the number of leaves of s E S. If we set T(n,k) = s(n2,nk)n!lk! (3.4)
1
0 = t_ ' we can write (1.1) in the form and the Stirling numbers satisfy the classical recursion
ge(s) = c of($)  0(0) (2.4)

where 0(0) = log(ye  'f(")) and 0'(0) = E,,L, is the s(n,k) = ks(nl,k)+s(nl,k1).
expected number of a random tree chosen according to the
probability measure (2.4). Once we learn enough about the geometry of zeros of
By varying our choice of 0 can make E.L. range polynomials, it is not hard to show (3.3) has only real roots;
through the whole interval 12,n11, and thus we have a but before even starting to learn what needs to be known, it
family of measures on trees which is essentially indexed by seems to be almost a psychological requirement to find out
the expected number of leaves. whether Harper's method really has a chance. After all, it

238


is quite possible that (3.3) does have nontrivial complex polynomial transformation which preserves the property of
roots. having only real 4oots. The interplay between Gibbs
models and zero location are amusing in the combinatorial
Fortunately, one can easily investigate (3.3) using
MACS~ and the following interactive code should context but there are also far more serious contributions
illustrate the point. (The clines are entered command and with a similar flavor. Kac (1974) tells a charming story
the dlines are MACSYMA's replies. The other notations how a result on zeros due to P61ya (1926) was instrumental
should either he intuitive, or at least easily understood after in the work of C.N. Yang and T.D. Lee in the theory of
reading the companion article by Rand (1987).) phase transition of a lattice gas.
4. MACSYMA in Applications
(cl) sfn,k]:=s[nl,kj*k +s[n1,k11; The use of MACSYMA reported here has several pro
(dl) s,,1,kk + S.1,kl; perties which seem typical of cases where MACSYMA
proves to be of real service in mathematical research. First,
(c2) forn:l through 50 do (s[n, 11:1); the assistance provided by MACSYMA came in the form
(d2) done of encouragement rather than proof. Second, MACSYMA
became engaged with the problem through the exploration
(c3) for n: 1 through 50 do (s [n. n of a significant example, rather than by frontal assault on
the general problem. Third, MACSYMA provided a con
(d3) done venient interactive environment for performing a calcula
tion which could have been performed with far humbler
(0) t [n, k 1 := s [n 2, n k factorial (k); tools.
Sn2Ak There is no reason to apologize for any of these pro
(d5) t.,k k 1 perties, and in fact, there is some power in embracing these
{Technical* Remark: One should note that t,,,k differ features as the honest path of MACSYMA application.
from the values T(n,k) of (3.3) only by a factor of n!, The problem considered here was amenable to a very
a factor not impacting the location of roots.} straight forward application of MACSYMA as a tool of
exploratory mathematics. Still, even in modest explora
(c6) p[nl(x):=sum(x^* tfn,jl,j,2,n1); don, MACSYMA sometimes will buckle under the weight
of a seemingly reasonable request. Fortunately, calcula
(d6) p.(x):= sum(xjt.,j,j, 2, n 1); dons in MACSYMA can be guided toward a problem along
(0) p 151(x); many different paths and sometimes such guidance makes
a big difference. For a study of such a case and its happy
X4 X 3 X 2 resolution, one should read the discussion provided in
24 2 2 Campbell and Gardin (1982) concerning some symbolic
{Technical Remark: This differs from the probability deterndnants.
generating function for L. by a factor of n!In'2, This report has emphasized the easy uses of
which for n  5 is a reassuring 24125.} MACSYMA, but MACSYMA can indeed support exten
sive development. The possibilities of such development
(c8) all roots are well brought out by Watanabe (1984) which reports on
(d8) (x = 0.0, x 0.0, x = 1.101020514433644, the successes and failures of a 1400 fine MACSYMA ODE
x = 10.89897948556636); solver as applied to the 542 equations solved in Karnke
(1959).
(C9) p 181(x); In addition to personal exploration, the best place to
learn the. capabilities of MACSYMA is Rand's Computer
7 3 1X6 3x 5 65x 3 5X3 X 2
(d9) L +  +  +  +  Algebra in Applied Mathematics: An Introduction to
5040 720 4 24 2 2 MACSYMA. The examples in that volume guide one
through a useful exploration of MACSYMA's facilities,
All roots again can do its job and find that the roots but its principle benefit is that the focus is kept on the use
are, to more places than we care to record, of MACSYMA in scientific work, in contrast to profes
{0, 0, 0.27, 0.99, 3.23, 14.20,198.28}. In the same sional involvement with symbolic computation. There are
way, we find for n = 15 the roots of p [151(x) are numerous surveys of the application of MACSYMA in sci
{0, 0, 0.065 56488.80}, and again we find that all ence and engineering, and one can get a good feeling for
of the roots am indeed real. the scope of MACSYNIA applications from Engeler and
Before leaving this example, or embarking upon any Mlider (1985), Fitch (1979), and Pavelle (1985).
summary observations about symbolic computation, some Finally, some comment should be made which ack
points should be made concerning zero locations and their nowledges that MACSYMA is but one of many symbolic
role in Gibbs models. First, we should note that proof of computational environments. Loos (1982) mentions that
the fact that (3.3) has only real roots is given in Steele there are at least 60 computer algebra systems. Besides
(1987), where a key lemma is provided by a result due to MACSYMA, the best known are probably muMATH,
P61ya and Schur (1914) which exhibits a polynomial to REDUCE, and SCRATCHPAD, and each of these has its

239


own virtues. One possibly historic development in sym
bolic computation took place this month with the introduction of the HewlettPackard 28C, the first handheld calculator to do symbolic algebra and symbolic calculus. If history serves as a reliable guide,. symbolic algebra will soon be as available as there are people with a desire to participate in its benefits. t Research supported in part by National Science Foundation Grant #DMS8414069.

Reterences

Campbell, LA. and Gardin, F. (1982). "Transformation of P61ya, G. and Szeg6, G. (1976). Problems and Theorems in
an intractable problem into a tractable problem: Analysis 11, SpringerVerlag, New York, p. 45.
evaluation of a determinant in general variables,"
Computer Algebra: EUROCAM '82 (G. Coos and J. P61ya, G. (1926). "Bemerkung 11ber die Integraldarstellung
Hartmanis, eds.), Lecture Notes in Computer Science der Riemannschen gfunction," Acta Math., 48, 305
144, Springer Ve rl ag, New York. 317 (also George Pblya: Collected Papers Vol 11
Location of Zeros, pp. 243255, R.P. Boas, ed., MIT
Engeler, E. and M5der, R. (1985). "Scientific computation: Press, Cambridge Massachusetts).
the integration of symbolic, numeric and graphic com
putation," in EUROCAL '85 (G. Goos and J. Hart P61ya, G. and Schur, 1. (1914). "Uber zwei Arten von Fak
manis, eds.), Lecture Notes in Computer Science, 203, torenfolgen in der Theorie der algebraischen
SpringerVerlag, New York. Gleichungen," J. Reine Angew. Math, 144, 89113
(also George Pblya: Collected Papers Vol 11 Location
Fitch, LP. (1979). "The application of symbolic algebra to of Zeros, 100124, R.P. Boas, ed., MIT Press, Cam
physics  a case of creeping flow," in European Sym bridge Massachusetts).
posium on Symbolics and Algebraic Computation (G.
Coos and J. Hartmanis, eds.), Lecture Notes in Com PrUer, A. (1918). "Neuer Beweis eines Satzes uber Perinu
puter Science 72, SpringerVerlag, New York. tationen," Archiv fur Mathem46k und Physik, 27, 142
144.
Harper, L.H. (1967). "Stirling behavior is asymptotically
normal, Ann. Math. Statis., 38, 410414. Rand, R.H. (1987). "Computer algebra applications using
MACSYMA," in this volume.
Kac, M. (1974). "Comments on 'Bemerkung l~ber die
Integraldarstellung der Riemannschen gfunction,"' pp. Rand, R.H. (1984). Computer Algebra in Applied
424426 (also George Pblya: Collected Papers Vol 11 Mathematics: An Introduction to MACSYMA, Pitman
Location of Zeros. pp. 243255, R.P. Boas, ed., MIT Advanced Publishing Program, Boston.
Press. Cambridge Massachusetts).
R6nyi, A. (1959). "Some remarks on the theory of trees,"
Karnke, E. (1959). Differential Gleichungen MTA Mat. Kut. Int. Ko.1., 4, 7385.
Usungsmethoden and Usungen, Chelsea, New York.
Steele, LM. (1987). "Gibbs measures on combinatorial
Loos, R. (1982). Introduction to Computer Algebra Sym objects and the central limit theorem for an exponen
bolic and Algebraic Computation, (eds. Buchberger, tial family of random trees, Probability in the
B., Collins, G.E., and Loos, R.), Computing Sup Engineering and Information Sciences, 1, 4759.
plementum 4, SpringerVerlag. Stoutemayer, DK (1985). "A preview of the next IBMPC
Metropolis, N., Rosenbluth, A.W., Roseribluth, M.N., version of muMATW' in EUROCAL'85. G. Goos and
Teller, A.H., and Teller. E. (1953). "Equations of state J. Hartmanis, eds., Lecture Notes in Computer Science
calculation by fast computing machines, J. Chem. Phy 203, SpringerVerlag, New York.
sics, 21, 10871092. Wa . tanabe, S. (1984). "An experiment toward a general
Pavelle, R. (1985). "MACSYMA capabilities and applica quadrature for second order linear ordinary differential
tions to problems in engineering and the sciences," in equations by symbolic computation," EUROSAM '84,
EUROCAL '85, G. Coos and 1. Hartmanis, eds., Lee G. Coos and 1. Hartmanis, eds., Lecture Notes in Com
ture Notes in Computer Science 203, SpringerVerlag, puter Science 174, SpringerVeriag, New York.
New York.

240