Please see PDF version

Probability in the Engineering and lnj'ormational Sciences, 1, 1987, 4759. Printed in the U.S.A.


Program in Engineering Statistics
Princeton University
Princeton, New Jersey

A model for random trees is given which provides an embedding of the uniform model into an exponential family whose natural parameter is the expected number of leaves. The model is proved to be analytically and computationally tractable. In particular, a central limit theorem (CLT) for the number of leaves of a random tree is given which extends and sharpens Mnyi's CLT for the uniform model. The method used is general and is shown to provide tractable exponential families for a variety of combinatorial objects.


The theory of random combinatorial objects has shown a considerable devotion to the uniform distribution. This is a natural development, and a great number of interesting and useful results have been produced. Still, for the purpose of modeling and for technical applications, it seems valuable to be able to go beyond uniformity and the elementary extensions amounting to conditional uniformity.

Research supported in part by National Science Foundation Grant No. DMS8414069.

@ 1987 Cambridge University Press 02699648/87 $5.00 + .00 47


48 J. M. Steele

A simple technique given here will show how one can sometimes construct a combinatorially rich family of nonuniform probability distributions on a class of combinatorial objects. More pointedly, one can still retain many of the attractive features of the uniform distribution. Those features include (1) a sufficient analytical tractability to permit limit results like the central limit theorem, (2) the possibility of economical computer simulation to obtain realizations from the distribution, (3) the presence of one or more parameters to permit tuning of the family distributions to the circumstances of modeling applications, and (4) an intuitive meaning of the underlying parameters of the distributions.
To illustrate the technique as concretely as possible, attention will be focused here on the construction of a family of random trees. It will be seen that the constructed family is in fact an exponential family in the usual sense of parametric families of distributions. Moreover, the sufficient statistics for the natural parameters have basic combinatorial interpretations. This exponential family of trees has the benefit of permitting an easy asymptotic analysis, and it also permits two general techniques for simulations of random samples.
The study of random trees was initiated in R~nyi [20], where a model of random trees was built on the classical result of enumerative combinatorics that there are exactly n"' labeled trees with n vertices, (Cayley [3]).'
The basic object R6nyi chose to study was the number of leaves (i.e., vertices of degree one) of a tree chosen at random from the set of all n`2 labeled trees. If we let L, denote the number of leaves of such a random tree, then R~nyi's main results can be stated as follows:

E(LJ ~ nle as n  co

var(LJ ~ (e  2)nle 2 as n ). oo (1.2)

and, principally, the normalized number of leaves

Ln  nle (1.3)
1,~n 2)
e (e

is asymptotically normal.
In the second section we establish some structural prerequisites to the study of random trees; and, in particular, we introduce the coding structure needed to construct random trees under the uniform model. The third section introduces an exponential family of random trees where the number of leaves is the sufficient statistic for the family's natural parameter. The measureproducing process given there starts with the notion of a general Gibbs distribution. It then quickly specializes to the construction of a Gibbs distribution on the set of labeled trees where there associated energy function for the Gibbs distribution is the number of leaves of the tree.
Sections four, five, and six pursue the probabilistic analysis of this new family of measures on labeled trees. The main analytic technique depends on


establishing a representation of a random variable as a sum of Bernoulli random variables by showing that the probability generating function has all real roots. This powerful idea was introduced in the work of Harper [9] on the normal law for Stirling numbers of the second kind. One benefit of Harper's method is that R~nyi's central limit theorem can be sharpened to give a BerryEssen type result on the rate of convergence to normality. The Bernoulli representation also lets one easily deduce some subtle qualitative features of random trees like the unimodularity of the probability mass function of L,
Conceptually, the best thing about Harper's method is that the number of leaves of a tree chosen from the Gibbs' family is shown to be representable as a sum of independent Bernoulli random variables. The issue of random leaves is brought down to the problem of number of heads of (properly chosen) biased coins. The bias turns out to depend on the roots of certain simple polynomials. Curiously, there is still no combinatorial interpretation of the Bernoulli summands.
In section seven we see how to generate a random tree from our exponential family. Two methods are given and contrasted. One of these is based on the algorithms of Metropolis et al., [141 for generating an observation from a Gibbs distribution. This method has come recently into wide circulation because of the socalled method of simulated annealing in combinatorial optimization, (see e.g., Kilpatrick et al., [ 11 ]). A second method is given for generating random trees from the Gibbs measure which is based on more special features of our particular family of trees.
The final section discusses other combinatorial problems to which the program of Gibbs' measures and Harper's method can be applied. In particular, results are given for the choice of a random set from a set of size n, and to the choice of a random matching from the set of all matchings of a graph.


R~nyi's theory of random trees is based on a method of creating a onetoone correspondence between a labeled tree on n vertices P,, P2_ ..,P,, and an (n  2) tuple with elements from { 1, 2,. . ., n}. This correspondence is due to Prilfer [19] and is obtained as follows:

First we find the leaf Pi with the highest value of the index i. We then delete
Pi from the tree and record the index j of the unique vertex Pj to which Pi was
adjacent. This process is repeated on the pruned tree until n  2 vertices have
been. deleted (and two vertices are left remaining).

The sequence of recorded adjacency indices becomes the Prilfer code for the tree. One consequence of this coding is that Cayley's formula for the number of trees on n vertices is now rendered obvious; our code has length n  2 and each place can be held by n values, so n`2 is the cardinality of the set of labeled trees. For more detail on the Prfifer code as well as various generalizations and applications, one can consult Moon [151, Knuth [12], or Lov;1sz [131.


50 J. M. Steele

With the Prdfer code in hand, it is easy to obtain a formula for the number of trees with k leaves. From the construction of the code, one has the fact that Pi is a leaf if and only if i does not appear in the Prfifer code. The number of ways to choose the k leaves which are not to appear in the code is ("), k
and the number of functions from an (n  2) set onto an (n  k) set is (n  k) ! s(n  2, n  k) where s(n, k) is the Stirling number of the second kind (i.e., the number of ways to partition an n set into k subsets); so, if we let T(n, k) denote the number of labeled trees with n vertices and exactly k leaves, we then obtain, after R~nyi, that

(n2, nk)
T(n, k) = n! s k !

By using the wellknown recurrence relation s(n, k) = s(n  1, k) k + s(n  1, k  1) together with Eq. (2. 1), R~nyi [201 obtained the basic identity on which he based his asymptotic analysis:

n1 T(n, k) (n X j n2
r,   X (2.2)

k=2 (n n k)

One can easily obtain the asymptotic formula for the means, EL, ~ nle, just by letting x = n  1 in Eq. (2.2). Similarly, the variance relation var Ln
n(e  2)1e 2 is obtained by substituting x = n  2.
It is worth noting that there is some subtlety in the formula for var L,. The naive notion that L, behaves like the sum of n (or n  2) independent Bernoulli random variables with mean p = l le is supported by EL, ~ nle, but it would suggest an asymptotic variance of n (1  1 le) (1 le). The true asymptotic variance shows that the variance suggested by such a naive Bernoulli approximation is too big. The surprising fact which will be established shortly is that L, does nevertheless have the same distribution as the sum of n  2 independent Bernoulli variables. This fact is consistent with the asymptotics of EL, and var L,, only because these Bernoulli variables are not identically distributed.
The central limit theorem for L, was obtained by R6nyi by substituting x = 1  UNn into his key identity Eq. (2.2). The asymptotic analysis which then led to Eq. (1.3) was considerably more subtle than that which will be required here to obtain a stronger and more general result.


Before beginning a reanalysis of L,, under the uniform model, it will be useful to obtain two general perspectives on the distributions which can be placed


on the set of all labeled trees. First, we will consider a point of view common in statistical mechanics.
If S is any finite set and f: S R is any real function on S, then we can define a parametric family of measures on S by

e f W0
juo (S) = (3.1)
E e f (00

If S is thought of as the set of possible states of a physical system and if f (s) is considered to be the energy of the state s, then go(s) is called the Gibbs measure associated withf The statistical mechanical interpretation of go(s) is the probability of the state s. The expression 1/0 is interpretable as a measure of temperature: so, in the first place, states of higher energy are less likely than states of lower energy. In the second place, as temperature tends to infinity, the Gibbs measure becomes maximally disordered, i.e., uniform on all states.
When temperature tends to zero, g,3 becomes concentrated on the states s of lowest energy f (s); and, partly for that reason, the Gibbs measure has recently emerged as a tool in combinatorial optimization. The Gibbs measure is of basic importance in statistical mechanics, and for many models it has been studied in remarkable depth.
The formalism of Gibbs' measure has a convenient specialization to the set S of labeled trees on n vertices. In particular, if we let f (s) denote the number of leaves of the tree s e S and let Zffl) denote the denominator of Eq. (3. 1), then we can express Zffi) in terms of T(n, k),

Z(O) = 1: e`T(n,k) (3.2)

In general, the denominator of the Gibbs' measure of Eq. (3. 1) is called the partition function in the classical terminology, and Zffl) is the traditional notation.
One should note from Eq. (3.2) that Z(O) = nn2 ; and, if Pq denotes the probability distribution on random trees given by the Gibbs'rneasure IAO, then

PO(Ln = k) e~Ok T(n, k) (3.3)

If we let On denote the probability generating function for Ln under the uniform model (0 = 0) we have

On (X) = E Xkpo (Ln = k) (3.4)

and there is also a simple relation between Z and 0,

ZW) =On(e0)Z(0) . (3.5)

52 J M, Steele


We can now obtain an approximate representation for L, under both the uniform and Gibbs models. A key point is that whenever Harper's method is successful there is an attendant extension to an appropriate Gibbs model which comes almost for free.
Harper [91 showed that if Y, is the number of subsets in a partition of { 1, 2. . , n} which is chosen at random from the set of all B, = 1: s(n, k) such
partitions, then Y, has the same distribution as the sum of n independent Bernoulli random variables. To recall Harper's method, and to establish some facts which we will use later, we let *,(x) denote the probability generating function of Y, so

(x) = Ex Bn s (n, k )Xk (4.1)

Using a variant of Rolle's theorem, Harper was able to give a rather easy proof that *n(x) has all real roots. Since these roots must be nonpositive by the positivity of the coefficients of *n(x), we see there are real numbers ri ~!: 0 such that

II (X + ri)
(X) i~, = rl (pi x + q) (4.2)
n i=t
J1 (1 + ri)

where pi = (1 + rj)` and qi = ril(l + rj).
This factorization is equivalent to saying that Y, has the same distribution as the sum of n independent Bernoulli random variables. This was Harper's key probabilistic observation. It leads naturally to a central limit theorem for Yn and to a sharper understanding of the qualitative behavior of the Stirling numbers. As an example of the latter, we note that the Bernoulli factorization of Eq. (4.2) implies that the Stirling numbers are unimodal. This is a fact which, though apparent in computed examples, is otherwise taxing to establish.
Because of the explicit formula in Eq. (2. 1) for T(n, k) in terms of Stirling numbers, it is natural to begin an analysis of the probability generating function for L, by using what is known about *,. This requires only a little index shifting. We first note that by the defining relation of Eq. (4. 1), we have

Bn2*n2(X) L s(n  2, j)xj

and, by obvious substitution and multiplication,

F, s(n  2, k)yni = ynB,, 2*n2(')
j=1 y


By the fact that *,2 has only real roots, we see that
1: s(n  2, j)x'j = 0 (4.3)

must also have all real roots.
A remarkable fact about polynomials with real coefficients (c.f. P61ya and Szeg6 [ 181, p. 45) is that when an equation ao + a 1 x + a2 X' + a, x" = 0 has only real roots, then the equation

ao + a, + a2 + ... +an =0
l! 2! n!

also has only real roots. Applying the result of P61ya and Szegb to Eq. (4.3), we see that
2 Xnj n1 k
E s(n  2, j) E s(n  2, n  k) X (4.4)
j=1 (n j) k=2 k!

must have only real zeros.
This last expression is just a positive multiple of the probability generating function of L. which, using Eq. (2.1), can be written more explicitly than Eq. (3.4) as

OnW k nf (n2, nk) (4.5)
X n 2 l
22ss :sn1 n k.

By comparison with Eq. (4.4) we see the probability generating function 0,(x) has only real roots. This proves that Ln is equal to the sum of N independent Bernoulli random variables. One intriguing aspect of this representation is that there is (so far) no way to establish this representation except by the analytical method of analysis of roots. There is no graph theoretic interpretation of the Bernoulli summands.


The factorization of 0 not only provides a representation of L, under the uniform model, but it also yields a Bernoulli representation for L, under all of the distributions in the Gibbs family. To make good use of the representation, we need to obtain some rudimentary bounds on the mean E,3L. and variance varpL,.
Under the Gibbs measure bt,3(s), where f(s) is the number of leaves of the labeled tree s, the expression Pp (Ln = k) e 0k T(n, k) IZffi) lets us write the probability generating function On (X, 0) EOXL" in terms of the function On(X) = On(X,0) via Eqs. (3.4) and (3.5) as
(X, 0) = r
' Xk ek0T(k,n)IZ(P)


54 J. M. Steele

The fact that Eq. (4.5) has only real roots implies the same for 0,(x,~). If L,o denotes a random variable with probability generating function 0,(x,~), then again we can represent L,,3 as the sum of n independent Bernoulli random variables Xi, 1 :5 i:5 n. If the nonpositive numbers (rj), 1 :5 i:5 n, are the roots of 0, (x, 0), the relation 0, (x, 0) = 0, (xe ", 0) /0, (e ") shows that the roots of 0,(x,0) are just (eori), 1 :5 i:s n. This gives us an easy way to relate moments and other properties of L,o to the results obtained in the uniform case of L,.
Factoring 0, (x, 0) as we did *n (X) in Eq. (4.2), we see that L, p is a sum
of independent Bernoulli's Xi, 1 :5 i < n, where P(Xi = 1) = (1 + eori)',
1 :s i:s n, and are the n nonpositive real roots of 0,(x). We have
therefore that

EL1,0 (1 + el'ri)' (5.1a)

var Ln,fl elri(l + e,'ri) 2 (5.1b)

We can now show var 0Ln diverges at a linear rate since for all  00 <
0 < oo, we have the crude bound

n n
var OL, = E elriO + e.ri)' > e101E M1 + ri)2 (5.2)
i=1 i=1

This inequality is enough to permit us to prove a linear rate of divergence of var,9Ln because of the relation

E M1 + ri)2 = var oL, ~ n(e  2)e~2

which comes from Eq. (1.2).
It is natural to expect that as n tends to infinity we have

E0Ln ~ c(fl)n and var,6L, ~ v(fl)n (5.3)
for some constants c(O) and v(o). These are natural conjectures and progress toward them would seem to begin with generalizing R~nyi's Eq. (2.2). For the present we content ourselves with just having shown the linear divergence of var 3L,, since that is the essential result required for a central limit theorem.


For a sum Sn of Bernoulli random variables, the Lindeberg condition becomes simply the condition that var S, > co. From the linear divergence of var,9L,, we can then conclude that (L,  EpLn) (var pLJ'12 is asymptotically normal.
To do a bit better, we first note that if Xi is a Bernoulli random variable

i i i



then 1Xi  EXil < 1, so EIXi  EXi13 < var Xi. By the BerryEssen theorem, we then have for all n ~t 1,

PO ( (L,  E,3 LJ < t) _ 1 e _X212 dx :5 A e 1 ~31/2 n 1/2 (6.1)
~var  72;

where A is a universal constant (see e.g., Petrov 171, p. 111 or Feller [51, p. 544)
The Bernoulli representation of L, can be applied to give numerous additional inequalities of the central limit and large deviation type. In particular, one can apply the results of Bender [21 or Canfield [41. We are content to stop with the basic bound of Eq. (6. 1) because it is the most likely to be used and because it completes the program of providing an effortless refinement of the original central limit theorem of R6nyi [20].

To generate a tree which satisfies the law

P,3 (L,, = k) = e ,lk T(n,k)1Zffl), 2 5 k:5 n  1 (7.1)

for the number of leaves, we can adapt a general technique of Metropolis et al., [141 for generating an observation from a Gibbs distribution.
If 0 = 0, then the task of generating a tree satisfying Eq. (7. 1) is easy; one can just choose the n  2 coordinate of the Prilfer code by making uniform random selections from { 1, 2,. . . , n}. To expand this beyond ~ = 0, we let S denote the set of Prfifer codes; and, for s E: S, we define f (s) as the number of leaves of the tree associated with s. In terms of the PrUer code, we can equivalently express the defining property off (s) by noting that n  f (s)  2 is the number of distinct integers in the code s.
We make S into a graph by considering an edge to exist between a pair of Prilfer codes s and s' if the codes are equal except in one coordinate. The degree of any vertex of the resulting graph is thus (n  1 ) (n  2) since there are n ~ 2 coordinates which can be changed to n  1 other values.
. We now let a(s,s") = min f 1, e0111, where A =jir(s') f (s). Further, let N(s) denote the set of neighbors of s in the graph S, and define a Markov tran
sition matrix on S by
' a(s,s')1(n  1)(n  2) if s' e N(s)
P(S,S,) 1  E P(S,S, if S = S' (7.2)
0 otherwise

It is easy to verify that the Markov chain on S determined by {p(s,s')} has stationary distribution gp (s) = e IZ(O); one just cheeks the reversibility condition

g,3(S)P(S,S,) go(s,)P(S"S) 

i i

56 J M. Steele

The suggestion of Metropolis et al., [141 is that one generates an s E S in accordance with the probability law go by proceeding as follows:

Begin with an arbitrary s, and make N random transitions under {p(s,s')}.
If N is sufficiently large, then the ergodic property of Markov chains would
say that the terminal state s is reached with probability which is close to mp(s).

The charm of this method is that one never has to calculate Z(O); and, in our case, one does not even have to make a random choice from a set larger than (n  1) (n  2). The main theoretical drawback to the method is that it is hard to be confident of the proper choice of N. A minimal value should at least be a substantial multiple of the diameter of the graph S. For the current problem, we suggest an N at least as large as n log n, but one might more properly choose N = n2. A better understanding of the principles which might more authoritatively guide the choice of N can be found in AIdous [11.
A fortunate property of our problem of generating random trees from the Gibbs distribution is that there is an alternative method which is fast and exact. A benefit of this method is that it may provide a tool for pursuing a better empirical understanding of the Metropolis method.
Although g,3(s) = e,1f(')IZ(O) was developed in the general framework of Gibbs' measures, the Gibbs family of trees is really much simpler than the general case where Z(O) is typically intractable. The fact that T(n, k) = s(n  2, n  k)n! lk! and the classical recursion s(n, k) = ks(n  1, k) + s(n  1, k  1) for Stirling numbers of the second kind gives us an easy recursion for T(n, k), namely,

kT(n, k)ln = kT(n  1, k) + (n  k)T(n  1, k  1) (7.3)

and this relation makes the calculation of Z(O) easy.
To generate a Prfifer code satisfying the probability law tip, we first choose a random element k of 12, 3,. . . , n ~ 11 according to discrete distribution {e~kT(n,k)IZ(~)}nl. We then choose a random ksubset A c {1, k2
2,. . . , n} from the uniform distribution on ksets (see e.g., Nijenhuis and Wilf [161). Finally, for each of the n  2 coordinates of the Prilfer code s, we choose an. element at random from the uniform distribution on {2, 3,. . ., n  11 A.


The Gibbs model method of using a potential function to produce an exponential family of probability measures is a rich and powerful tool. It brings physical intuition and statistical technique into focus on problems which might otherwise seen amenable only to methods of enumeration. The Gibbs approach is one which we can expect to be applied in many combinatorial contexts. It seems particularly important in the construction of discrete models where an underlying combinatorial structure requires a probability law which cannot be easily constructed using independent random variables.

i 1


The most ready targets for the full program which was applied here for random trees are those combinatorial problems for which it is already known that an associated generating function has only real roots.
The best known candidate, much better known than Harper's Stirling numbers, is the problem of choosing an element at random from the power set of a finite set. There, of course, the Gibbs procedure just boils down to the extension of the binomial distribution with p = ' to the case of general p.
Let us consider in more detail the first nontrivial case, the Stirling numbers of the second kind. We recall that s(n, k) is the number of ways of partitioning an n set into k nonempty subsets. We define a potential function f on the set S of all partitions of ~ 1, 2,. . . , n} by letting f (s) be the number of elements of the partitions. We note that the Gibbs measure g,3(s) then yields a probability law for T,, the number of classes in a partition s chosen in accordance with g,3. The probability generating function for T, is

OnAX) = EXT" = r, e  gk X kS (n, k) IZ(O)

where the partition function Z(O) is determined by the relation (bn,,6(1) = I. Since Harper [9] showed 0,,,,(x) has only real roots, the whole program which was applied to Ln, the number of leaves of a random tree, can be applied to Tn. As a result, we have an exponential family of combinatorial objects which have a central limit theory and which are amenable to practical simulation.
The most farreaching class of combinatorial objects to which the Gibbs program is immediately applicable is probably that of matching in a graph. Briefly, a matching of G is a set of edges, no pair of which share a vertex. A kmatching is a matching with k edges, and the number of kmatchings of G is denoted by p(G, k) with p(G,O) = 1 by convention. The cardinality of the largest matching in G is denoted by P(G), and the matching polynomial of G is

g(G,x) r, (1)'p(G,k)x n2k

An amazing number of classical numbers and polynomials are related to p(G,k) and g(G,x) for special choices of G. For example, there is a G such that p (G, k) becomes the Stirling number s(n, n  k). The Hermite polynomial of degree n is the matching polynomial of the complete graph K, and even the Laguerre and Chebyshev polynomials are matching polynomials (see Godsil and Gutman [8], Godsil [61, and Godsil [71).
Heilmann and Lieb [10] established the remarkable result that for any G the polynomial g(G,x) has only real roots, and hence the matching generating function of G

~(G,x) r, p(G,k)Xk (8.2)

i i

58 J. M. Steele

must also have only real roots. Godsil [61 built on the result of Heilmann and Licb and established that if Z, is the cardinality of a matching chosen uniformly from the set of all matchings of G, then Z, is asymptotically normal, provided only that (GJn'=1 is a sequence of graphs, each regular with degree d, and the number of vertices of G, increases with n.
The Gibbs family which was developed here for labeled trees can be developed just as easily for the number of matchings in a graph. Since this construction brings tools of statistical mechanics, the exponential family, and the Monte Carlo method into focus on problems as a priori remote as the coefficient of the Hermite polynomial and the number of leaves of a random tree, the Gibbs method would seem to merit much further study in combinatorics.


1 wish to thank L. Lovdsz for informing me of the reference Godsil [6,71 which led to a simplification of my original method for showing var,3L.  m. 1 also thank Timothy L. Snyder for detailed comments on an earlier draft.


1. Here, of course, a tree is a connected graph without cycles and a labeled tree is one with labeled vertices. Also, to avoid trivial exceptions we will always assume n k 3.


1. Aldous, D. (1986). On the Markov chain simulation method for uniform combinatorial distributions and simulated annealing. Technical Report No. 60, Department of Statistics, University of California, Berkeley, California.
2. Bender, E. A. (1973). Central and local limit theorems applied to asymptotic enumeration. J. Combinatorial Theory A IS: 9 1 111.
3. Cayley, A. (1889). A theorem on trees. Quarterly Journal of Pure and Applied Mathematics 23: 376378.
4. Canfield, E. R. (1980). Application of the BerryEssen inequality to combinatorial estimates. J. Combinatorial Theory A 28: 1725.
5. Feller, W. (1971). An Introduction to Probability Theory and its Applications, Vol. II, Second Edition, Wiley, New York, p. 544.
6. Godsil, C. D. (1981 a). Hermite polynomials and a duality relation for matching polynomials. Combinatorica 1: 257262.
7. Godsil, C. D. (1981b). Matching behavior is asymptotically normal. Combinatorica 1: 369376.
8. Godsil, C. D. and Gutman, 1. (1981). On the theory of matching polynomials. J. Graph Theory 5: 137144.
9. Harper, L. H. (1967). Stirling behavior is asymptotically normal. Ann. Math, Statis. 38: 410414.
10. Heilmann, 0. J. and Lieb, E. H. (1972). The theory of monomerdimer systems. Comm. Math. Phys. 22: 190232.
11. Kilpatrick, S., Gelatt, C., and Vecchi, M. (1983). Optimization by simulated annealing. Science 220: 671680.
12. Knuth, D. (1968). Another enumeration of trees. Canadian J. of Math. 20: 1077~1086.
13. LovAsz, L. (1979). Combinatorial Problems and Exercises, NorthHolland, Amsterdam, The Netherlands, p. 29.


14. Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H., and Teller, E. (1953).
Equations of state calculation by fast computing machines. J. Chem. Physics 21: 10871092.
15. Moon, J. W. (1970). Counting Labeled Trees Canadian Mathematical Monographs, No. 1,
Alberta, Canada, p. 5.
16. Nijenhuis, A. and Wilf, H. (1978). Combinatorial Algorithms, Academic Press, New York,
p. 3539.
17. Petrov, V. V. (1975). Sums of Independent Random Variables, Springer Verlag, New York,
P. 111.
18. P61ya, G. and Szegb, G. (1976). Problems and Theorems in Analysis 11, SpringerVerlag, New
York, p. 45.
19. Prfifer, A. (1918). Neuer Beweis eines Satzes uber Permutationen. Archiv fir Mathematik und
Physik 27: 142~144.
20. R6nyi, A. (1959). Some remarks on the theory of trees. WA Mat. Kul. Int. Kozl. 4: 7385.