


Probabilistic Networks and Network Algorithms

Timothy Law Snyder
Department of Computer Science, Georgetown University, Washington, DC 20057, U.S.A.

J. Michael Steele
Department of Statistics, The Wharton School, University of Pennsylvania, Philadelphia, PA 19104, U.S.A.

1. Introduction

The uses of probability in the theory of networks are extensive, and new applications emerge at an increasing rate. Still, when compared with the purely deterministic aspects of network theory, the part that calls upon probability theory is in its infancy. Certainly there are areas where the uses of probability have developed into a reasonably complete theory, but in many instances the results that have been obtained have to be regarded as fragmented and incomplete. This situation presents considerable opportunity for researchers, and the purpose of this chapter is to highlight aspects of the current state of the theory with an eye toward the developments and the tools that seem most likely to be of value in further investigations.
Probability enters into the theory of networks and network algorithms in several different ways. The most direct way is through probabilistic modeling of some aspect of the network. For example, in some freight management models the costs of transportation along the arcs of the network are modeled by random variables. In models such as these, probability helps us grasp a little better a world that comes with its own physical randomness.
A second important way probability enters is through more stylized stochastic models where the aim is to provide deeper insight into our technical understanding of the methods of operations research. Here there is considerably less emphasis on building detailed models that hope to capture aspects of randomness that live in a specific application context; rather, the aim is to provide mathematically tractable models of reasonable generality that can be used to explore a variety of different computational or estimation methods. Among the types of issues that have been studied in such models are the efficacies of deterministic algorithms and of deterministic heuristic methods. Many of the 'average case' analyses of algorithms would fit into this second role for probability.


The third path by which probability enters into network theory is through randomized algorithms. This is the newest of the roles for probability, but it is a role that is of increasing importance. To make clear what makes an algorithm 'randomized,' consider a version of depth-first search where one chooses the next vertex to be explored by selecting it at random from a set of candidates. Here one does not call on any modeling of the network, which may in fact be specified in a way that is completely deterministic. The use of probability here is purely technical in the sense that it is employed to serve our computations, not to model some external physical randomness, or even to capture the notion of an 'average case.'
In the material that follows, one does well to keep these differing uses of probability in clear sight. Still, the distinctions may not always be pristine, mostly because two or more roles for probability can be present in the same problem. As an example, consider the computation or estimation of the reliability polynomial R(p) of a network. Here one begins with a simple, physically motivated stochastic model. Given a specific graph intended to represent a communication network, one models the possibility of degraded communication in the network by allowing edges to 'fail' with probability p. The key problem is the determination of the probability R(p) that for each pair of vertices a and b in the graph there exists a path from a to b that consists only of edges that have not failed. As the problem sits, it offers a simple but useful stochastic model, and one can go about the calculation or estimation of R(p) by whatever tools are at one's disposal. The multiplicity of roles for probability enters exactly when one starts to notice that there are randomized algorithms for the estimation of R(p). This is just one example where there are several roles for probability in the context of a single problem.
There are even dicier instances where the role of probability in the design and analysis of algorithms starts to offer some ambiguity. For example, close cousins of the randomized algorithms are the algorithms that (a) assume that the input follows some stochastic model and (b) exploit that assumption in the computational choices that they make. A natural example of this design is Karp's algorithm for the Euclidean traveling salesman problem, which we take up in Section 4. Such algorithms are fairly called probabilistic algorithms, but in the absence of internally generated random choices the best practice is to preserve the distinction made above and to avoid calling them randomized algorithms; though, admittedly, there is no reason to press for a rigid nomenclature.
The central aim of this chapter is to engage at least some aspect of each of the major roles for probability in the theory of network algorithms. When choices must be made, an emphasis is placed on those ideas one can expect to continue to be used and developed in the future. In Section 2 we engage the probability theory of network characteristics, where one mainly sees probability in either of the first two roles described above, as elements of either a physical or an idealized stochastic model. The section first develops the background for several inequalities that have evolved in the area of percolation theory. The FKG inequality is the best known and most widely used of these; but, as applications in percolation theory

have shown, the much newer BK inequality is also an instrument that belongs in every tool kit. The second part of Section 2 then looks at the computational problems associated with more physical models of networks. In Section 3 we engage randomized algorithms in the context of several problems of concern to the basic themes of network theory. The first paradigm discussed there is one initiated in Karp & Luby [1985], which remains essential to the current technology of randomized algorithms. In Section 4 we focus on problems of geometric network theory. This is the area of network theory that seems to have progressed most extensively from the viewpoint of probability theory, but it also offers practical algorithmic insights on issues that have been of interest and concern even before there evolved an extensive theory of algorithms. The classic problems here include the behavior of traveling salesman tours, minimum spanning and Steiner minimum trees, and matchings.

2. Probability theory of network characteristics

There are three substantial probabilistic theories with lives of their own, yet which are intimately intertwined with the probability theory of networks. The most immediate of these is network reliability. This subject provides extensive investigation of the problem of calculating and bounding the probability of the existence of (s, t)-paths. Because network reliability is dealt with in a separate chapter of this volume and because the book of Colbourn [1987] provides an extensive treatment, we do not give many details of the subject. Still, in many probabilistic investigations of networks, one needs to keep in mind the existence and highlights of the large body of results provided by reliability theory, and several of the results reviewed here owe their motivation to the concerns of network reliability theory. A second theory that is closely connected to the theory of random networks is the theory of random graphs, which deals extensively with questions like the existence of long paths, connectedness, the existence of cycles, and many other issues that are of importance to the theory of networks. Since Bollobás [1985] provides an extensive treatment of the theory of random graphs, we do not go deeply into that subject here. A third closely-related field is percolation theory, and in many ways this subject has a claim on being the deepest of the three related fields. It certainly has been pursued extensively by a large number of mathematicians and physicists over a number of years.

2.1. Tools from percolation theory

In this section, we first recall what the aims of percolation theory have been over the years of its development. We then suggest some ways in which the theory may help researchers who are concerned with questions that are more at the heart of network theory. We will develop three elementary but central tools of percolation theory: the FKG inequality, the BK inequality, and Russo's formula. These powerful tools are the workhorses of percolation theory, yet they seem not

to be well known to researchers in the more general areas of stochastic network theory.
Percolation theory evolved from questions in physics that are themselves of many different flavors, including the magnetization of materials, the formation of crystals, the transport of electrons in special materials, sustenance of chemical reactions, and the flow of fluids. The latter offers perhaps the least compelling physics, but it provides the easiest metaphor and is often called upon for illustration.
We consider the classical d-dimensional rectangular lattice Z^d, and for each vertex v of the lattice we join v by an edge to each of its 2d nearest neighbors (in the sense of the usual Euclidean metric). If we use the traditional language of percolation theory, these edges are called 'bonds'. Bonds are viewed as being either 'open' or 'closed', and it is here where the probability modeling appears. To each bond e is associated an independent Bernoulli random variable X_e such that P(X_e = 1) = p for some fixed 0 < p ≤ 1. The bonds for which X_e = 1 are regarded as being open, and the fundamental questions of the theory concern the components of lattice vertices connected by open edges. Among the main features that distinguish percolation theory from the theory of random graphs are the attention that is focused on subgraphs of the lattice and the interest that is focused on graphs with infinitely many edges.
A central quantity of interest in percolation theory is the percolation probability θ(p), defined as the probability that the origin is contained in an infinitely large connected component. One reason that θ(p) receives considerable attention is that it exhibits interesting critical phenomena that have close analogies with physical phenomena like the freezing of fluids. In particular, one can prove, for each dimension d, that there is a critical constant p_c = p_c(d) depending on the dimension d such that θ(p) > 0 if p > p_c but θ(p) = 0 if p < p_c. The work of Kesten [1980] culminated the efforts of a great many investigations and established the long conjectured result that p_c(2) = 1/2. This deep result required the development of techniques that would seem to offer useful insights for researchers in the theory of networks, and a well motivated exposition of Kesten's theorem can be found in Chapter 9 of Grimmett [1989].
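To make these definitions concrete, the following sketch (ours, not part of the original chapter) estimates a finite-volume proxy for θ(p) in dimension two: the probability that the origin of a (2n+1) × (2n+1) box in Z^2 is joined to the boundary of the box by a path of open bonds. The function names and parameters are illustrative assumptions, and the computed quantity only approximates θ(p), since the percolation probability itself refers to an infinite open cluster.

```python
import random

def origin_reaches_boundary(n, p, rng):
    """One realization of bond percolation on the box {-n, ..., n}^2.

    Returns True if the origin is joined to the boundary of the box by a
    path of open bonds (a finite-volume stand-in for the event that the
    origin lies in an infinite open cluster)."""
    side = 2 * n + 1
    index = lambda x, y: (x + n) * side + (y + n)
    parent = list(range(side * side))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path halving
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    # Each nearest-neighbor bond is open independently with probability p.
    for x in range(-n, n + 1):
        for y in range(-n, n + 1):
            if x < n and rng.random() < p:
                union(index(x, y), index(x + 1, y))
            if y < n and rng.random() < p:
                union(index(x, y), index(x, y + 1))

    origin = find(index(0, 0))
    boundary = [(x, y) for x in (-n, n) for y in range(-n, n + 1)] + \
               [(x, y) for x in range(-n + 1, n) for y in (-n, n)]
    return any(find(index(x, y)) == origin for x, y in boundary)

def estimate_theta(p, n=20, trials=200, seed=0):
    """Crude Monte Carlo estimate of the finite-box proxy for theta(p)."""
    rng = random.Random(seed)
    hits = sum(origin_reaches_boundary(n, p, rng) for _ in range(trials))
    return hits / trials

if __name__ == "__main__":
    for p in (0.40, 0.50, 0.60):
        print(p, estimate_theta(p))
```

As n grows, the estimate tends to zero for p below p_c(2) = 1/2 and remains positive above it, in line with the critical behavior described above.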

The FKG inequality
The first tool we consider is named the FKG inequality, in respect of the work of Fortuin, Kasteleyn & Ginibre [1971]. Even though we will not call on the full generality of their result, it is worth noting that the FKG inequality has a beautiful generalization due to Ahlswede & Daykin [1978], and the full-fledged FKG inequality has already found elegant applications to problems of interest in network theory. In particular, one should note the articles by Shepp [1982] and Graham [1983].
The version of the inequality that we develop is actually a precursor of the FKG inequality due to Harris [1960], but Harris's inequality has the benefit of having very low overhead while still being able to convey the qualitative essence of its more sophisticated relatives. To provide a framework for Harris's inequality, we

suppose that G is any graph and {X_e} are identically distributed Bernoulli random variables associated with the edges of G. We think of the variables X_e as labels marking the edges of G that would be regarded in percolation theory as open edges. The random variables of interest in Harris's inequality are those that can be obtained as monotone non-decreasing functions of the variables {X_e}. In detail, if in a realization of the {X_e} we change some of the {X_e} that have value zero to have a value of one, then we require that the value of the function does not decrease. The classic example of such a variable is the indicator of an (s, t)-path of edges marked with ones. Harris's inequality confirms the intuitive fact that any pair X and Y of such monotone variables are positively correlated. Specifically, if X and Y are any non-decreasing random variables defined as functions of the edge variables X_e of G, then one has

E(XY) ≥ E(X)E(Y).

This inequality is most often applied in the case of indicator functions. Since we will refer to this case later, we note that we can write Harris's inequality as

P(A ∩ B) ≥ P(A)P(B)

for all events A and B that are non-decreasing functions of the edge variables.
One can prove Harris's inequality rather easily by induction. If we write X = f(η_1, η_2, ..., η_n) and Y = g(η_1, η_2, ..., η_n), where f and g are monotone and the {η_i} are independent Bernoulli random variables, then by conditioning on η_n, we see that it suffices to prove Harris's inequality just in the case of n = 1. In this case we see that, for q = 1 - p,

E(XY) - E(X)E(Y) = f(1)g(1)p + f(0)g(0)q - (f(1)p + f(0)q)(g(1)p + g(0)q),

and since this factors as

pq {f(1) - f(0)}{g(1) - g(0)} ≥ 0,

we obtain Harris's inequality.
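To see the inequality in action, here is a small self-contained check (ours, not from the chapter) that computes E(XY) and E(X)E(Y) exactly for two particular non-decreasing 0/1 functions of five independent Bernoulli(p) edge variables; the choice of functions is purely illustrative.

```python
import itertools

def harris_check(p=0.3):
    """Exact computation of E(XY) and E(X)E(Y) for two non-decreasing
    indicator functions of five independent Bernoulli(p) edge variables."""
    # X: at least one of the first three edges is open (non-decreasing).
    f = lambda w: 1 if any(w[:3]) else 0
    # Y: at least three of the five edges are open (non-decreasing).
    g = lambda w: 1 if sum(w) >= 3 else 0

    exy = ex = ey = 0.0
    for w in itertools.product((0, 1), repeat=5):
        prob = 1.0
        for bit in w:
            prob *= p if bit == 1 else (1.0 - p)
        exy += prob * f(w) * g(w)
        ex += prob * f(w)
        ey += prob * g(w)
    return exy, ex * ey

if __name__ == "__main__":
    lhs, rhs = harris_check()
    print("E(XY) =", lhs, " E(X)E(Y) =", rhs)   # Harris: E(XY) >= E(X)E(Y)
```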
One of the nice consequences of Harris's inequality is the fact that if m non-decreasing events A_1, A_2, ..., A_m with equal probability have a union with large probability, then all the events A_i must have fairly large probability. This so-called 'square root trick' noted in Cox & Durrett [1988] formally says that for each 1 ≤ i ≤ m, we have

P(A_i) ≥ 1 - {1 - P(∪_{j=1}^{m} A_j)}^{1/m}.

The proof of this inequality requires just one line where Harris's inequality provides the central step:
1 - P(∪_{j=1}^{m} A_j) = P(∩_{j=1}^{m} A_j^c) ≥ ∏_{j=1}^{m} P(A_j^c) = {1 - P(A_i)}^m.

To appreciate the value of this inequality, one should note that without the assumption that the {A_i} are monotone, one could take the {A_i} to be a partition of the entire sample space, making the left side equal to 1/m, while the right side equals one. We see therefore that the FKG inequality helps us extract an important feature of monotone events. As a point of comparison with a more combinatorial result, one should note that the square root trick and the local LYM inequality of Bollobás and Thomason [cf. Bollobás, 1986] both address the way in which probabilities of non-decreasing sets (and their ideals) can evolve. For further results that call on the FKG and Harris inequalities one should consult Graham [1983] and Spencer [1993].

The BK inequality
The insights provided by the FKG and its sibling inequalities are valuable, but they are limited. The inequalities often just provide rigorous confirmation of intuitive results that one can justify by several means. A much deeper problem arises when one needs an inequality that goes in a direction opposite that of the FKG inequality. For this problem, the progress is much more recent and less well known.
As one can show by considering any dependent, non-decreasing events A and B, there is no hope of simply reversing the FKG inequality. In fact, the same examples can show that additional assumptions on A and B that fall short of independence are of no help, so some sort of additional structure, or some modification, is needed for A ∩ B. Van den Berg & Kesten [1985] discovered that the key to a useful reversal of the FKG inequality rests on a strengthening of the notion of A ∩ B. The essence of their idea is that the event of A and B both occurring needs to be replaced with that of 'A and B both occurring, but for different reasons' or, as we will shortly define, A and B occurring disjointly.
The technical definition of disjoint occurrence takes some work, but it is guided by a canonical example. If A corresponds to the existence of an (s, t)-path and B corresponds to the existence of an (s', t')-path, then A ∩ B needs to be replaced by the event corresponding to the existence of (s, t)- and (s', t')-paths that have no edge in common. To make this precise in a generally applicable way, we have to be explicit about the underlying probability space. To keep ourselves from straying too far from network applications, we let Ω denote the set of {0, 1}-vectors (x_1, x_2, ..., x_m), where m is the number of elements in a set S of edges that are sufficient to determine the occurrence of A. In many problems m cannot be bounded by anything sharper than the number of edges of G, but the bound can be useful even in such cases.
We define a measure on Ω via the Bernoulli edge variables X_e taken in some fixed order, so Ω taken with our probability measure P gives us a product measure space {Ω, P}. We now define the set A ∘ B, the disjoint occurrence of non-decreasing events A and B, as follows:

A ∘ B = {ω : there exist ω_a ∈ A and ω_b ∈ B such that ω ≥ ω_a, ω ≥ ω_b, and ω_a · ω_b = 0}.

Here, we use ω_a · ω_b to denote the usual inner product between vectors, so the combinatorial meaning of the last condition is that ω_a and ω_b share no 1's in their representation. In other words, for non-decreasing events A and B, ω_a and ω_b are able to bear respective witness that A and B occur, but they can base their testimony on disjoint sets of edges.

The BK inequality. If A and B are non-decreasing events in {Ω, P}, then

P(A ∘ B) ≤ P(A)P(B).

The systematic use of the BK inequality is just now becoming widespread even in percolation theory proper. In Grimmett [1989] one finds many proofs of older results of percolation theory that are rendered much simpler via the BK inequality.

Russo's formula
The last of the percolation theory tools that we will review is a formula due to Russo [1981] that tells how the probability of a non-decreasing event changes as one changes the probability of the events {X_e = 1}. To state the formula, we suppose as before that we have a graph G with edges that are 'open' with probability p in such a way that the indicator variables X_e are independent and identically distributed. In this context we will require that G is finite, and, to emphasize the use of p as a parameter, we will denote the governing probability measure by P_p.
Now, if A is any non-decreasing event, we introduce a new random variable N_A that we call 'the number of edges that are pivotal for A'. Formally, we define N_A(ω) as follows: (a) if ω ∉ A, then N_A(ω) is zero, and (b) if ω ∈ A, then N_A(ω) equals the number of edges e such that, in the representation of ω as a {0, 1}-vector of edge indicators ω = (x_1, x_2, ..., x_m), we have x_e = 1, but, if we change x_e to 0 to get a new vector ω', then ω' ∉ A. In the latter case, we say that e is pivotal for A.

Russo's formula. If A is any non-decreasing event defined on the Bernoulli process associated with a finite graph G, and if N_A denotes the number of edges that are pivotal for A, then

(d/dp) P_p(A) = E_p(N_A).

This beautiful and intuitive formula can be used in many ways, but it is often applied to show that P_p(A) cannot increase too rapidly as p increases. To see how one such bound can be obtained in a crude but general context, we first note that the differential equation of Russo's formula can be rewritten in integrated form for 0 < p_1 < p_2 ≤ 1 as
P_{p_2}(A) = P_{p_1}(A) exp( ∫_{p_1}^{p_2} (1/p) E_p(N_A | A) dp ).

If there is a set S = {e_1, e_2, ..., e_m} of m edges such that the occurrence of A can always be determined by knowledge of S, then the integral representation and the trivial bound

P_p(e is pivotal for A | A) ≤ 1

provide a general inequality that bounds the rate of growth of P_p(A) as a function of p:

P_{p_2}(A) ≤ (p_2/p_1)^m P_{p_1}(A).
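For completeness, here is a short LaTeX rendering of the calculation behind this bound (our reconstruction of the standard argument; it simply combines the trivial bound on the conditional pivotality probabilities with the integrated form of Russo's formula quoted above).

```latex
% Assuming the occurrence of A is determined by the m edges of S,
% the trivial bound gives E_p(N_A \mid A) \le m, and therefore
\log \frac{P_{p_2}(A)}{P_{p_1}(A)}
  = \int_{p_1}^{p_2} \frac{1}{p}\, E_p(N_A \mid A)\, dp
  \le \int_{p_1}^{p_2} \frac{m}{p}\, dp
  = m \log \frac{p_2}{p_1},
\qquad\text{so}\qquad
P_{p_2}(A) \le \Bigl(\frac{p_2}{p_1}\Bigr)^{m} P_{p_1}(A).
```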

2.2. Distributional problems of random networks

In percolation theory the random variables associated with the edges are invariably Bernoulli, but in the network models that aim to model physical systems the network ingredients often are modeled by random variables with more general distributions, and the central questions in such models concern the distributions of larger network characteristics. Sadly, many of these distributional questions are analytically infeasible. Moreover, in many cases of practical interest these same questions are computationally intractable as well. We will illustrate some of the technology that has been developed for such problems by considering the problem of determining the distribution of the minimum-weight path from source to sink in a network with random edge weights.

Calculation of the distribution of the shortest paths
Formally, we let G = (V, E) be an acyclic network with source vertex s and sink t, where edge weights are represented by independent random variables W_e for all e ∈ E. The stochastic quantity of interest is the distribution of the random variable L(G), denoting the length of a shortest (s, t)-path in G. Valiant [1979] showed that the problem of determining the distribution of L(G) is in general NP-hard, so at a minimum one must look to approximation methods. One natural approach to the distribution problem is to try to exploit the independence of the edge weights through the use of cut sets. This idea forms the basis of the simulation method of Sigal, Pritsker & Solberg [1979, 1980]. To describe their method for building a simulation estimate for P(L(G) ≥ t), we first let C = {e_1, e_2, ..., e_k} be an exact cut in G, that is, we take C to be a set of edges such that every (s, t)-path in G shares exactly one edge with C. Such a cut always exists, and it offers us a natural method for exploiting the independence of the W_e. The key observation is that the edges of C induce a natural partition of the (s, t)-paths of G.
For each 1 ≤ i ≤ k and each e_i ∈ C we let P_i be the set of all (s, t)-paths that contain e_i. Now, for any t ∈ R, we consider the random variable defined by the conditional probability

R = P(L(G) ≥ t | W_e, e ∈ E - C).

Since R satisfies ER = P(L(G) ≥ t), if we let r be the sample value of R based on a realization {w_e} of {W_e : e ∈ E - C}, then by independence we have

r = P(L(G) ≥ t | w_e, e ∈ E - C)
  = P( Σ_{e ∈ p, e ≠ e_i} w_e + W_{e_i} ≥ t for all p ∈ P_i and for all e_i ∈ C )
  = ∏_{i=1}^{k} P( W_{e_i} ≥ t - min_{p ∈ P_i} Σ_{e ∈ p, e ≠ e_i} w_e ).

Since the right hand side can be computed from the known distribution of the W_{e_i}, an estimate of P(L(G) ≥ t) is given by the average n^{-1} Σ_{j=1}^{n} r_j of the values r_1, r_2, ..., r_n obtained from n independent realizations of {W_e : e ∈ E - C}. This method is reasonably crude, but with proper implementation it can provide answers in some situations of importance. Certainly, the computation of min_{p ∈ P_i} ( Σ_{e ∈ p, e ≠ e_i} w_e ) for 1 ≤ i ≤ k should not be conducted naively, since there can be an exponential number of (s, t)-paths in G. There are even moderately sized networks for which an exhaustive evaluation of the required sums is computationally prohibitive. One does much better to note that once the edges in E - C have been sampled, the lengths of the shortest paths from s to e_i and from e_i to t can be computed for each i by using an appropriate deterministic single-source shortest path algorithm, such as that of Dijkstra [1959], or more recent refinements. Dijkstra's algorithm is easy to implement, has low overhead, and takes O(|V|^2) steps in the worst case to compute all the required path lengths. Having the lengths of the shortest paths from s to the e_i and from the e_i to t allows the computations of the minima in the representation for r to be obtained in at most O(|C|^2) additional steps.
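To make the procedure concrete, here is a minimal Python sketch (ours; the toy network, the cut, and the assumption of exponentially distributed edge weights are all illustrative) of one replication of the Sigal-Pritsker-Solberg estimator: sample the weights of the edges off the cut, compute shortest distances from s and to t with Dijkstra's algorithm, and evaluate the conditional probability r from the formula above.

```python
import heapq, math, random

def dijkstra(adj, source):
    """Standard Dijkstra on a directed graph given as {u: [(v, w), ...]}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def one_replication(edges, cut, s, t, threshold, rate=1.0, rng=random):
    """One sample value r of R = P(L(G) >= threshold | W_e, e in E - C),
    assuming (for illustration only) independent Exponential(rate) edge weights.

    `edges` is a list of directed edges (u, v); `cut` is an exact (s, t)-cut,
    i.e. every (s, t)-path uses exactly one edge of `cut`."""
    cut = set(cut)
    sampled = {e: rng.expovariate(rate) for e in edges if e not in cut}

    forward, backward = {}, {}        # graph on E - C and its reverse
    for (u, v), w in sampled.items():
        forward.setdefault(u, []).append((v, w))
        backward.setdefault(v, []).append((u, w))
    dist_from_s = dijkstra(forward, s)
    dist_to_t = dijkstra(backward, t)

    r = 1.0
    for (u, v) in cut:
        # Shortest length of a path through (u, v), excluding the cut edge itself.
        rest = dist_from_s.get(u, math.inf) + dist_to_t.get(v, math.inf)
        slack = threshold - rest
        # P(W_e >= slack) for an Exponential(rate) weight.
        r *= 1.0 if slack <= 0 else math.exp(-rate * slack)
    return r

if __name__ == "__main__":
    # A small illustrative acyclic network; C = {(a, t), (b, t)} is an exact cut.
    edges = [("s", "a"), ("s", "b"), ("a", "b"), ("a", "t"), ("b", "t")]
    cut = [("a", "t"), ("b", "t")]
    rng = random.Random(1)
    reps = [one_replication(edges, cut, "s", "t", threshold=2.0, rng=rng)
            for _ in range(1000)]
    print("estimate of P(L(G) >= 2.0):", sum(reps) / len(reps))
```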
The problem of estimating the distribution of L(G) offers a typical instance of the trade-off one often meets in simulation estimations. First, there is a desire to have an efficient estimate of ER, for which we would like a cut that provides for a low variance of R. This is a kind of efficiency that helps us minimize the number of independent realizations one must take in the simulation. Second, we would like to have efficiency in the computation of the estimate in the sense of computing the shortest (s, e_i)- and (e_i, t)-paths. The trade-off that faces us is that as |C| increases, the variance of R decreases, but as the cut size increases so does the cost of computing shortest paths and minima. There are often many different exact cuts on which one can base the simulation estimation of P(L(G) ≥ t), and the proper choice of the cut is an important design consideration.

Other distribution problems of random networks
Other studies have undertaken the difficult task of determining distributions of flows in networks with random capacities. Among these is the paper of Grimmett & Welsh [1982], which considered maximum flows in networks with independent and identically distributed capacities. Grimmett and Welsh found limit theorems for the cases where the networks are either complete graphs or branching trees. In subsequent work, Frieze & Grimmett [1985] looked at the shortest path problem under general independent models, and Kulkarni [1986] studied shortest paths in networks with exponentially distributed edge lengths. One point that emerges from these works is that the probability theory of network characteristics offers many individual problems of considerable challenge. So far there seems to have been no attempt to provide a framework for a general theory of such characteristics. With the insights of several special problems in hand, perhaps it is time that work on a more general investigation was begun.

3. Probabilistic network algorithms

In this section we first provide an introduction to a general approach of Karp & Luby [1985] for the design of randomized algorithms. We then illustrate their method by showing how one can put the problem of estimating multiterminal network reliability into their framework. We then review briefly two recent randomized algorithms for maximum network flow. Finally, we review some of the work on randomized algorithms for perfect matching and maximal matching in graphs.

3.1. Karp-Luby structures for randomized algorithms

Karp & Luby [1985] provided a framework for randomized algorithms that is useful in a broad range of applications and which specifically offers an effective approach to some problems of network reliability. Their approach begins abstractly with a set S and a weight function a : S → R^+, which we then use to provide a weight for any A ⊆ S by taking a(A) = Σ_{x ∈ A} a(x). Clearly there are many important problems that can be framed in terms of the calculation of a(A) for appropriate choices of a, S, and A; but, as one must suspect, we will have to impose some additional structure before this framework can show its value. We call (S, R, a) a Karp-Luby Monte Carlo structure if R ⊆ S and we have the following three properties: (1) the 'total weight' a(S) is known, (2) there is a 'sampling algorithm' that selects an item x at random from S according to the probability a(x)/a(S), with independent selections at each invocation of the sampling algorithm, and (3) there is a 'recognition algorithm' that can test if a given element x of S is also an element of R.
For any such structure (S, R, a) one can estimate the weight a(R) in the most straightforward way imaginable. One just selects n independent random elements X_1, X_2, ..., X_n of S by the sampling algorithm. Letting Y_i be 1 or 0

accordingly as X_i ∈ R or not, one then takes as an estimator of a(R) the value Y = a(S)(Y_1 + Y_2 + ... + Y_n)/n. As a consequence of the traditional Bernstein tail estimates of the binomial distribution, for any ε > 0 and δ > 0 we have

P( |Y - a(R)| / a(R) ≥ ε ) < δ

provided that

n ≥ 9 ε^{-2} log(2/δ) a(S)/a(R).

The punch line here is that once we are able to frame a problem in terms of a Karp-Luby structure, we can determine an ε-δ approximation in the sense of the preceding probability bound. Moreover, we can bound the expected computational cost of the algorithm by a polynomial in the parameters ε^{-1}, log(1/δ), and the sensitivity ratio a(S)/a(R).
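The estimator itself takes only a few lines. In the sketch below (ours), `total_weight`, `sample_from_S`, and `in_R` stand for the three ingredients of a Karp-Luby structure and would be supplied by the particular application; the sample-size rule follows the bound quoted above, as we have reconstructed it.

```python
import math, random

def karp_luby_estimate(total_weight, sample_from_S, in_R,
                       eps, delta, ratio_bound, rng=random):
    """Generic Karp-Luby Monte Carlo estimate of a(R).

    total_weight  -- the known value a(S)
    sample_from_S -- draws x in S with probability a(x)/a(S)
    in_R          -- recognition test: is x an element of R?
    ratio_bound   -- an upper bound on the sensitivity ratio a(S)/a(R)
    """
    n = math.ceil(9 * eps ** -2 * math.log(2 / delta) * ratio_bound)
    hits = sum(1 for _ in range(n) if in_R(sample_from_S(rng)))
    return total_weight * hits / n

if __name__ == "__main__":
    # Toy structure: S = {1, ..., 10} with unit weights, R = the even numbers,
    # so a(S) = 10, a(R) = 5, and the sensitivity ratio is 2.
    rng = random.Random(0)
    estimate = karp_luby_estimate(
        total_weight=10.0,
        sample_from_S=lambda r: r.randint(1, 10),
        in_R=lambda x: x % 2 == 0,
        eps=0.1, delta=0.05, ratio_bound=2.0, rng=rng)
    print("estimate of a(R):", estimate)
```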

3.2. Karp-Luby structures for network reliability

The multiterminal network reliability problem is a stylized model for communication reliability that has been studied from many perspectives, and it offers a good example of how one fits a natural problem into the framework of Karp-Luby structures. Given a connected graph G = (V, E) and a special set of 'terminal vertices' T = {t_1, t_2, ..., t_k} ⊆ V, the motivating issue of multiterminal network reliability is to model the possibility of communication between the elements of T under random degradation of the network links. The probability modeling calls for a function p : E → [0, 1] that is viewed as giving for each e ∈ E the probability p(e) that the edge e is 'bad.' Under the assumption that the edges are made good or bad according to independent choices governed by p, the key problem is to determine the probability that for all pairs of elements of the set of terminals there is a path between them that consists only of good edges.
More formally, we consider the set of all mappings s : E → {0, 1} as the elements of our probability space, and we take the interpretation of this function as an assignment of a label of 0 on the good edges and 1 on the bad edges. The probability of a specific state s thus is given by P(s) = ∏_{e ∈ E} p(e)^{s(e)} (1 - p(e))^{1-s(e)}. The computational challenge is to calculate the probability that there is some pair of terminal vertices for which there does not exist a path between them in the graph consisting of the vertex set V and the set of all edges of G which are labeled 'good'. We call a state for which this event occurs a failing state, and we let F denote the set of all failing states s.
To provide a Karp-Luby structure so that we can use the strategy discussed in the preceding section, we first need the notion of a canonical cut. Let s ∈ F be any failing state, and let G(s) = (V, E(s)), where E(s) is the set of good edges for the state s. For any 1 ≤ i ≤ k we then let C_i(s) denote the connected component of G(s) that contains the terminal t_i. Since s ∈ F there is some i for which C_i(s) is not all of G and, further, because of the assumption that the full graph

G = (V, E) is connected, there is at least one such C_i(s) for which the graph induced by the removal of all the vertices of C_i(s) from G = (V, E) is connected. We let i*(s) denote the least such index i, and finally we let g(s) denote the set of edges that have exactly one endpoint in C_{i*(s)}(s). The set g(s) is a T-cut in that it separates two terminals of T in the graph G(s) = (V, E(s)), and we call g(s) the canonical cut for the state s.
We now have the machinery to specify the Karp-Luby structure for the multiterminal reliability problem. Let S be the set of all pairs (c, s) where s ∈ F is a failing state and c is a T-cut for which each edge of c fails in state s. The weight function associated with a pair (c, s) ∈ S is taken to be the probability of the state s, so a((c, s)) = P(s). Although this weight function ignores the first component of (c, s), the presence of the first component turns out to be essential in providing an effective sampling process. This choice of a and S permits us to write down a simple formula for a(S). Since a(S) is equal to the sum, over the cuts c, of all the probabilities of the states s in which every edge of c fails, we have
a(S) = Σ_{(c, s) ∈ S} a(c, s) = Σ_c ∏_{e ∈ c} p(e),

where the last sum is over all T-separating cut sets of G = (V, E). The target set R is given by the set of all pairs (g(s), s) where s ∈ F, and (S, R, a) will serve as our candidate for a Karp-Luby structure for the multiterminal network problem. To see the interest in this triple we first note that
a(R) = Σ_{s ∈ F} a(g(s), s) = Σ_{s ∈ F} P(s),

so the weight a(R) corresponds precisely to the probability of interest.
For the effective use of (S, R, a), it would be handiest if we had at our disposal a list L of all the T-separating cut sets of G = (V, E). When the list is not too large, the formula given above provides a way to calculate a(S). Similarly, we also have at hand an easy way to test if s ∈ F by examining the failure of each of the cuts. To complete our check that the (S, R, a) leads to a Karp-Luby structure in this nice case, it only remains to check that sampling from S is not difficult.
To choose an element (c, s) ∈ S according to the required distribution, we first choose at random a cut c ∈ L according to the probability distribution ∏_{e ∈ c} p(e)/a(S). We then select a state function s by setting s(e) = 1 for all e ∈ c and, for each remaining edge, letting s(e) = 1 or s(e) = 0 with probability p(e) or 1 - p(e), respectively.
We have completed the verification that (S, R, a) satisfies the constraints required of a Karp-Luby structure, but for it to serve as the basis for an effective randomized algorithm we also need to have a bound on the sensitivity ratio a(S)/a(R). In many multiterminal reliability problems a sufficiently powerful bound is provided by the following inequality of Karp & Luby [1985]:

a(S)/a(R) ≤ ∏_{e ∈ E} (1 + p(e)).
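The sampling and recognition steps for this structure can be sketched as follows (ours, using the networkx library; the toy network, the hand-listed cut collection, and all helper names are illustrative assumptions, not part of the chapter). Given the list L of T-separating cuts, `sample_pair` draws (c, s) with the required distribution, and the recognition step recomputes the canonical cut g(s) and checks whether it equals c.

```python
import math, random
import networkx as nx

def edge_key(u, v):
    """Order-independent representation of an undirected edge."""
    return tuple(sorted((u, v)))

def canonical_cut(G, terminals, bad):
    """Canonical cut g(s) of a failing state; `bad` is the set of failed edge keys."""
    good = nx.Graph()
    good.add_nodes_from(G.nodes())
    good.add_edges_from(e for e in G.edges() if edge_key(*e) not in bad)
    for t in terminals:                      # terminals in their fixed order t_1, ..., t_k
        C = nx.node_connected_component(good, t)
        if len(C) == G.number_of_nodes():
            continue                         # component of t is all of G
        rest = G.subgraph(set(G.nodes()) - C)
        if nx.is_connected(rest):            # removing C leaves the rest of G connected
            return frozenset(edge_key(u, v) for u, v in G.edges()
                             if (u in C) != (v in C))
    return None                              # no admissible component (state not failing)

def sample_pair(G, p, cuts, rng):
    """Draw (c, s) from S with probability a((c, s))/a(S); s is returned as its failed-edge set."""
    weights = [math.prod(p[e] for e in c) for c in cuts]
    c = rng.choices(cuts, weights=weights, k=1)[0]        # P(c) proportional to prod p(e)
    bad = set(c)                                          # every edge of c fails
    for e in (edge_key(u, v) for u, v in G.edges()):
        if e not in bad and rng.random() < p[e]:
            bad.add(e)                                    # other edges fail independently
    return c, bad

def estimate_failure_probability(G, terminals, p, cuts, n, seed=0):
    """Karp-Luby estimate of a(R), the probability that some pair of terminals is disconnected."""
    rng = random.Random(seed)
    aS = sum(math.prod(p[e] for e in c) for c in cuts)    # a(S) = sum over cuts of prod p(e)
    hits = 0
    for _ in range(n):
        c, bad = sample_pair(G, p, cuts, rng)
        if canonical_cut(G, terminals, bad) == c:         # recognition: is (c, s) in R?
            hits += 1
    return aS * hits / n

if __name__ == "__main__":
    # Toy example: a 4-cycle with terminals at opposite corners.
    G = nx.cycle_graph(4)
    terminals = [0, 2]
    p = {edge_key(u, v): 0.2 for u, v in G.edges()}
    # All T-separating cut sets of the 4-cycle, listed by hand for this small example.
    cuts = [frozenset({edge_key(0, 1), edge_key(0, 3)}),
            frozenset({edge_key(1, 2), edge_key(2, 3)}),
            frozenset({edge_key(0, 1), edge_key(2, 3)}),
            frozenset({edge_key(0, 3), edge_key(1, 2)})]
    print(estimate_failure_probability(G, terminals, p, cuts, n=20000))
```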


Thus far we have given a reasonably detailed view of the Karp-Luby structure and how it can be applied to a problem of computational interest in network theory. The development recalled here so far has the shortcoming that it seems to require an explicit list of the T-cuts of the network, and that list must be reasonably short. Karp & Luby [1985] go further and show that there are cases where this requirement can be avoided; in particular they show that if G is a planar graph, then the program still succeeds even without explicitly listing the cut sets.

3.3. Randomized max-flow algorithms

The theory of network flows is to many people what the theory of networks is all about, and there are two recent contributions of randomized algorithms to this important topic that have to be mentioned here, even though this survey cannot dig deeply enough into them to do real justice. The first of these is the algorithm of Cheriyan & Hagerup [1989] for maximum flow in the context where all the arc capacities are deterministic. The Cheriyan-Hagerup algorithm produces a maximum flow for any (non-random) single-source, single-sink input network. The algorithm takes O(|V| |E| log |V|) time in the worst case, although this worst case occurs only with very small probability. Most important is that the Cheriyan-Hagerup algorithm takes O(|V| |E| + |V|^2 (log |V|)^3) expected time, which, being O(|V| |E|) for all but relatively sparse networks, compares favorably with all known strongly polynomial algorithms. The algorithm is also strongly polynomial in the sense that the running time bound does not depend on the edge-capacity data.
The Cheriyan-Hagerup algorithm builds on some of the best deterministic algorithms and takes a step forward by introducing randomization at key stages. The algorithm calls on scaling techniques in the spirit of Gabow [1985], Goldberg & Tarjan [1988], and Ahuja & Orlin [1987] and also employs preflow-push labeling, another device of the Goldberg and Tarjan max-flow algorithm [cf. Ahuja, Magnanti & Orlin, 1991]. The randomness of the Cheriyan and Hagerup algorithm arises in how the network is represented at a given moment during the course of the algorithm. The model used for network representation is the adjacency list model, in which the neighbors of each v ∈ V are maintained by a list associated with v. One of the key ideas of the Cheriyan-Hagerup algorithm is to randomly permute each adjacency list at the outset of the algorithm, then randomly permute the adjacency list of vertex v whenever the label of v is updated. The net effect of the permutation is to lower the expected number of relabeling events that the algorithm must carry out during each phase, lowering the expected running time. One further interesting aspect of the Cheriyan-Hagerup algorithm is that Alon [1990] has provided a device that derandomizes the algorithm in a way that is effective for a large class of graphs.
A more recent contribution of a randomized algorithm for max-flow has been provided in Karp, Motwani & Nisan [1993]. Given a realization of an undirected network with independent identically distributed random capacities, the algorithm finds a network flow that is equal in value to the optimum flow value with high

probability. The algorithm runs in linear time, which is significantly faster than the best known algorithms that are guaranteed to find an optimal flow.
The algorithm of Karp, Motwani, and Nisan is not simple, but at least some flavor of the design can be appreciated independently of the details. In the first stage of the algorithm, the max-flow problem on G is transformed to an instance of a probabilistic version of the transportation problem. The instance of the transportation problem is constructed so that its solution flow is forced to yield (1) a maximum flow that can be immediately transformed to a max-flow in G and (2) a flow that saturates the (S, V - S) cut in G, where S is the set of sources. The second stage of the max-flow algorithm is a routine that attempts to solve the transportation problem. Here Karp, Motwani & Nisan [1993] introduce their so-called mimicking method which they outline in four steps: (1) before considering the realization of the random graph, consider instead the graph formed by replacing each random variable X_i with EX_i; (2) solve the resulting deterministic problem; (3) consider now the realization of the random graph, and attempt to solve the problem by 'mimicking' the solution from (2); and (4) fine-tune the result to get the optimum solution. Even though these steps have engaging and evocative descriptions, there is devil in the details which in the end leads to delicate analyses for which we must refer to the original.

3.4. Matching algorithms of several flavors

Information about matchings has a useful role in many aspects of the theory of networks. Moreover, some of the most effective randomized algorithms are those for matching, so this survey owes the reader at least a brief look at randomized algorithms for matching and related ideas.
The key observation of Lovász [1979] was that one can use randomization to test effectively for the positivity of a determinant, and this test can be used in turn to test for the existence of a perfect matching in a graph. To sketch the components of the method we first recall that with the graph G = (V, E) we can associate an adjacency matrix D by taking d_{ij} = 1 if (i, j) ∈ E and zero otherwise. From the adjacency matrix we can construct the Tutte matrix T for G by replacing the above-diagonal elements d_{ij} by the indeterminates x_{ij} and the below-diagonal elements d_{ij} by the indeterminates -x_{ij}. The construction of T is completed by putting zeros along the diagonal. The theorem of Tutte, for which he introduced this matrix, is that G has a perfect matching if and only if det T ≠ 0.
The core of the idea for testing if G has a perfect matching is then quite simple. One chooses random numbers for the values of the x_{ij} and then computes the determinant numerically, a process that is not more computationally difficult than matrix inversion. The savings come here from the fact that the determinant in the indeterminate variables x_{ij} can have exponentially many terms, but to test that the polynomial is not identically zero we only have to see that it is non-zero at a point. Naturally, to be true to the values of honest computational complexity theory one cannot rely on computation with real numbers, but by working over a finite field one comes quickly to the conclusion that there is merit to the idea.
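As an illustration of the idea, the sketch below (ours) carries the test out over a finite field: it fills the Tutte matrix with independent random field elements, computes the determinant by Gaussian elimination modulo a large prime, and reports a perfect matching whenever the determinant is nonzero. A nonzero determinant is conclusive; a zero determinant is wrong only with probability roughly n/q by the Schwartz-Zippel bound, where q is the size of the field.

```python
import random

def det_mod_p(M, p):
    """Determinant of an integer matrix modulo a prime p, by Gaussian elimination."""
    n = len(M)
    A = [[x % p for x in row] for row in M]
    det = 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if A[r][col] != 0), None)
        if pivot is None:
            return 0
        if pivot != col:
            A[col], A[pivot] = A[pivot], A[col]
            det = -det
        det = (det * A[col][col]) % p
        inv = pow(A[col][col], p - 2, p)          # field inverse, since p is prime
        for r in range(col + 1, n):
            factor = (A[r][col] * inv) % p
            for c in range(col, n):
                A[r][c] = (A[r][c] - factor * A[col][c]) % p
    return det % p

def probably_has_perfect_matching(n, edges, q=(1 << 61) - 1, rng=random):
    """Randomized Tutte-matrix test on a graph with vertices 0..n-1.

    Returns True only if a perfect matching certainly exists; a False answer
    is wrong with only small probability (Schwartz-Zippel)."""
    T = [[0] * n for _ in range(n)]
    for i, j in edges:
        x = rng.randrange(1, q)       # random field value for the indeterminate x_ij
        i, j = min(i, j), max(i, j)
        T[i][j] = x                   # above-diagonal entry  x_ij
        T[j][i] = (-x) % q            # below-diagonal entry -x_ij
    return det_mod_p(T, q) != 0

if __name__ == "__main__":
    square = [(0, 1), (1, 2), (2, 3), (3, 0)]     # 4-cycle: has a perfect matching
    path3 = [(0, 1), (1, 2)]                      # 3 vertices: cannot have one
    print(probably_has_perfect_matching(4, square))   # True
    print(probably_has_perfect_matching(3, path3))    # False
```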

Lovász [1979] generalized Tutte's theorem and went on to provide an algorithm that takes advantage of the idea just outlined in order to find maximum matchings in a general graph. Rabin and Vazirani [1989] pressed this idea further and provided an algorithm that is faster than that of Lovász. A computational virtue of both the Lovász and Rabin-Vazirani algorithms is that they are readily implemented as parallel algorithms.
Another development from the theory of matching that has had wide-ranging impact on the theory of combinatorial algorithms is the introduction of the method of rapidly mixing Markov chains. The development evolving from Jerrum & Sinclair [1986, 1989] calls on the idea that if one runs a Markov chain for a long time, then its location in the state space is well approximated by the stationary distribution of the Markov chain. This idea can be used to estimate the number of elements in a complicated set, say, the set of all matchings of a graph, if one can construct a Markov chain on a set of states that includes the set of matchings and that converges rapidly to stationarity. This idea has undergone an extensive development over the last few years. For a survey of this field we defer to the recent volume of Sinclair [1993].
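A minimal sketch of the kind of chain involved is given below (ours); it implements the basic add/delete/exchange random walk on the matchings of a graph, without any of the additional machinery that Jerrum and Sinclair use to prove rapid mixing or to turn samples into counting estimates.

```python
import random

def matching_chain_step(edges, matching, rng):
    """One step of the add/delete/exchange random walk on matchings of a graph.

    `edges` is a list of edges (u, v); `matching` is a set of edges.
    Pick a uniformly random edge e = (u, v):
      - if e is in the matching, delete it;
      - if both endpoints are unmatched, add it;
      - if exactly one endpoint is matched, by an edge f, exchange f for e;
      - otherwise stay put."""
    u, v = rng.choice(edges)
    e = (u, v)
    matched_at = {}
    for (a, b) in matching:
        matched_at[a] = (a, b)
        matched_at[b] = (a, b)
    if e in matching or (v, u) in matching:
        matching.discard(e)
        matching.discard((v, u))
    elif u not in matched_at and v not in matched_at:
        matching.add(e)
    elif (u in matched_at) != (v in matched_at):
        f = matched_at[u] if u in matched_at else matched_at[v]
        matching.discard(f)
        matching.add(e)
    return matching

def sample_matching(edges, steps=10000, seed=0):
    """Run the chain from the empty matching and return the final state."""
    rng = random.Random(seed)
    matching = set()
    for _ in range(steps):
        matching_chain_step(edges, matching, rng)
    return matching

if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
    print(sample_matching(edges))
```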
The final observation about matching in random graphs that deserves space in the awareness of researchers in network theory is that algorithms based on augmenting paths are likely to perform much better than their worst-case measures of performance would indicate. These algorithms, which exhibit the fastest worst-case running times, are also fast in expectation, sometimes outperforming even the best heuristic algorithms. Many of the algorithms, including the algorithms of Even & Kariv [1975] and Micali & Vazirani [1980], run in linear expected time if the input graph is chosen uniformly from the set of all graphs. The reason behind this observation seems to be the expander properties of random graphs and the fact that in expander graphs one has a short path connecting any two typical points [cf. Motwani, 1989].
The proofs of these results come from an analysis of the lengths of augmenting paths. It is shown that, with high probability, every non-perfect matching in a random graph has augmenting paths that are relatively short. Since the bulk of the work in augmenting path algorithms is spent carrying out augmentations, bounds on the lengths of augmenting paths translate to fast running times.

4. Geometric networks

One of the first studied and most developed parts of the theory of networks concerns networks that are embedded in Euclidean space. A geometric network is defined by a finite point set S ⊂ R^d, with d ≥ 2, and an associated graph, which is usually assumed to be the complete graph on S. The costs associated with the edges of the graph are typically the natural Euclidean lengths, though sometimes it is useful to consider functions of such lengths, for example, to take the cost of an edge to equal the square of its length. The central questions of the theory of geometric networks focus on the lengths of subgraphs; so for example, in the

traveling salesman problem, we are concerned with the length of the shortest tour through the points of S. Also of central interest in this development is the theory of minimum spanning trees, Steiner trees, and several types of matchings. The key result in initiating the probabilistic aspects of this development is the classic Beardwood, Halton, and Hammersley theorem.

Theorem [Beardwood, Halton & Hammersley, 1959]. If X_i, 1 ≤ i < ∞, are independent and identically distributed random variables with bounded support in R^d, then the length L_n under the usual Euclidean metric of the shortest path through the points {X_1, X_2, ..., X_n} satisfies

L_n / n^{(d-1)/d} → β_{TSP,d} ∫_{R^d} f(x)^{(d-1)/d} dx   almost surely.

Here, f(x) is the density of the absolutely continuous part of the distribution of the X_i.

In addition to leading to algorithmic applications, the Beardwood, Halton, and Hammersley (BHH) theorem has led to effective generalizations, as well as new analytical tools. In this section, we review these tools, including the theory of subadditive Euclidean functionals, bounds on tail probabilities, related results in the theory of worst-case growth rates, and bounds on limit constants such as β_{TSP,d}.
One elementary point that may help avoid confusion in the limit theory offered by the Beardwood, Halton, Hammersley theorem is the observation that it is of a much deeper character than Θ(n^{(d-1)/d}) results for L_n, which only require that there exist positive constants a and b such that a n^{(d-1)/d} ≤ L_n ≤ b n^{(d-1)/d}. The latter results are sometimes useful, but they are almost trivial in comparison, unless one presses for very good values for a and b. The stronger asymptotic result that L_n / n^{(d-1)/d} converges to a constant requires entirely different techniques and typically leads to much different applications.
A second comment concerns uses to which one can put results such as the Beardwood, Halton, Hammersley theorem and its relatives. The use of the BHH theorem in the polynomial-time probabilistic TSP algorithm of Karp [1976, 1977] is one of the primary reasons results like the BHH theorem are studied today. Part of the charm of the TSP is that it is NP-hard, and it has been studied from many heuristic and approximation perspectives. Karp's algorithm has a special place in the theory of algorithms because given any ε > 0, its expected running time is almost linear and with probability one it produces a tour of length no more than (1 + ε) times the optimal tour length. Karp's algorithm played an important role in launching the field of probabilistic algorithms, and it certainly stimulated interest in the development of theorems that extend that of Beardwood, Halton, and Hammersley. Since then, theorems like the BHH theorem have been proved for other quantities, like the length of the minimum spanning and Steiner minimum trees, greedy and semi-matchings, and others. For further information on some

of these developments one can consult Halton & Terada [1982], Karp & Steele [1985], or Steele [1990a, b].
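The flavor of these limit results is easy to see numerically. The sketch below (ours) uses the minimum spanning tree functional, one of the quantities just mentioned, because it can be computed exactly in a few lines with scipy, whereas an exact TSP computation would be prohibitively expensive; it prints L_n / n^{1/2} for uniform random samples in the unit square, and the ratio settles down near a constant as n grows (compare the discussion of β_{MST,2} in Section 4.4).

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree

def mst_length(points):
    """Total Euclidean length of a minimum spanning tree of the given points."""
    dists = squareform(pdist(points))          # dense pairwise distance matrix
    return minimum_spanning_tree(dists).sum()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for n in (100, 1000, 4000):                # larger n would call for a sparse neighbor graph
        pts = rng.random((n, 2))               # uniform points in the unit square
        print(n, mst_length(pts) / np.sqrt(n))
```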

4.1. Subadditive Euclidean functionals and non-linear growth

The length of the traveling salesman tour has a few basic properties that are shared with a large number of problems of combinatorial optimization in Euclidean space. By abstracting some of the simplest of these properties, it is possible to suggest a very general result that provides information comparable to that given by the Beardwood, Halton, Hammersley theorem.
Let L be a function that maps the collection of finite subsets {x_1, x_2, ..., x_n} ⊂ R^d to the real numbers R. To spell out the most innocent properties of L that mimic the behavior of the TSP, we first note that for the TSP, L exhibits homogeneity and translation invariance, i.e.,

L(αx_1, αx_2, ..., αx_n) = αL(x_1, x_2, ..., x_n) for all α > 0,   (4.1)
and

L(x_1 + y, x_2 + y, ..., x_n + y) = L(x_1, x_2, ..., x_n) for all y ∈ R^d.   (4.2)

The TSP's total length also has some strong smoothness and regularity properties, but these turn out not to be of essential importance, and for the generalization we consider we will need to call on the smoothness of L only to require that, for each n, the function L viewed as a function from R^{nd} to R is Borel measurable. This condition is almost always trivial to obtain, but it is nevertheless necessary in order to be able to talk honestly about probabilities involving L.
Functions on the finite subsets of Rd that are measurable in the sense just described and that are homogeneous of order one and translation invariant are called Euclidean functionals. These three properties are somewhat bland, and one should not expect to be able to prove much in such a limited context, but with the addition of just a couple other structural features, one finds a rich and useful theory.
The first additional property of the TSP functional that we consider is that it is monotone in the sense that

L(x_1, x_2, ..., x_n) ≤ L(x_1, x_2, ..., x_n, x_{n+1}) for n ≥ 1, and L(∅) = 0.   (4.3)

A second additional and final feature of the TSP functional that we abstract is the only truly substantial one. It expresses both the geometry of the space in which we work and the fundamental suboptimality of one of the most natural TSP heuristics, the partitioning heuristic. The subadditive property we require is that there exists a constant B such that

L({x_1, x_2, ..., x_n} ∩ [0, t]^d) ≤ Σ_{i=1}^{m^d} L({x_1, x_2, ..., x_n} ∩ Q_i) + B t m^{d-1}   (4.4)

for all integers m ≥ 1 and real t > 0, where {Q_i}_{i=1}^{m^d} is a partition of [0, t]^d into smaller cubes of edge length t/m.
Euclidean functionals that satisfy the last two assumptions will be called monotone subadditive Euclidean functionals. This class of processes seems to abstract the most essential features of the TSP that are needed for an effective asymptotic analysis of the functional applied to finite samples of independent random variables with values in R^d.
To see how subadditive Euclidean functionals arise naturally and to see how some closely-related problems can just barely elude this framework, it is useful to consider two basic examples in addition to the TSP. The first is the Steiner minimum tree, which is a monotone subadditive Euclidean functional. For any finite set S = {x_1, x_2, ..., x_n} ⊂ R^d, a Steiner minimum tree for S is a tree T whose vertex set contains S such that the sum of the lengths of the edges in T is minimal over all such trees. Note that the vertex set of T may contain points not in S; these are called Steiner points. If L_{ST}(x_1, x_2, ..., x_n) is the length of a Steiner minimum tree of x_1, x_2, ..., x_n and if we let l(e) be the length of an edge e, another way of defining L_{ST} is just

L_{ST}(S) = min_T { Σ_{e ∈ T} l(e) : T is a tree whose vertex set contains S },  for finite S ⊂ R^d.
A closely-related example points out that the innocuous monotonicity property of the TSP and Steiner minimum tree can fail in quite natural problems. The example we have in mind is the minimum spanning tree. For {x_1, x_2, ..., x_n} ⊂ R^d, let L_{MST}(x_1, x_2, ..., x_n) = min_T Σ_{e ∈ T} l(e), where the minimum is over all spanning trees T of {x_1, x_2, ..., x_n}. The functional L_{MST} is easily seen to be homogeneous, translation invariant, and properly measurable; one can also check without much trouble that it is subadditive in the sense required above. Still, by considering the sets S = {(0, 0), (0, 2), (2, 0), (2, 2)} and S ∪ {(1, 1)}, we see that L_{MST} fails to be monotone as required: a minimum spanning tree of S has length 6, while adding the center point (1, 1) lets the four corners be joined through the center at a total cost of only 4√2 ≈ 5.66. One should suspect that this failure is of an exceptional sort that should not have great influence on asymptotic behavior, and it can be shown that this suspicion is justified. The example, however, puts us on warning that non-monotone functionals can require delicate considerations that are not needed in cases that mimic the TSP more closely.
Subject to a modest moment condition, the properties (4.1) through (4.4) are sufficient to determine the asymptotic behavior of L(X_1, X_2, ..., X_n), where the X_i are independent and identically distributed.

Theorem 1 [Steele, 1981a]. Let L be a monotone subadditive Euclidean functional. If {X_i}, i = 1, 2, ..., are independent random variables with the uniform distribution on [0, 1]^d, and Var L(X_1, X_2, ..., X_n) < ∞ for each n ≥ 1, then as n → ∞

L(X_1, X_2, ..., X_n) / n^{(d-1)/d} → β_{L,d}

with probability one, where β_{L,d} ≥ 0 is a constant depending only on L and d.

The restrictions that this theorem imposes on a Euclidean functional are as few as one can reasonably expect to yield a generally useful limit theorem, and because of this generality the restriction to uniformly distributed random variables is palatable. Moreover, since many of the probabilistic models studied in operations research and computer science also focus on the uniformly distributed case, the theorem has immediate applications. Still, one cannot be long content with a theory confined to uniformly distributed random variables. Fortunately, with just a couple of additional constraints, the limit theory of subadditive Euclidean functionals can be extended to quite generally distributed variables.

4.2. Tail probabilities for the TSP and other functionals

The theory just outlined has a number of extensions and refinements. The first of these that we consider is the work of Rhee & Talagrand [1989] on the behavior of the tail probabilities of the TSP and related functionals under the model of independent uniformly distributed random variables in the unit d-cube. In Steele [1981b], it was observed that Var L_n for d = 2 is bounded independently of n. This somewhat surprising result motivated the study of a more detailed understanding of the tail probabilities P(L_n > t), particularly the issue of determining if these probabilities decayed at the Gaussian rate exp(-cx^2/2). After introducing new methods from martingale theory and interpolation theory which led to interesting intermediate results, Rhee & Talagrand [1989] provided a remarkable proof that in d = 2, the TSP and many related functionals indeed have Gaussian tail bounds. The formal result can be stated as follows.

Theorem [Rhee & Talagrand, 1989]. Let f be a Borel measurable function that assigns to each finite subset F ⊂ [0, 1]^2 a real value f(F) such that

f(F) ≤ f(F ∪ {x}) ≤ f(F) + min{d(x, y) : y ∈ F}.

If the X_i are independent and uniformly distributed in [0, 1]^2, then the random variable defined by U_n = f({X_1, X_2, ..., X_n}) is such that there exists a constant K for which, for all t > 0,

P(|U_n - E(U_n)| > t) ≤ exp(-t^2 / K).

4.3. Worst-case asymptotics

The probabilistic rates of growth just surveyed are replicated in worst-case settings. In this section, we survey some of the work that has been done on worst-case growth rates and draw parallels with the probabilistic rates.
Let l(e) be the usual Euclidean length |e| of the edge e. As a primary example of a worst-case growth rate, consider the worst-case length of an optimal traveling
salesman tour in the unit d-cube:

ρ_{TSP}(n) = max_{S ⊂ [0,1]^d, |S| = n} min_T { Σ_{e ∈ T} l(e) : T is a tour of S }.   (4.5)

In words, ρ_{TSP}(n) is the maximum length, over all point sets in [0, 1]^d, that an optimal traveling salesman tour can attain. The minimized quantity in (4.5) is just the length of an optimal traveling salesman tour of the point set S, and the maximum is taken over all point sets S of size n. We note that there is no probability theory here, for the point sets and tours are deterministic. Steele & Snyder [1989] showed that, despite this, one obtains a rate of growth for ρ_{TSP} that is identical to the probabilistic growth rate in Theorem 1.

Theorem 3 [Steele & Snyder, 1989]. As n → ∞,

ρ_{TSP}(n) ~ α_{TSP,d} n^{(d-1)/d}, where α_{TSP,d} > 0 is a constant depending only on the dimension d.

4.4. Progress on the constants

Estimation of the limiting constants has a long history in both the worst-case and stochastic contexts. Few [1955] improved some very early work to provide the bound α_{TSP,2} ≤ √2 and gave a dimension-d bound of α_{TSP,d} ≤ d^{1/2} {2(d - 1)}^{(1-d)/2d}, where d > 2. After other intermediate work, Karloff [1989] broke the √2 barrier in dimension two by showing that α_{TSP,2} ≤ 0.984√2. The best bounds currently available in higher dimensions are those of Goddyn [1990].
Bounds on the worst-case constants are also available for other Euclidean network problems. Of particular note is the bound on α_{RST,d}, the constant associated with the worst-case length of a rectilinear Steiner minimum tree in the unit d-cube. Chung & Graham [1981] proved that α_{RST,2} = 1, which is significant in that it is the only non-trivial worst-case constant for which we have an exact expression. The problem of determining α_{RST,d} in higher dimensions is still open, with the current best-known bounds being max{1, d/(4e)} ≤ α_{RST,d} ≤ d 4^{(1-d)/d}, for d ≥ 1 [Snyder, 1991, 1992; Salowe, 1992].
In the case of the probabilistic models, there is recent progress due to Bertsimas and van Ryzin [1990], where asymptotic expressions as d gets large were obtained for the probabilistic minimum spanning tree and matching constants β_{MST,d} and β_{M,d}. Specifically, they showed that β_{MST,d} ~ √(d/(2πe)) and β_{M,d} ~ (1/2)√(d/(2πe)) as d → ∞. Still, the most striking progress on probabilistic constants was the determination of an exact expression for β_{MST,d} for all d ≥ 2 by Avram & Bertsimas [1992]. Their expression for β_{MST,d} comes in the form of a series expansion in which each term requires a rather difficult integration. The representation is still an effective one, and the first few terms of the series in dimension two have been computed to yield a numerical lower bound of β_{MST,2} ≥ 0.599, which agrees well

i
Ch. 6 Probabilistic Networks and Network Algorithms 421

with experimental data. The proof of the series representation for OMST,d relies strongly on the fact that a greedy construction is guaranteed to yield an MST, and unfortunately these constructions are not possible for many objects of interest, including the TSR
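The "experimental data" referred to above can be reproduced with a rough Monte Carlo sketch like the one below (illustration only, not from the chapter): for $n$ uniform points in the unit square the ratio of the MST length to $\sqrt{n}$ approaches $\beta_{\mathrm{MST},2}$, so the empirical ratio at moderate $n$ can be compared with the Avram-Bertsimas lower bound of 0.599. The sample sizes are arbitrary, and the estimate carries finite-size bias and sampling error.

```python
# Illustrative Monte Carlo estimate of the two-dimensional MST constant:
# MST length / sqrt(n) for n uniform points in the unit square.
import numpy as np
from scipy.spatial import distance_matrix
from scipy.sparse.csgraph import minimum_spanning_tree

def mst_length(points):
    """Total edge length of a Euclidean minimum spanning tree of the points."""
    return minimum_spanning_tree(distance_matrix(points, points)).sum()

rng = np.random.default_rng(1)
for n in (200, 800, 2000):
    ratios = [mst_length(rng.random((n, 2))) / np.sqrt(n) for _ in range(10)]
    print(f"n = {n:5d}:  MST length / sqrt(n) ~ {np.mean(ratios):.3f}")
```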

5. Concluding remarks

The theory of probabilistic networks and associated algorithms is rapidly evolving, but it is not yet a well consolidated field of inquiry. In surveying the literature, one finds relevant contributions growing in many separate areas, including the theory of random graphs, subadditive Euclidean functionals, stochastic networks, reliability, percolation, and computational geometry. Many of these areas make systematic use of tools and methodologies that remain almost unknown to the other areas, despite compelling relevance. The aim here has been to provide a view of part of the cross-fertilization that seems possible, but of necessity our focus has been on topics that allow for reasonably brief or self-contained description. Surely one can - and should - go much further. For more general information concerning probability theory applied to algorithms, one can consult the surveys of Karp [1977, 1991], Rinnooy Kan [1987], Hofri [1987], and Stougie [1990], as well as the bibliography of Karp, Lenstra, McDiarmid, and Rinnooy Kan [1985]. For more on percolation theory, Grimmett [1989] is a definitive reference.

Acknowledgements

This research was supported in part by a Georgetown University 1991 Summer Research Award, by the Georgetown College John R. Kennedy, Jr. Faculty Research Fund, and by the following grants: NSF DMS92-11634, NSA MDA904-91-H-2034, AFOSR-91-0259, and DAAL03-89-G-0092.

References

Ahlswede, R., and D.E. Daykin (1978). An inequality for the weights of two families of sets, their unions and intersections. Z. Wahrscheinlichkeitstheor. Verw. Geb. 43, 183-185.
Ahuja, R.K., T.L. Magnanti and J.B. Orlin (1991). Some recent advances in network flows. SIAM Rev. 33, 175-219.
Ahuja, R.K., and J.B. Orlin (1987). A fast and simple algorithm for the maximum flow problem. Oper. Res. 37, 748-759.
Alon, N. (1990). Generating pseudo-random permutations and maximum flow algorithms. Inf. Process. Lett. 35, 201-204.
Avram, F., and D. Bertsimas (1992). The minimum spanning tree constant in geometric probability and under the independent model: a unified approach. Ann. Appl. Probab. 2, 118-130.
Beardwood, J., J.H. Halton and J.M. Hammersley (1959). The shortest path through many points. Proc. Camb. Philos. Soc. 55, 299-327.

Bertsimas, D., and G. Van Ryzin (1990). An asymptotic determination of the minimal spanning tree and minimal matching constants in geometric probability. Oper. Res. Lett. 9, 223-231.
Bollobás, B. (1985). Random Graphs, Academic Press, New York, N.Y.
Bollobás, B. (1986). Combinatorics, Cambridge University Press, New York, N.Y.
Cheriyan, J., and T. Hagerup (1989). A randomized max-flow algorithm, Proc. 30th IEEE Symp. on Foundations of Computer Science, IEEE, pp. 118-123.
Chung, F.R.K., and R.L. Graham (1981). On Steiner trees for bounded point sets. Geom. Dedicata 11, 353-361.
Colbourn, C.J. (1987). The Combinatorics of Network Reliability, Oxford University Press, New York, N.Y.
Cox, J.T., and R. Durrett (1988). Limit theorems for the spread of epidemics and forest fires. Stochastic Process. Appl. 30, 171-191.
Dijkstra, E.W. (1959). A note on two problems in connexion with graphs. Numer. Math. 1, 269-271.
Even, S., and O. Kariv (1975). An O(n^2.5) algorithm for maximum matching in general graphs, Proc. 16th IEEE Symp. on Foundations of Computer Science, IEEE, pp. 100-112.
Few, L. (1955). The shortest path and the shortest road through n points in a region. Mathematika 2, 141-144.
Fortuin, C.M., P.W. Kasteleyn and J. Ginibre (1971). Correlation inequalities on some partially ordered sets. Commun. Math. Phys. 22, 89-103.
Frieze, A.M., and G.R. Grimmett (1985). The shortest-path problem for graphs with random arc-lengths. Discrete Appl. Math. 10, 57-77.
Gabow, H.N. (1985). Scaling algorithms for network problems. J. Comput. Systems Sci. 31, 148-168.
Goddyn, L. (1990). Quantizers and the worst-case Euclidean traveling salesman problem. J. Comb. Theory, Ser. B 50, 65-81.
Goldberg, A.V., and R.E. Tarjan (1988). A new approach to the maximum-flow problem. J. ACM 35, 921-940.
Graham, R.L. (1983). Applications of the FKG inequality and its relatives, in: A. Bachem, M. Grötschel and B. Korte (eds.), Mathematical Programming: The State of the Art, Bonn 1982, Springer-Verlag, New York, N.Y., pp. 115-131.
Grimmett, G.R. (1989). Percolation, Springer-Verlag, New York, N.Y.
Grimmett, G.R., and D.J.A. Welsh (1982). Flow in networks with random capacities. Stochastics 7, 205-229.
Halton, J.H., and R. Terada (1982). A fast algorithm for the Euclidean traveling salesman problem, optimal with probability one. SIAM J. Comput. 11, 28-46.
Harris, T.E. (1960). A lower bound for the critical probability in a certain percolation process. Proc. Camb. Philos. Soc. 56, 13-20.
Hofri, M. (1987). Probabilistic Analysis of Algorithms: On Computing Methodologies for Computer Algorithms Performance Evaluation, Springer-Verlag, New York, N.Y.
Jerrum, M., and A. Sinclair (1986). The approximation of the permanent, Proc. 18th Symp. on Theory of Computing, Association for Computing Machinery, pp. 235-243.
Jerrum, M., and A. Sinclair (1989). The approximation of the permanent. SIAM J. Comput. 18, 1149-1178.
Karloff, H.J. (1989). How long can a Euclidean traveling salesman tour be? SIAM J. Discrete Math. 2, 91-99.
Karp, R.M. (1976). The probabilistic analysis of some combinatorial search algorithms, in: J.F. Traub (ed.), Algorithms and Complexity: New Directions and Recent Results, Academic Press, New York, N.Y., pp. 1-19.
Karp, R.M. (1977). Probabilistic analysis of partitioning algorithms for the traveling salesman problem in the plane. Math. Oper. Res. 2, 209-224.
Karp, R.M. (1991). An introduction to randomized algorithms. Discrete Appl. Math. 34, 165-201.
Karp, R.M., J.K. Lenstra, C.J.H. McDiarmid and A.H.G. Rinnooy Kan (1985). Probabilistic analysis, in: M. O'hEigeartaigh, J.K. Lenstra and A.H.G. Rinnooy Kan (eds.), Combinatorial Optimization: Annotated Bibliographies, John Wiley and Sons, Chichester, pp. 52-88.

Karp, R.M., and M. Luby (1985). Monte Carlo algorithms for planar multiterminal network reliability. J. Complexity 1, 45-64.
Karp, R.M., R. Motwani and N. Nisan (1993). Probabilistic analysis of network flow algorithms. Math. Oper. Res. 18, 71-97.
Karp, R.M., and J.M. Steele (1985). Probabilistic analysis of heuristics, in: E.L. Lawler, J.K. Lenstra, A.H.G. Rinnooy Kan and D.B. Shmoys (eds.), The Traveling Salesman Problem: A Guided Tour of Combinatorial Optimization, John Wiley and Sons, New York, N.Y., pp. 181-206.
Kesten, H. (1980). The critical probability of bond percolation on the square lattice equals 1/2. Commun. Math. Phys. 74, 41-59.
Kulkarni, V.G. (1986). Shortest paths in networks with exponentially distributed arc lengths. Networks 16, 255-274.
Lovász, L. (1979). On determinants, matchings, and random algorithms, in: L. Budach (ed.), Fundamentals of Computation Theory, Akademie-Verlag, Berlin.
Micali, S., and V.V. Vazirani (1980). An O(|V|^0.5 |E|) algorithm for finding maximum matchings in general graphs, Proc. 21st IEEE Symp. on Foundations of Computer Science, IEEE, pp. 17-27.
Motwani, R. (1989). Expanding graphs and the average-case analysis of algorithms for matchings and related problems, Proc. 21st Symp. on Theory of Computing, Association for Computing Machinery, pp. 550-561.
Rabin, M.O., and V.V. Vazirani (1989). Maximum matching in general graphs through randomization. J. Algorithms 10, 557-567.
Rhee, W.T., and M. Talagrand (1989). A sharp deviation inequality for the stochastic traveling salesman problem. Ann. Probab. 17, 1-8.
Rinnooy Kan, A.H.G. (1987). Probabilistic analysis of algorithms. Ann. Discrete Math. 31, 365-384.
Russo, L. (1981). On the critical percolation probabilities. Z. Wahrscheinlichkeitstheor. Verw. Geb. 56, 129-139.
Salowe, J.S. (1992). A note on lower bounds for rectilinear Steiner trees. Inf. Process. Lett. 42, 151-152.
Shepp, L.A. (1982). The XYZ conjecture and the FKG inequality. Ann. Probab. 10, 824-827.
Sigal, E.C., A.A.B. Pritsker and J.J. Solberg (1979). The use of cutsets in Monte Carlo analysis of stochastic networks. Math. Comput. Simulat. 21, 376-384.
Sigal, E.C., A.A.B. Pritsker and J.J. Solberg (1980). The stochastic shortest route problem. Oper. Res. 28, 1122-1130.
Sinclair, A. (1993). Algorithms for Random Generation and Counting: A Markov Chain Approach, Birkhäuser, Boston, Mass.
Snyder, T.L. (1992). Minimal rectilinear Steiner trees in all dimensions. Discrete Comput. Geom. 8, 73-92.
Snyder, T.L. (1991). Lower bounds for rectilinear Steiner trees in bounded space. Inf. Process. Lett. 37, 71-74.
Spencer, J. (1993). The Janson inequality, in: D. Miklós, V.T. Sós and T. Szőnyi (eds.), Combinatorics, Paul Erdős is Eighty, Vol. 1, Bolyai Mathematical Studies, Keszthely (Hungary), pp. 421-432.
Steele, J.M. (1981a). Subadditive Euclidean functionals and non-linear growth in geometric probability. Ann. Probab. 9, 365-376.
Steele, J.M. (1981b). Complete convergence of short paths and Karp's algorithm for the TSP. Math. Oper. Res. 6, 374-378.
Steele, J.M. (1990a). Probabilistic and worst-case analyses of classical problems of combinatorial optimization in Euclidean space. Math. Oper. Res. 15, 749-770.
Steele, J.M. (1990b). Seedlings in the theory of shortest paths, in: G. Grimmett and D. Welsh (eds.), Disorder in Physical Systems: A Volume in Honour of J.M. Hammersley, Cambridge University Press, London, pp. 277-306.
Steele, J.M., and T.L. Snyder (1989). Worst-case growth rates of some classical problems of combinatorial optimization. SIAM J. Comput. 18, 278-287.

Stougie, L. (1990). Applications of probability theory in combinatorial optimization, Class Notes, University of Amsterdam.
Valiant, L.G. (1979). The complexity of enumeration and reliability problems. SIAM J. Comput. 12, 777-788.
Van den Berg, J., and H. Kesten (1985). Inequalities with applications to percolation and reliability. J. Appl. Probab. 22, 556-569.
