
J. Appl. Prob. 16, 657–661 (1979)

Printed in Israel

© Applied Probability Trust 1979

RANDOM EXCHANGES OF INFORMATION

DAVID W. BOYD* AND

J. MICHAEL STEELE,** The University of British Columbia

Abstract

Suppose that n persons each know a different piece of information, and that whenever a pair of persons talk on the telephone each tells the other all the information that he knows at the time. If calls are made at random, we show that the expected number of calls necessary until everyone knows all n pieces of information is asymptotically (3/2)n log n. This sharpens an earlier result of J. W. Moon.

CONTAGION; RANDOM MATRIX; TELEPHONE PROBLEM


1. Introduction

The object of this paper is to give a solution to a problem raised by Moon [4] concerning a model of random contagion. Of the several ways to describe this model, one has become traditional. Suppose that n persons each know a different piece of information, and whenever two of them talk on the telephone each tells the other all the information he knows at the time. The way in which information spreads through such a system suggests a variety of problems.

The popular 'telephone problem', attributed to A. V. Boyd in [4] and [5], is to determine the minimum number of calls c(n) required until everyone has learned all n pieces of information. It is clear that c(1) = 0, c(2) = 1 and c(3) = 3, and A. V. Boyd exhibited a scheme of calls to show that if n ≥ 4, c(n) ≤ 2n − 4 (such a scheme is described in [5]). He conjectured that c(n) = 2n − 4 for n ≥ 4, and this was proved independently by Tijdeman [5], Bumby and Spencer [1], and Hajnal, Milner and Szemerédi [3].

Moon [4] considered instead the situation in which the calls are made at random (i.e. the operator chooses two different persons at random from the n and places the call). Suppose C calls are made before each person knows all the

Received 11 May 1978; revision received 25 September 1978.

* Postal address: Department of Mathematics, The University of British Columbia, 2075 Wesbrook Mall, Vancouver, B.C., Canada V6T 1W5.

**Present address: Department of Statistics, Stanford University, Stanford, CA 94305, U.S.A. Research supported in part by the National Research Council of Canada.


information. The problem is to determine E(C), the expected value of C. Moon gave an elegant proof of the bounds

(1 − ε)n log n ≤ E(C) ≤ (2 + ε)n(log n)²,

where ε > 0 and n is sufficiently large. Here we will prove the following more precise result.

Theorem. E(C) = (3/2)n log n + O(n(log n)^{1/2}).

A different formulation of the telephone problem, attributed to Wirsing in [5], may serve to put our result in a more general context. Let B_{ij} be the n × n matrix with all entries 0 except for a 1 in the (i, j) and (j, i) positions. Let A(0) = I and A(t) = A(t − 1)(I + B_{ij}) if there is a call between persons i and j at time t. Then a_{km}(t) > 0 if and only if the mth person knows the kth piece of information at time t. One sees this by induction, after observing that A(t) is obtained from A(t − 1) by replacing the ith and jth columns by their sum, leaving all other columns unchanged. Thus, if m = i or j, then a_{km}(t) > 0 if and only if at least one of a_{ki}(t − 1) > 0 or a_{kj}(t − 1) > 0 holds, while if m ≠ i, j then a_{km}(t) = a_{km}(t − 1). This corresponds to the fact that i and j know the kth piece of information at time t if and only if one of them knew it at time t − 1. Thus, if the B_{ij} are chosen at random, C is the first time at which A(t) is a strictly positive matrix.
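As an illustration (not part of the original paper), the random-call process is easy to simulate directly. The sketch below, in Python, tracks each person's knowledge as a bitmask rather than a matrix, and compares a Monte Carlo estimate of E(C) with the asymptotic value (3/2)n log n of the theorem; the function name and parameters are ours.

```python
import math
import random

def expected_calls_estimate(n, trials=2000, rng=None):
    """Monte Carlo estimate of E(C): the expected number of random calls
    until all n persons know all n pieces of information.
    Person i's knowledge is a bitmask; a call merges the two masks."""
    rng = rng or random.Random(0)
    full = (1 << n) - 1
    total = 0
    for _ in range(trials):
        know = [1 << i for i in range(n)]   # person i starts knowing only piece i
        informed = 0                        # persons who already know everything
        calls = 0
        while informed < n:
            i, j = rng.sample(range(n), 2)  # operator picks two distinct persons
            merged = know[i] | know[j]
            for p in (i, j):
                if know[p] != full and merged == full:
                    informed += 1
                know[p] = merged
            calls += 1
        total += calls
    return total / trials

n = 50
print(expected_calls_estimate(n), 1.5 * n * math.log(n))
```

For moderate n the two printed values should already be comparable, with the gap reflecting the O(n(log n)^{1/2}) error term.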


2. The lower bound

Let T_i be the first time at which everyone knows person i's original information, and let T*_i be the first time at which person i knows everyone's information. In terms of the matrix formulation, T_i is the first time that the ith row of A(t) has no zeros, and T*_i is the first time that the ith column of A(t) has no zeros. Since each B_{ij} is symmetric, it is clear that one has the following result.

Lemma. The distributions of the vectors (T_1, ..., T_n) and (T*_1, ..., T*_n) are equal.

By definition, C = max_{1≤i≤n} T_i = max_{1≤i≤n} T*_i. Moon used the observation that E(C) ≥ E(T_1) to obtain his lower bound. To improve this bound, it seems necessary to consider all the T*_i. Let T*_{(i)} be the ith smallest of the T*_j, so that T*_{(1)} ≤ ··· ≤ T*_{(n)} = C. Now let R_i be the number of calls until each of the persons 1, 2, ..., i has received a call. Then, for i ≤ n − 2,

(1)    E(C) ≥ E(T*_{(i)}) + E(R_{n−i−1}).

This follows from the fact that, at time T*_{(i)}, either i or i + 1 persons know


everything (since a call involves two callers). For the remaining n − i − 1 persons to learn everything, each must receive at least one call. If one defines E(R_0) = E(R_{−1}) = 0, then (1) is obviously valid for i = n − 1 and i = n. Averaging (1) over i, we have

(2)    E(C) ≥ n^{−1} Σ_{i=1}^{n} E(T*_{(i)}) + n^{−1} Σ_{j=−1}^{n−2} E(R_j).

But, from the lemma,

(3)    Σ_{i=1}^{n} E(T*_{(i)}) = Σ_{i=1}^{n} E(T*_i) = Σ_{i=1}^{n} E(T_i) = nE(T_1).

Combining (2) and (3) gives

(4)    E(C) ≥ E(T_1) + n^{−1} Σ_{j=−1}^{n−2} E(R_j).

As in [4], E(T_1) can be calculated by observing that T_1 = X_1 + ··· + X_{n−1}, where X_i is the number of calls after i persons know person 1's information until i + 1 know it. The X_i are independent and geometrically distributed with parameter p_i = 2i(n − i)/(n(n − 1)). Hence E(X_i) = 1/p_i and

(5)    E(T_1) = Σ_{i=1}^{n−1} p_i^{−1} = ((n − 1)/2) Σ_{i=1}^{n−1} (1/i + 1/(n − i)) = n log n + O(n).
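Equation (5) can also be checked numerically; the small sketch below (our notation, persons and pieces as in the text) computes E(T_1) exactly as a sum of the geometric expectations 1/p_i and compares it with n log n.

```python
import math
from fractions import Fraction

def expected_T1(n):
    """Exact E(T_1) = sum over i of 1/p_i, where
    p_i = 2 i (n - i) / (n (n - 1)) is the probability that a random call
    spreads piece 1 from the current i holders to a new person."""
    return sum(Fraction(n * (n - 1), 2 * i * (n - i)) for i in range(1, n))

for n in (10, 100, 1000):
    print(n, float(expected_T1(n)), n * math.log(n))
```

The difference between the two printed columns stays O(n), as (5) asserts.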

The calculation of E(R_i) is slightly more complicated. We observe that the process of generating calls at random is equivalent to the following: first generate numbers N_1, N_2, ... at random from {1, 2, ..., n}. Form the sequence of pairs (N_1, N_2), (N_3, N_4), .... Delete from this sequence any pair for which N_{2k−1} = N_{2k}, obtaining a sequence (N'_1, N'_2), .... Then a call is made at time t between N'_{2t−1} and N'_{2t}. Thus R_i is the first time t at which N'_1, ..., N'_{2t} contains all of {1, ..., i}. Let S be the first time at which N_1, ..., N_S contains all of {1, ..., i}, and M be the first time at which N_1, ..., N_{2M} contains {1, ..., i}. Clearly M ≥ S/2. Next let A_k be 1 if N_{2k−1} = N_{2k} and 0 otherwise. Put K = Σ_{k=1}^{M} A_k and note that R_i ≥ M − K. Since E(A_k) = 1/n, Wald's lemma ([2], p. 380) shows E(K) = E(M)/n, so

(6)    E(R_i) ≥ (1 − n^{−1})E(M) ≥ (1 − n^{−1})E(S)/2.

But S = B_1 + ··· + B_i, where B_j is the time between the occurrences of the (j − 1)th and jth elements of {1, 2, ..., i} in N_1, ..., N_S. So B_j is distributed geometrically with parameter (i − j + 1)/n. Hence E(B_j) = n/(i − j + 1), so E(S) = n Σ_{j=1}^{i} j^{−1}. Combining this with (6) gives

(7)    n^{−1} Σ_{i=1}^{n−2} E(R_i) ≥ (1/2) n log n + O(n),
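The quantity R_i can likewise be simulated. The following sketch (ours, not from the paper, with persons indexed 0, ..., n − 1) estimates E(R_i) and compares it with the lower bound (1 − n^{−1})E(S)/2 = (1 − n^{−1})(n/2) Σ_{j=1}^{i} j^{−1} obtained from (6) and the formula for E(S).

```python
import random

def estimate_R(i, n, trials=4000, rng=None):
    """Monte Carlo estimate of E(R_i): the expected number of random calls
    until each of persons 0..i-1 has received at least one call."""
    rng = rng or random.Random(1)
    total = 0
    for _ in range(trials):
        waiting = set(range(i))        # persons not yet involved in any call
        calls = 0
        while waiting:
            a, b = rng.sample(range(n), 2)
            waiting.discard(a)
            waiting.discard(b)
            calls += 1
        total += calls
    return total / trials

n, i = 30, 20
harmonic = sum(1 / j for j in range(1, i + 1))
lower = (1 - 1 / n) * (n / 2) * harmonic   # the bound from (6): E(R_i) >= (1 - 1/n) E(S)/2
print(estimate_R(i, n), lower)
```

The estimate should sit slightly above the bound, which is what (6) predicts: the bound gives away a little in the steps M ≥ S/2 and R_i ≥ M − K.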


and thus we have, combining (4), (5) and (7),

(8)    E(C) ≥ (3/2) n log n + O(n).

3. The upper bound

We first note that

P(C ≥ u) = P(max_{1≤i≤n} T_i ≥ u) ≤ nP(T_1 ≥ u).

Now, using the well-known formula E(C) = ∫_0^∞ P(C ≥ u) du ([2], p. 148), we see that, for any t > 0,

(9)    E(C) ≤ ∫_0^t P(C ≥ u) du + ∫_t^∞ P(C ≥ u) du ≤ t + n ∫_t^∞ P(T_1 ≥ u) du.

The exact distribution of T_1 is known from Section 2, and thus

(10)    P(T_1 ≥ u) ≤ e^{−su}E(e^{sT_1}) = e^{−su} Π_{i=1}^{n−1} p_i e^s/(1 − q_i e^s),

where p_i = 2i(n − i)/(n(n − 1)) and q_i = 1 − p_i. In (10) we must have 1 − q_i e^s > 0 for each i, so e^{−s} > 1 − 2/n. Write π_n = Π_{i=1}^{n−1} p_i e^s/(1 − q_i e^s) and then substitute (10) into (9) to obtain

(11)    E(C) ≤ t + ns^{−1}e^{−st}π_n.

Setting e^{−s} = 1 − c/n with c < 2, we have

(12)    log π_n = −Σ_{i=1}^{n−1} log(1 − c/(np_i)) = Σ_{j=1}^{∞} j^{−1} Σ_{i=1}^{n−1} (c/(np_i))^j.
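A step left implicit in passing to (12): with e^{−s} = 1 − c/n, each factor of π_n simplifies as follows (our verification, using q_i = 1 − p_i):

```latex
\frac{p_i e^{s}}{1 - q_i e^{s}}
  = \frac{p_i}{e^{-s} - q_i}
  = \frac{p_i}{(1 - c/n) - (1 - p_i)}
  = \frac{p_i}{p_i - c/n}
  = \left(1 - \frac{c}{np_i}\right)^{-1},
```

so log π_n = −Σ_{i=1}^{n−1} log(1 − c/(np_i)), and expanding −log(1 − x) = Σ_{j≥1} x^j/j gives (12); convergence requires c/(np_i) < 1 for all i, i.e. c < np_1 = 2, which is the restriction c < 2 above.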

If we set b_j = Σ_{i=1}^{n−1} (c/(np_i))^j, we have (as in [5]) b_1 ≤ c log n + O(1), and

Σ_{j=2}^{∞} j^{−1} b_j ≤ Σ_{j=2}^{∞} (c/2)^j Σ_{i=1}^{n−1} ((n − 1)/(i(n − i)))^j ≤ (c/2)²(1 − c/2)^{−1} Σ_{i=1}^{n−1} ((n − 1)/(i(n − i)))²,

since (n − 1)/(i(n − i)) ≤ 1 for 1 ≤ i ≤ n − 1. Now

Σ_{i=1}^{n−1} ((n − 1)/(i(n − i)))² ≤ Σ_{i=1}^{n−1} (1/i + 1/(n − i))² = Σ_{i=1}^{n−1} {1/i² + 2/(i(n − i)) + 1/(n − i)²} = O(1),

so that, by (12), there is a constant M for which

log π_n ≤ c log n + M(1 − c/2)^{−1}.


Since s ≥ 1 − e^{−s} = c/n, (11) now becomes

(13)    E(C) ≤ t + c^{−1} exp{2 log n − n^{−1}ct + c log n + M(1 − c/2)^{−1}}.

On setting c = 2(1 − (log n)^{−1/2}) and t = (3/2)n log n + An(log n)^{1/2}, the exponent in (13) becomes log n − (2A − 1 − M)(log n)^{1/2} + 2A = log n + 2A, if we choose A = (M + 1)/2. Then (13) becomes

E(C) ≤ (3/2) n log n + An(log n)^{1/2} + c^{−1}n exp(2A) = (3/2) n log n + O(n(log n)^{1/2}),

which completes the proof of the theorem.

References

[1] BUMBY, R. T. AND SPENCER, J. Unpublished paper cited in [5].

[2] FELLER, W. (1966) An Introduction to Probability Theory and Its Applications, Vol. II. Wiley, New York.

[3] HAJNAL, A., MILNER, E. C. AND SZEMERÉDI, E. (1972) A cure for the telephone disease. Canad. Math. Bull. 15, 447–450.

[4] MOON, J. W. (1972) Random exchanges of information. Nieuw Arch. Wisk. 20, 246–249.

[5] TIJDEMAN, R. (1971) On a telephone problem. Nieuw Arch. Wisk. 19, 188–192.