JOURNAL OF COMBINATORIAL THEORY, Series A Vol. 24, No. 1, January 1978

Existence of Submatrices with All Possible Columns

J. MICHAEL STEELE

Department of Mathematics, University of 'British Columbia, Vancouver, Canada

Let M be a matrix with entries from {], 2,..., s} with n rows such that no matrix M' formed by taking k rows of M has s distinct columns. Let f(k; n, s) be the largest integer for which there is an M with f(k; n, s) distinct columns. It is proved that f(k; n, s) = S'  rlk (i) (S  1)j. This result is related to a conjecture of Erdbs and Szekeres that any set of 2112 + 1 points in R2 contains a set of k points which form a convex polygon.

1. INTRODUCTION

The theorems provided in this note are motivated by questions like the following:

Suppose an n set x,, X2 .... y X. is colored by s colors in m distinct ways. How large need m be to guarantee that there is (1. 1) a k set colored in all possible (i.e., sk) ways?

Suppose that S is a class of subsets of a set X and that {Xl , X2 .... 1 Xn} is an nelement subset of X for which m of the sets A n {xl , X2 x,,}, A e S, are distinct. How large need (1.2)
m be to guarnatee that there is a kelement set {xil , xi . xi,'I C
{xl , x, x,,} for which there are 2k distinct sets
A Xi,,}, A e S?

The first of these questions is new, but the second has been considered previously. It has in fact been solved quite precisely by Sauer [4] in response to a query of Erdiis. An earlier independent solution was given in [51 in connection with a, probabilistic application, but the result of [5] was not the best possible. In Section 2 of this note Theorem 2.1 gives a general result by a new method which implies these earlier results and covers the fresh ground indicated by question (1.1).
The third section gives a geometrical interpretation to a special case of Theorem 2.1, and shows the relationship of the present work to a longstanding conjecture of Erd6s and Szekeres (see [1, p. xxi]).
84
00973165178102410084\$02.0010 Copyright C 1978 by Academic Press, Inc. All rights of reproduction in any form reserved.

EXISTENCE OF SUBMATRICES 85

2. MAIN RESULTS

Let M be a matrix with entries from an ssymbol alphabet {1, 2,..., s}. Now let f (k; n, s) be the largest integer such that there is a matrix M with n rows and f (k; n, s) distinct columns such that no matrix M' formed by taking k of the rows of M has Sk distinct columns.
To note the relationship of f (k; n, s) to question (1. 1) one defines a correspondence between matrices and sets of colorings as follows: M (aij), where aij ~ b and b is the color of xi in the jth coloring of {x, , X2 3 Xn}. For any subset of elements {xi,,xi,,...,xi,,}C{x,,x,,...,xn} there is a corresponding subset of k rows of M which forms a submatrix M'. Further, since any coloring of {xj,~, Xi, Xi,} corresponds to a column of M, the number of distinct colorings of {xi, , xi. xi k } equals the number of distinct columns of M'. In the notation of (1. 1) we therefore have m f (k; n, s) +
The main result can now be stated quite succinctly.

THEoREm 2. 1.

n (n
f(k; n, s) = s'n  1 )(S (2.1)
jk

Proof. First it will be shown by construction that f (k; n, s)
Sn  yn k (n)(S  1)nj, and then the opposite inequality will be proved
j j afterward by relating the general case to the first construction.
Define M to be the matrix consisting of all columns such that no column contains k or more ones. Since Eilk (j)(S  1)nJ is precisely the number of
'n
columns with k or more ones, we see that M has Sn  Y,k (,D(S  Wj
columns. But since no krow submatrix of M contains the column of all
1)ni
ones we have f(k; n, s) >, S"  YInk CD(S
To obtain the opposite inequality we suppose that a matrix M has no krow submatrix with Sk columns. Todescribe the columns which are missing from M, let C,, C2,..., C, where (n,) . 7. be a list of the kelement subsets
of the row indices. For each i = 1, 2 . r there is a submatrix Mi formed by
the Cj rows of M. Also by the hypothesis there is a kvector vi which is not
a column of Mj. Now for each such vi let Zi be the set of columns of the
n x sn matrix which equal vi when restricted to the index set Cj . Finally
observe that none of the columns of Z = UT , Zi is a column of M.
If v denotes the number of columns of Mthen V i= Uri1 Zi).
The proof will be completed by obtaining a lower bound on 1 UT 1 Zi 1.
To do this we define a function on column vectors W = (W, W21 ... 1 Wj as
follows:

0(w) W, where w' (W,', W2 Wn)
(2.2)

86

and

7 ' :C~ 7
~T 1E
~1  EnLE

~ ~.

c Z~ ~and i c 7, 0 :0In

The, funetion has
Jfl
, rst note and t_hell prove

1

i=1

oK1
r, centains all of the ~2 xl
1strieLed ~o G, equal the Jector (1,

,n
1 <
P(Z) 1 >
(S

+4

The proof of (2.4) is immediate since 0 is a function, and (2.5) is just a consequence of (2.3). To prove (2.6) note that if w has k or more ones, then there is a Cj , restricted to to which w has all ones, and hence w e O(Zi), by (2.3) and the definition of Zi. Finally (2.7) comes from (2.6) and easy
counting.
The last calculation is that

n
v < S  1 Z 1 OW1 < sn (S  WJ1 (2.8)
j k i

which completes the proof.
The preceding method also permits a precise understanding of those extreme matrices which lack krow submatrices with a complete column set. Such matrices are characterized by a "missing" column vector.

THEOREm 2.2. Suppose M is an nrow matrix with s"  I", (~)(s  1)i distinct columns and which has no krow submairix with s11 distinct columns. Then there is an n vector v such that for each column w of M one has wi 0 vi for at least k values of the index 1.

Proof. In the notation of the previous proof, we note that if there is no v as required above then there are vi and vj such that Cj n Cj =~= o yet vi and vj are not equal on Cj n C,. By the definition of 0 and Zi we therefore have 1 O(Zi u Z) I < 1 Zi u Z, J. Consequently, we have 1 O(Z) I < 1 Z 1.
But, since M has Sn YInk (nIXS' 1)1 distinct columns, we note j Z 1 = yn
_ik CjXS 1)n However, by (2.7) we know 1 0(Z)l >,

in
1k (7)(S  1)n~ so the inequality 1 0(Z)l < 1 Z 1 yields a contradiction.

1

87

,c TO A,

one can
This
'r)n
'd o' 19I~ 1933, pLibiisi'ie~i in 1935, promulgated daily,
n s o I ve f.`1 i& k ~~ ~E H, pp. 42; 2; 3j.
IC, of Section 2 are relevant to this conjecture of Erdos and
since they provide a sufficient condition that a set contain a convex

To sce this let X_ be the plane and S the class of convex subsets of 2'.
n A; A c SJI 1
{{Xl ' ~12 1, XW

that 'S, A (Xl 1 X2 1 ... 1 'X,,) is the number of subsets {xi,, xi~ xiJ C
{Xl 1 X2 xn} such that {xl xi . 1VitJ = {Xl 1 X2 XJ n A for some
A c S. Let A~ , j = 1, 2,...1 A(Xl , X2 ... 5 X.), be elements of S such
that each of the sets {x, , X2 Xn} 0 Aj is distinct. These Aj define a
n X A(xl , x. x.) matrix as follows:

aij = 1 if xi c Aj (3.2)
= 0 if xioAj.

By the definition of the Aj we know that M = (ai) has A (Xl 1 X2 ...I X.) distinct columns so

A (Xl , X2 Xn) < f (k; n, 2) (3.3)

unless M has k rows which have 24' distinct columns. But since A(xi 1 ' xi2 Y ... 5 xJ = 2k if and only if the set {xi. ' X2 xi,,} forms a convex polyhedron, we have proved the following:

THEOREm 3.1. A sufficient condition that the set {Xl X, Xn} C R'
contains k points which form a convex polygon is that
k1
(Xl X2 Xn) > 1 n (3.4)
j0

To prove the ErdbsSzekeres conjecture it thus suffices to show that '(3.4) holds when n = 2 k2 + 1. Of course, condition (3.5) has only been proved sufficient and quite possibly the ErMsSzekeres conjecture can be true without (3.4) being met. Still, there are several possible uses of A (xl , x2 x.) in this problem and (3.4) pinpoints the most direct one.
To gain another view of Theorem 3.1 one should note that it is possible
to give a more geometrical proof which av ' oids invoking the full strength
of Theorem 2. 1. For this proof, suppose B c {{Xl , X2 XJ n A: A c S}

88 J. MICHAEL STEELE

and let OB denote the subset of B equal to the elements of B on the boundary of the convex hull of B. We note that 1 aB 1 <_ k  1 if {x, , x, x,,} contains no kelement convex polygon, since, indeed, aB is convex polygon.

yk1 (~)
Next note that there are precisely j=0 J subsets of {x, , X2 XJ with
fewer than k elements. Since aB uniquely determines B we have
k1 (n)
A(xl 1 X21 1 Xj < ~', (3.5)
j=0 j

unless {xl, x,,..., x.,,} contains a kelement subset which forms a convex
polygon. This completes a second proof of Theorem 3.1.

4. A CLOSELY RELATED PROBLEM

In connection with the results given here and the ErdbsSzekeres con
jecture the following question seems quite interesting:

What is the minimum value Of A(Xl 1 X21 ... 1 x,,) given that
{X,, x,,..., xj contains a kset which forms a convex polygon ? (4.1) (The xi are assumed noncolinear.)

If this value is called g(n, k), it is trivial that g(n, k) >, 211, but a sub
stantial improvement on this seems difficult. Still, by consideration of this problem it may be possible to make progress of the yet unreachable con.. jecture of Erd6s and Szekeres.

REFERENCES

1. P. ERD6s, "The Art of Counting" (J. Spencer, Ed.), MITTress, Cambridge, Mass., 1973.
2. P. ERD68 AND G. SZEKERES, A combinatorial problem in geometry, Compositio Math. 2
(1935),463470.
3. P. ERD6S AND G. SWKERES, On some extremurn problems in elementary geometry, Ann.
Univ. Scl. Budapest. E45tvjs. Sect. Math. 34 (19601961), 5362.
4. N., SAUER, On the density of families of sets, J. C ombinational Theory A 13 (1972),
145147.
5. V. N. VAPNIK AND A. YA. CHERVONENKIS, on the uniform convergence of relative
frequencies of events to their probabilities, Theor. Probability Appl. 16 (1971), 264280.

Printed by the St Catherine Press Ltd., Tempelhof 37, Bruges, Belgium.