I’ve written about various theorems in combinatorics so far, but I don’t think any of them have the flavour of today’s theorem.
We’re going to concentrate on the set , which I might sometimes write as (that’s standard notation in combinatorics). This set has lots of subsets, such as , and itself, and (the empty set, which contains no elements at all). How many such subsets?
Well, there are elements, each of which either belongs or does not belong to any given subset. So there are possibilities for whether a given element belongs to a given subset, so subsets in total. Write out the subsets of , say, if you don’t believe me!
How can we think about these subsets? There are two helpful pictures I have in mind; sometimes one is more convenient, sometimes the other.
Here’s the first, which might explain why mathematicians sometimes talk of this object (the family of subsets of ) as the cube. My limitations when it comes to graphics software mean that I’ve written for , the empty set.
Hopefully this picture will make it all clear, and I don’t need to say anything. But there is one thing I want to say. This picture shows a graph (in the combinatorics sense), in which the vertices are the subsets of , and two subsets are joined exactly when they differ by just one element. By positioning the vertices carefully, I end up with a picture that we all recognise as a cube. But if you’re struggling to imagine a 15-dimensional cube, don’t worry about it: we can think about it more abstractly, without needing to draw a picture. (I can draw representations of the 4-, 5- and 6-dimensional cubes if necessary, though — I’ll leave that for you to think about.)
What’s the other way of thinking about this family of subsets? I’m going to call this the layered model (I don’t know of a standard term for it). Each layer contains all the subsets of the same size. So here it is for . I’ve written in what the various subsets are, but normally I’d just draw the blobs without the subsets.
Notice that the top and bottom blobs are the same size because they contain the same number of sets, and similarly the second and third blobs are the same size. In general, the picture would look something like this.
The picture on the left is for even; the picture on the right is for odd. Why the difference? If is even, then the largest layer will correspond to subsets of size (exercise: prove this!). If is odd, then there are two largest layers, corresponding to subsets of size and (that is, and ). Again, I’ll leave this for you to prove.
Now, what sort of questions can we ask about the family of subsets of ?
Here’s one. The family contains some chains. Here’s a linguistic analogy. Have you come across those word puzzles where each answer uses the letters of the previous answer, plus a new letter? For example, the chain of answers might be: A, AT, PAT, TAPS, PASTE, PLATES, STAPLER, PRATTLES (to give an example off the top of my head). A chain in the family of subsets uses the same idea, but adding new elements of each time, rather than new letters. For example, here’s a chain in : the collection , , , , . And here’s another: , , .
Formally, a chain is a collection of subsets where for each pair of subsets and in the chain, either or . So the subsets are nicely nested. Get the idea?
So what’s the question? If you’re like me, you’ll have looked at my chain of words above, and wondered whether we could add another word on the end. (I didn’t try very hard, so it may well be possible.) And could I have made a longer chain, if I’d started in a different place or chosen a different option? In fact, what’s the longest possible chain of words like this (bearing in mind that we have 26 letters to choose from)?
Well, we could ask the same question about chains in . (And since we’re only after subsets, rather than valid English words, it might be an easier problem!) So: what’s the length of the longest chain in ?
It’s pretty easy to see that we can always have a chain of length . I gave an example of such a chain when , and hopefully you can see how this generalises. (Notice that I’m not claiming this is the only example of a long chain: there are lots!)
Can we ever have a longer chain?
Here’s where the layer picture helps us a lot. I’m imagining a chain as a line working from top to bottom (or bottom to top!) through the layers. Could the chain ever have two sets from the same layer? Err…quick think…no. (Do you agree?) So any chain contains at most one subset from each layer. And how many layers are there? Err…quick think…, right? Oh, but that means any chain has length at most , and since we’ve already seen that length is possible, that must be the answer!
That didn’t detain us for very long. (I bet it’s much harder to solve the problem with words!)
Let’s ask a different question, perhaps the opposite question, in a way. What happens if, instead of insisting that every subset contains or is contained in every other subset, we instead say that no subset can contain or be contained in another? (I feel confident that one could generate a very, very long list of words like that, so I’m not even going to bother!) We’ll call the resulting objects antichains. (Some people call them Sperner families, but I prefer `antichain’ because it seems more descriptive.)
Here’s an example in : , , . No subset contains or is contained in any other, right? Could we make that antichain longer? Yes, we could add , but not anything else. (Check this!) But could we have made a longer antichain by starting in a different place? What’s the length of the longest possible antichain? And what’s the length of the longest possible antichain for general ?
Hopefully you’ve now played around with this problem, and perhaps have made some conjectures. (Perhaps you even have proofs!)
It doesn’t take very long to convince ourselves that any single layer (all the subsets of some fixed size) will be an antichain. The biggest layer has size (if is even, or something very close to that if is odd), so that gives us a big antichain.
A bit more fiddling around, and we can’t find anything bigger. The answer from the chain problem was to pick exactly one element from each layer; could it be that the answer to the antichain problem is to pick the whole of one layer (the biggest one if is even, or one of the biggest two if is odd)?
It turns out that this is the case, and that’s this week’s theorem. I don’t know why it’s called Sperner‘s lemma rather than Sperner’s theorem. Wikipedia calls it Sperner’s theorem, but my lecture notes from the course in which I originally came across the result (in a Cambridge Part III course on Combinatorics by Prof. Imre Leader) definitely call it Sperner’s lemma, and so that’s how I think of it.
Theorem (Sperner’s lemma) The biggest antichain in the family of subsets of has size (if is even) and size (if is odd).
(I could have written this as a single statement by writing , meaning the floor function of . I’m going to do this from now on, to avoid writing two cases, but this is really a technical detail.) Here is the usual binomial coefficient, so just means the size of the middle layer.
So how do we prove this? I’m not going to go into the details (because this post is quite long already), but I’ll outline the idea and let you fill in the rest.
Earlier, we saw that every chain of subsets of has length at most . That was because any chain intersects a layer (in the above sense) in at most one place, and there are layers. Rephrasing that in light of our new knowledge, we could say that any chain intersets an antichain in at most one place, and we can partition the family of subsets of into disjoint antichains (namely the layers). Same thing, right?
Could we do something similar here? Could we argue that any antichain intersects a chain in at most one place, and that we can partition the family of subsets of into disjoint chains? If so, we’d have proved Sperner’s lemma, right?
The first part is pretty simple. It’s clear, if you think about it for a moment, that any antichain intersects a chain in at most one place. (If you’re not sure, go back to remind yourself what `chain’ and `antichain’ mean.)
The second part takes a little bit more work, and that’s where I’m not going to go into it. The idea is to show that we can build the chains one step at a time. We start in the top layer, and show that we can take the subset in it and `join’ it to a subset in the layer below (meaning that it contains the subset in the layer below). Then we take each subset in the second layer, and show that we can `join’ each of them to a subset in the layer below (again in such a way that each subset contains the subset below that it’s joined to), in such a way that each subset in the second layer gets a different subset in the third layer. We continue this right down to the middle. Then we do the same from the bottom layer working up to the middle layer, and join them in the middle. So the constraint on the number of resulting chains is the size of the middle layer, as we wanted. I’ve deliberately been a bit vague here. The only key idea that I haven’t mentioned, though, is the use of Hall’s marriage theorem to show that it’s possible to do the `joining’ I’ve described.
We’ve seen an example of an antichain that has the maximum size (the middle layer if is even, and either of the middle layers if is odd). It would be very natural to ask whether there are any others.
It turns out that there aren’t, but note that the proof above doesn’t show this. It’s possible to give another proof of Sperner’s lemma, using the so-called LYM inequality, that does give this information.
This is an example of a result in extremal combinatorics. Extremal because we wanted the lengths of the longest possible chains and antichains. Sometimes it’s possible to classify the extremal examples (as with the antichains — we know what the longest ones are); sometimes that would be harder (as with the chains, where there are lots).
I’ve already given lots of links to Wikipedia. There’s a nice book by Bollobás that discusses Sperner’s lemma and quite a lot else. It’s called Combinatorics, with the subtitle Set systems, hypergraphs, families of vectors, and combinatorial probability.