I told you on a previous occasion about Lagrange’s theorem about sums of four squares, which says that every positive whole number can be written as the sum of four squares (including as a square). This post is about a generalisation of that idea.
Everybody always seems to say something along the lines of “In 1770 the mathematician Edward Waring asserted that …”. I like to imagine that Waring had just been reading about the theorem of Lagrange I mentioned in the last paragraph and that he started speculating about possible generalisations. Then, in my imagined picture of what happened, he played around a bit with some numbers, and finally made his famous assertion. I’ll tell you now what that is (in a modern phrasing), calling it a theorem because it has now been proved, but for a long time it was unproved and known as Waring’s problem.
Theorem For any , there is some (possibly depending on ) such that every positive integer is the sum of th powers.
For example, Lagrange’s theorem tells us that when we can take : every positive integer is the sum of squares. And the assertion is that something similar is true for cubes, and fourth powers, and so on — but the number of summands is allowed to increase. Waring claimed that every positive integer is the sum of nine cubes, or nineteen fourth powers, “and so on”.
Waring didn’t, as far as we know, offer any proof of his claim. The first person to prove the result was Hilbert, in 1909. A few years later, Hardy and Littlewood gave a new proof. You might wonder why they bothered, given that the result had already been proved, but as I tried to convey in my last post, about Szemerédi’s theorem, there can be very interesting consequences of giving new proofs. In this case, the technique developed by Hardy and Littlewood, now called the Hardy-Littlewood circle method, has turned out to have many other interesting consequences. Moreover, Hardy and Littlewood actually proved a stronger result that gave more information about the structure of the th powers. I’ll try to give you some idea of what they proved and the flavour of how they went about it (because it’s perhaps quite surprising!). I should note that the version of the circle method usually used today is one developed by Vinogradov, who adapted what Hardy and Littlewood did when working on his three primes theorem.
For convenience, I think I’ll just talk about squares here, rather than general th powers, but everything I’ll say can be generalised to higher powers too. So we’d like to know that if is large enough then every positive whole number is the sum of squares. If we can prove that from some point onwards (say ) this is true (every sufficiently large whole number is the sum of squares), then we’ll certainly be done, because if necessary we can make the smaller numbers by adding a lot of times. (This might not give the smallest possible value of that works, but for now we’re just trying to establish that there is some value that works. Of course, Lagrange’s theorem tells us that the best we can take is , but we’re trying to find another proof that there’s some that works, and our aim is to find an argument that will generalise to higher powers.) So let’s take some large positive whole number and try to show that it’s the sum of squares.
That is, we want to know that there’s a solution to the equation . The first step that Hardy and Littlewood took is to try to solve a problem that seems slightly harder. That is, they tried to count the number of solutions to that equation. If we knew how many solutions there are, we could check whether there’s a positive number of them and then we’d be done (if the number of solutions is positive, then there must be at least one solution). It does seem that this is a harder problem (after all, we’re trying to get more information), but it turns out to give us a way into the problem that we might not otherwise have.
So, how can we possibly hope to count the number of solutions? Just before we launch into that, let’s think about whether we really need to count them. Well, no we don’t. We just need to show that the number of solutions is positive (and then it must be at least one). So instead of trying to get an exact expression for the number of solutions, we’ll try to find an asymptotic formula. We’re going to look for an expression of the form
(number of solutions to ) = (main term) + (error term),
where the main term and error term both depend on . If we can show that, for large enough , the main term is both positive and bigger than the error term, then the number of solutions to the equation must be positive. (If we think about the function , for example, I can confidently say that it must be positive for large enough , because for large the first term will be much, much bigger than the second.) How does that help? We’ve now replaced an exact problem (show that an equation has a solution) by an approximate one (estimate the number of solutions to the equation). That said, the pure mathematician in me insists that I emphasise that there’s nothing vague about this: we have to estimate the error term very carefully, to be sure that it really is smaller than the main term!
I’m not quite sure how to describe the next step in a non-technical way. Here’s my plan. I’m going to write down an expression that (it turns out) gives the number of solutions to the equation . Then I’ll try to give some idea of what it means for people who haven’t come across integration. Then I’ll say how you could check that it works (if you know a bit of integration), and finally I’ll say how you could have thought of it yourself (if you know a bit of Fourier analysis). So…here’s the expression:
(number of solutions to ) = .
What does this mean? The symbol means “the integral of the function with respect to the variable “. Roughly speaking, it’s like a sort of special infinite sum, “summing” the values for every real number between and . The remarkable thing (as I see it) is that this expression, which uses calculus and the properties of real numbers, as well as complex numbers, gives the number of solutions to an equation that uses only integers!
If you know a little bit about integration and complex numbers, it’s not too hard to mess around with the above integral and check that it works, using the useful fact that if is an integer then is if and otherwise.
If you know a little bit about Fourier analysis, you can derive the above expression without too much difficulty, using convolution and Fourier inversion. (Start by writing the number of solutions to the equation as an -fold convolution of the indicator function of the squares.)
Great. So we’ve set ourselves a harder task (count the number of solutions), and written down a slightly scary expression. How could that possibly have been a good idea?! Well, it turns out that we can estimate this integral, which is exactly what we were hoping to do. Describing the Hardy-Littlewood circle method as a method feels like a bit of a con to me, because really it can be summed up as “write the thing you want to count as an integral, and then do something cunning to estimate that integral”, and the “something cunning” depends on the problem. But it’s still a very powerful idea.
How can I describe to you how to estimate the integral, bearing in mind that this post is already quite long? I think I’m going to cheat, and not really try. I might well return to this topic in a future post, though, because there are interesting things to say. For now, I’ll just say that it turns out that there are two quite different types of . Some (the “minor arcs”) lead to such a small contribution to the integral that we can ignore them (meaning that their contribution is small enough that it lies in the error term, which we are carefully bounding). To show this, one uses tools such as Weyl’s inequality (which would be an interesting future theorem of the week). The others (the “major arcs”) behave rather differently: they do make a significant contribution to the integral, and it turns out that we can estimate that contribution using the structure of the points in the major arcs. I realise this paragraph has given you essentially no useful information! But perhaps it has whetted your appetite for a future post that says something more.
What else should I say about Waring’s problem? Once we know that for each there is some that works, it is very natural to ask what the smallest such is. This question has now been answered — but I think it’s not quite the right question to be asking. Here’s why. Let’s think about trying to write numbers as sums of fifth powers. How can we write in this way? Well, the first fifth power after is , so we can use only s to write , so we need thirty-one summands. But we need thirty-one summands for a rather boring reason (the lack of small fifth powers). Perhaps if we look at large numbers (e.g. those bigger than 100000000000000), we can get away with using fewer fifth powers, because there are more available. So perhaps a more interesting question is to ask how many summands we need (what the smallest is) if we can exclude finitely many bad numbers: if we are only trying to write sufficiently large numbers. And this question has not been fully answered. When , the right answer is , and it turns out that when , the right answer is . But the exact answer is not known for any other value of ! There’s a list of bounds on the Wikipedia page.
My favourite book on the subject is the snappily titled Analytic methods for Diophantine equations and Diophantine inequalities by Davenport. The other classic that comes to mind is The Hardy-Littlewood method by Vaughan, but there are naturally many other books that discuss the topic too. There’s a Wikipedia page about Waring’s problem, and one about the Hardy-Littlewood circle method.