Kepler’s Laws of planetary motion follow very smoothly from Newton’s Law of Gravitation. Very little tough mathematics is needed for the proofs: they can be done with ordinary differential calculus and some knowledge of path integration.
Of course, from a historical point of view, these laws appeared in the reverse order. Kepler had neither Newton’s Law at his disposal, nor the calculus machinery that we have today. Instead, he arrived at his Laws through years of hard work on the astronomical tables compiled by Tycho Brahe; Kepler was unable to make astronomical observations himself, even with his self-invented telescope, as he had poor eyesight all his life. Later, Newton could rely on Kepler’s results to find inspiration for his Law of Gravitation: indeed, he knew that his results would be incomplete unless he could relate them to Kepler’s Laws. Together with his many other achievements (for instance, Kepler was the first to state Simpson’s rule, which is accordingly sometimes called “Kepler’s barrel rule”), this makes Kepler one of the most interesting minds of the early modern era. His interest in the planets’ orbits was rooted in astrology and the construction of horoscopes; his observations and deductions led him to drop first Copernicus’ idea that the orbits were circles and, later, the idea that Platonic solids were involved. Both Copernicus and Kepler revolutionized the thinking about space itself: Copernicus by showing how much more easily the orbits can be described when Earth itself is allowed to move (no more epicycles and the like), and Kepler by making it easier still by dropping circles altogether.
To appreciate how hard the problem of finding the planets’ orbits is, consider that we have plenty of observational data of the planets, but the data contain two unknowns: the orbit of Earth and the orbit of the planet. On top of that, our observations only tell us about the angle under which the planet is observed; we don’t learn anything about the distances involved (unless we have Kepler’s third law at our disposal). In a very insightful talk for a wider audience, Terry Tao has explained some of Kepler’s ideas on this, especially how Kepler dealt with the orbit of Mars, which for several reasons had been the trickiest orbit in the models that preceded Kepler. Tao mentions that Einstein regarded the discovery of Kepler’s Laws as one of the most shining moments in the history of human curiosity. “And if Einstein calls you a genius, you are really doing well.”
From these Laws and from what Newton and his successors achieved, many things can be inferred that are impossible to measure directly. For instance, the masses of the Sun and of all planets can be computed from here, once the gravitational constant is known (which is tricky to pinpoint, actually). Voltaire is quoted as saying, about Newton’s achievements (though it would fit Kepler as well), that the insights gained “semblaient n’être pas faites pour l’esprit humain” (“seemed not to be made for the human mind”).
To give just a tiny bit of contrast, we mention that Kepler also had erroneous thoughts that show how deeply he was still rooted in ancient ideas of harmonics and aesthetics. For instance, Kepler tried to explain why the Solar system had exactly six planets (or, to rephrase it a little closer to his thinking: why God had found pleasure in creating exactly six planets). For some time, he believed that the reason was related to the fact that there are exactly five Platonic solids, which define the structure of the six orbits around the Sun. Those ideas were also related to the integer harmonies of a vibrating string, as the planets were supposed to move in a harmonic way themselves. Of course, in those days the observations were limited to the planets up to Saturn, as the outer planets (and dwarf planets) cannot be found with the naked eye or with the telescopes at Kepler’s disposal; all such ideas were doomed to be incomplete. However, his quest for harmonics in the Solar system led him in the end to his Third Law. On another note, Kepler was mistaken in the deduction of his First Law, since he lacked the deep knowledge about integration that would be developed decades later; luckily, his mistake cancels out with another mistake later on: “Es ist schon atemberaubend, wie sich bei Keplers Rechnungen letztlich doch alles fügt.” (“It is stunning how everything in Kepler’s computations adds up in the end”; have a look at Sonar’s highly readable and interesting book on the history of calculus for this).
In what follows, we shall show how Kepler’s Laws can be proved, assuming Newton’s Law of Gravitation, in a purely mathematical fashion. There will be no heuristics from physics or from astronomy, only the axiomatic mathematical deduction that mostly works without any intuition from the applications (though we will look at motivations for why some definitions are made the way they are).
As a nice aside, we can look at the mathematical descriptions of the conic sections on which the First Law relies. But here again, we make no connection to why these curves are called conic sections.
Let us state Kepler’s Laws here first.
Kepler’s First Law of Planetary Motion: Planets orbit in ellipses, with the Sun as one of the foci.
Kepler’s Second Law of Planetary Motion: A planet sweeps out equal areas in equal times.
Kepler’s Third Law of Planetary Motion: The square of the period of an orbit is proportional to the cube of its semi-major axis.
Let us prove these Laws, following the account in Königsberger’s book on calculus. Many calculus books deal with Kepler’s Laws in a similar axiomatic fashion, yet we stick to this account as it appears to be the neatest one without conjuring up too much physics.
We shall give a couple of technical lemmas first.
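The Triangle-Lemma: The triangle marked by the points $(x_1,y_1)$, $(x_2,y_2)$, $(x_3,y_3)$ has area
$$\frac12\left|\det\begin{pmatrix}1 & x_1 & y_1\\ 1 & x_2 & y_2\\ 1 & x_3 & y_3\end{pmatrix}\right|.$$
Proof: The triangle together with the $x$-axis marks three trapezoids:
$T_{12}$ with the points $(x_1,0)$, $(x_2,0)$, $(x_2,y_2)$, $(x_1,y_1)$;
$T_{23}$ with the points $(x_2,0)$, $(x_3,0)$, $(x_3,y_3)$, $(x_2,y_2)$;
$T_{13}$ with the points $(x_1,0)$, $(x_3,0)$, $(x_3,y_3)$, $(x_1,y_1)$.
Thus, the area of the triangle is:
$$|T_{12}| + |T_{23}| - |T_{13}|.$$
The signs represent the situation given in the figure, where $x_1 < x_2 < x_3$, all $y_k$ are positive, and $(x_2,y_2)$ lies above the side joining the other two points. For other triangles, another permutation of the signs may be necessary, but there will always be exactly one negative sign. Other permutations of the signs only represent a re-numbering of the points and therefore a change of sign in the determinant given in the statement. As we put absolute values in our statement, we avoid any difficulties of this kind.
As each of the trapezoids has two of its points on the $x$-axis, we find $|T_{jk}| = \frac12 (x_k - x_j)(y_j + y_k)$, and therefore the area of the triangle is
$$\frac12\Big[(x_2 - x_1)(y_1+y_2) + (x_3 - x_2)(y_2+y_3) - (x_3-x_1)(y_1+y_3)\Big] = \frac12\Big[(x_2y_1 - x_1y_2) + (x_3y_2 - x_2y_3) + (x_1y_3 - x_3y_1)\Big],$$
which is $-\frac12$ times the determinant in the statement. Q.e.d.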
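As a quick numerical sanity check of the Triangle-Lemma, the following Python sketch evaluates half the absolute value of the determinant with rows $(1, x_k, y_k)$ and compares it with Heron’s formula. The function names are illustrative choices, not taken from the account we follow.

```python
import math

def triangle_area(p1, p2, p3):
    """Half the absolute value of det [[1, x1, y1], [1, x2, y2], [1, x3, y3]]."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # Laplace expansion of the 3x3 determinant along its first column
    det = (x2 * y3 - x3 * y2) - (x1 * y3 - x3 * y1) + (x1 * y2 - x2 * y1)
    return abs(det) / 2

def heron_area(p1, p2, p3):
    """Independent cross-check: Heron's formula from the three side lengths."""
    a, b, c = math.dist(p2, p3), math.dist(p1, p3), math.dist(p1, p2)
    s = (a + b + c) / 2
    return math.sqrt(s * (s - a) * (s - b) * (s - c))

print(triangle_area((0, 0), (4, 0), (0, 3)))  # right triangle with legs 4 and 3
print(triangle_area((1, 2), (5, 3), (2, 7)), heron_area((1, 2), (5, 3), (2, 7)))
```

Re-numbering the points only flips the sign of the determinant, which is why the absolute value makes the result independent of the order of the arguments.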
Lemma (Leibniz’ sector formula): Let $\gamma\colon [a,b]\to\mathbb{R}^2$ be a continuously differentiable path, $\gamma(t) = (x(t), y(t))$. Then the line segment from the origin to the points of the path sweeps the area
$$F(\gamma) = \frac12\int_a^b \big(x\dot y - y\dot x\big)\,dt.$$
Note that we have used Newton’s notation for derivatives. One might also write the integral as $\frac12\int_\gamma (x\,dy - y\,dx)$.
Proof: Let us first clarify what we understand by “sweeping” line segments. Consider the path given in the image.
As this path is not closed (it’s not a contour), it doesn’t contain an area. But if we take the origin into account, we can define an area that is related to where the path is:
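Now, pick a partition of $[a,b]$, such as $a = t_0 < t_1 < \dots < t_n = b$, and make a polygon of the points $\gamma(t_0),\dots,\gamma(t_n)$ and the origin. The corresponding triangles form an area that approximates the area bounded by $\gamma$, as above.
As the partition gets finer, we expect that the polygon-area converges to the $\gamma$-area. And this is where the definition originates, of the area $F(\gamma)$ that is swept by a path:
For any $\varepsilon > 0$ there shall be $\delta > 0$, such that for every partition $a = t_0 < \dots < t_n = b$ of $[a,b]$ that is finer than $\delta$, we get
$$\Big|F(\gamma) - \sum_{k=1}^n F_k\Big| < \varepsilon.$$
Here, $F_k$ is the area of the triangle bounded by $\gamma(t_{k-1})$, $\gamma(t_k)$ and the origin. By the Triangle-Lemma, its area is $F_k = \frac12\big(x(t_{k-1})\,y(t_k) - x(t_k)\,y(t_{k-1})\big)$.
Because the orientation of the points $\gamma(t_{k-1})$, $\gamma(t_k)$ might be of importance, $F_k$ may keep its sign in what follows (imagine, for instance, a path that traverses a line segment once from left to right and once from right to left; in total, no area is covered).
Now, let us prove that $F(\gamma) = \frac12\int_a^b (x\dot y - y\dot x)\,dt$ is true.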
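As $\dot x$ and $\dot y$ are continuous on the compact interval $[a,b]$, the function $(s,\sigma,\rho)\mapsto x(s)\dot y(\sigma) - y(s)\dot x(\rho)$ is uniformly continuous on $[a,b]^3$. Choose $\varepsilon > 0$ and take $\delta > 0$ such that
$$\big|x(s)\dot y(\sigma) - y(s)\dot x(\rho) - \big(x(\tau)\dot y(\tau) - y(\tau)\dot x(\tau)\big)\big| < \varepsilon$$
whenever $s, \sigma, \rho, \tau$ lie within $\delta$ of each other. Take a partition $a = t_0 < \dots < t_n = b$ with $t_k - t_{k-1} < \delta$ for $k = 1,\dots,n$, and write $F_k = \frac12\big(x(t_{k-1})y(t_k) - x(t_k)y(t_{k-1})\big)$ for the signed triangle areas. Then, for any such $k$,
$$2F_k = x(t_{k-1})\big(y(t_k) - y(t_{k-1})\big) - y(t_{k-1})\big(x(t_k) - x(t_{k-1})\big).$$
This yields, using the mean value theorem, points $\sigma_k, \rho_k \in (t_{k-1}, t_k)$ with
$$2F_k = \big(x(t_{k-1})\dot y(\sigma_k) - y(t_{k-1})\dot x(\rho_k)\big)(t_k - t_{k-1}),$$
and hence
$$\Big|2F_k - \int_{t_{k-1}}^{t_k}(x\dot y - y\dot x)\,dt\Big| \le \varepsilon\,(t_k - t_{k-1}).$$
Summing over $k$ gives $\big|\sum_{k=1}^n F_k - \frac12\int_a^b (x\dot y - y\dot x)\,dt\big| \le \frac{\varepsilon}{2}(b-a)$, which proves the claim. Q.e.d.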
One might as well prove this by applying Green’s Theorem, but in this case it just gets less elementary.
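The polygon approximation used in this proof is easy to try out numerically. The following Python sketch sums the signed triangle areas $\frac12\big(x(t_{k-1})y(t_k) - x(t_k)y(t_{k-1})\big)$ over a uniform partition. The two test paths (an ellipse with half-axes 3 and 2, and a segment traversed back and forth) and all function names are illustrative assumptions.

```python
import math

def swept_area(path, a, b, n=10_000):
    """Sum of the signed areas of the triangles (origin, path(t_{k-1}), path(t_k))."""
    total = 0.0
    x_prev, y_prev = path(a)
    for k in range(1, n + 1):
        t = a + (b - a) * k / n
        x, y = path(t)
        total += 0.5 * (x_prev * y - x * y_prev)
        x_prev, y_prev = x, y
    return total

def ellipse(t, a=3.0, b=2.0):
    return a * math.cos(t), b * math.sin(t)

def back_and_forth(t):
    # runs along the line y = 1 from x = 0 to x = 1 and back again, for t in [0, 2]
    return (t if t <= 1 else 2 - t), 1.0

area = swept_area(ellipse, 0.0, 2 * math.pi)
print(area, math.pi * 3.0 * 2.0)             # polygon sum versus pi * a * b
print(swept_area(back_and_forth, 0.0, 2.0))  # signed areas cancel
```

The second path illustrates why the triangle areas keep their sign: the area swept on the way out is cancelled on the way back.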
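The $\times$-Lemma: Let $x, y, z \in \mathbb{R}^3$. We set
$$x\times y := \begin{pmatrix} x_2y_3 - x_3y_2\\ x_3y_1 - x_1y_3\\ x_1y_2 - x_2y_1\end{pmatrix}.$$
Obviously, this is linear both in $x$ and in $y$. We have
(1) $(x\times y)\times z = \langle x,z\rangle\, y - \langle y,z\rangle\, x$;
(2) $x\times y$ is orthogonal to $x$ and to $y$.
Proof: For (1): The first component of the left-hand side is
$$(x_3y_1 - x_1y_3)z_3 - (x_1y_2 - x_2y_1)z_2 = y_1(x_1z_1 + x_2z_2 + x_3z_3) - x_1(y_1z_1 + y_2z_2 + y_3z_3),$$
which is the first component of the right-hand side; the other two components follow by cyclic permutation of the indices.
For (2): $\langle x\times y, x\rangle = (x_2y_3 - x_3y_2)x_1 + (x_3y_1 - x_1y_3)x_2 + (x_1y_2 - x_2y_1)x_3 = 0$ and, in the same way, $\langle x\times y, y\rangle = 0$.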
Now, let us look at conic sections and define them mathematically. We will not be interested in what these things have to do with cones – as stated at the beginning: pure mathematics here.
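We are going to work in $\mathbb{R}^2$ here. Let $F$ be a point (the so-called focal point) and $L$ be a line (the so-called directrix), and the distance of $F$ and $L$ shall be some $d > 0$. We are looking for all those points in $\mathbb{R}^2$ for which the distance to $F$ and the distance to $L$ are proportional. Formally: For any point $P \in \mathbb{R}^2$, we set
$$d_F(P) := \|P - F\|, \qquad d_L(P) := \min_{Q\in L}\|P - Q\|,$$
and for $\varepsilon > 0$ we demand
$$d_F(P) = \varepsilon\, d_L(P).$$
For simplicity, we will put $F$ into the origin of our coordinate system, and $L$ parallel to one of the axes, as in the figure. In particular, $F = (0,0)$ and $L = \{(x,y) : x = -d\}$. Our equation for the interesting points $P = (x,y)$ thus becomes:
$$x^2 + y^2 = \varepsilon^2 (x + d)^2.$$
Let us distinguish the following cases:
Case $\varepsilon = 1$. Then we set $\xi := x + \frac d2$, and find
$$y^2 = 2dx + d^2 = 2d\,\xi.$$
We see that the interesting points lie on a parabola which is open to the right (by choosing other coordinate systems, of course, any other parabola will appear; in some way, this is its normal form).
Case $0 < \varepsilon < 1$. Here we set $p := \varepsilon d$, $\xi := x - \frac{\varepsilon p}{1-\varepsilon^2}$. Then we get
$$(1-\varepsilon^2)\,\xi^2 + y^2 = \frac{p^2}{1-\varepsilon^2}, \qquad\text{that is,}\qquad \frac{\xi^2}{a^2} + \frac{y^2}{b^2} = 1$$
with $a = \frac{p}{1-\varepsilon^2}$ and $b = \frac{p}{\sqrt{1-\varepsilon^2}}$.
We have found that the interesting points lie on an ellipse.
Case $\varepsilon > 1$. This is exactly the same as the case $0 < \varepsilon < 1$, except for the last step. We mustn’t set $b = \frac{p}{\sqrt{1-\varepsilon^2}}$ as before, since $1 - \varepsilon^2 < 0$ and we cannot get a real square root of this. Thus, we use $b = \frac{p}{\sqrt{\varepsilon^2 - 1}}$ and the resulting negative sign is placed in the final equation:
$$\frac{\xi^2}{a^2} - \frac{y^2}{b^2} = 1.$$
This is a hyperbola.
To conclude this part, we give the general representation of conic sections in polar coordinates $(r,\varphi)$, where $x = r\cos\varphi$ and $y = r\sin\varphi$. From the figure given above, we see $d_L(P) = d + r\cos\varphi$ and so
$$r = \varepsilon\,(d + r\cos\varphi).$$
That yields the polar coordinates (only depending on the parameters $\varepsilon$ and $d$ and on the variable $\varphi$):
$$r(\varphi) = \frac{\varepsilon d}{1 - \varepsilon\cos\varphi} = \frac{p}{1-\varepsilon\cos\varphi}, \qquad p = \varepsilon d.$$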
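The polar representation can be cross-checked against the focus-directrix definition and against the normal form of the ellipse. The Python sketch below assumes the focus at the origin and the directrix on the line $x = -d$ to its left (the choice under which the parabola opens to the right), so that $r(\varphi) = \frac{p}{1-\varepsilon\cos\varphi}$ with $p = \varepsilon d$; the function names and the sample values $\varepsilon = 0.5$, $d = 2$ are illustrative.

```python
import math

def conic_point(phi, eps, d):
    """Point with polar angle phi on r = p / (1 - eps*cos(phi)), where p = eps*d."""
    p = eps * d
    r = p / (1 - eps * math.cos(phi))
    return r * math.cos(phi), r * math.sin(phi)

def focus_directrix_defect(phi, eps, d):
    """(distance to focus) - eps * (distance to directrix x = -d); should vanish."""
    x, y = conic_point(phi, eps, d)
    return math.hypot(x, y) - eps * abs(x + d)

def ellipse_defect(phi, eps, d):
    """xi^2/a^2 + y^2/b^2 - 1 in the shifted coordinate xi; valid for 0 < eps < 1."""
    p = eps * d
    a = p / (1 - eps**2)
    b = p / math.sqrt(1 - eps**2)
    x, y = conic_point(phi, eps, d)
    xi = x - eps * p / (1 - eps**2)
    return xi**2 / a**2 + y**2 / b**2 - 1

for k in range(8):
    phi = k * math.pi / 4
    print(f"{phi:5.2f} {focus_directrix_defect(phi, 0.5, 2.0):+.1e} {ellipse_defect(phi, 0.5, 2.0):+.1e}")
```

Both columns should be zero up to rounding for every angle, which is a cheap way to catch sign errors in the shift of $x$ or in the denominator $1 - \varepsilon\cos\varphi$.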
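Now, let us turn to our base for the proofs of Kepler’s Laws: Newton’s Law of Gravitation. Let $m$ be the mass of a planet, $M$ the mass of the Sun, $G$ a real constant (the gravitational constant), and let $x\colon I \to \mathbb{R}^3\setminus\{0\}$ be a twice continuously differentiable path on a time interval $I$ (the planet’s orbit). By Newton’s Law we have the differential equation
$$m\,\ddot x = -GMm\,\frac{x}{\|x\|^3}.$$
On the left-hand side, there’s the definition of force as mass multiplied by acceleration. On the right-hand side is Newton’s Law stating the gravitational force between Sun and planet.
We define the vector-valued functions of $t$ (note that $x$ depends on $t$):
$$J := x \times m\dot x, \qquad A := \frac{1}{GMm}\, J\times\dot x + \frac{x}{\|x\|}.$$
($J$ is the angular momentum, $A$ is an axis; but our math doesn’t care for either of those names or intentions).
The $J$–$A$-Lemma: As functions of $t$, $J$ and $A$ are constant.
Proof: Let us look at $J$ first. Using the fact that by definition $v\times v = 0$ for every $v\in\mathbb{R}^3$, and Newton’s Law of Gravitation, we get
$$\dot J = m\,\dot x\times\dot x + x\times m\ddot x = -\frac{GMm}{\|x\|^3}\, x\times x = 0.$$
Now, for $A$, we will need a side-result first. Since $\frac{d}{dt}\|x\| = \frac{\langle x,\dot x\rangle}{\|x\|}$, we have
$$\frac{d}{dt}\,\frac{x}{\|x\|} = \frac{\dot x}{\|x\|} - \frac{\langle x,\dot x\rangle}{\|x\|^3}\,x.$$
Using Newton’s Law and the identity $(u\times v)\times w = \langle u,w\rangle v - \langle v,w\rangle u$, we also find
$$\frac{1}{GMm}\,J\times\ddot x = -\frac{1}{\|x\|^3}\,(x\times\dot x)\times x = -\frac{\dot x}{\|x\|} + \frac{\langle x,\dot x\rangle}{\|x\|^3}\,x.$$
As $\dot J = 0$, this shows
$$\dot A = \frac{1}{GMm}\,J\times\ddot x + \frac{d}{dt}\,\frac{x}{\|x\|} = 0.$$
Q.e.d.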
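To see the $J$–$A$-Lemma at work, one can integrate Newton’s differential equation numerically and watch both vectors stay put. The Python sketch below is only an illustration under stated assumptions: units with $GM = 1$ and $m = 1$, the definitions $J = x\times m\dot x$ and $A = \frac{1}{GMm}J\times\dot x + \frac{x}{\|x\|}$, a classical Runge-Kutta scheme of order four (our choice, not part of the proof), and made-up initial data.

```python
import math

GM = 1.0  # assumed units: G*M = 1 and planet mass m = 1

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    return math.sqrt(sum(c * c for c in u))

def accel(x):
    # right-hand side of Newton's law divided by m
    r3 = norm(x) ** 3
    return tuple(-GM * c / r3 for c in x)

def shifted(u, w, s):
    return tuple(a + s * b for a, b in zip(u, w))

def rk4_step(x, v, h):
    """One Runge-Kutta step for the system x' = v, v' = accel(x)."""
    k1x, k1v = v, accel(x)
    k2x, k2v = shifted(v, k1v, h / 2), accel(shifted(x, k1x, h / 2))
    k3x, k3v = shifted(v, k2v, h / 2), accel(shifted(x, k2x, h / 2))
    k4x, k4v = shifted(v, k3v, h), accel(shifted(x, k3x, h))
    x_new = tuple(x[i] + h / 6 * (k1x[i] + 2 * k2x[i] + 2 * k3x[i] + k4x[i]) for i in range(3))
    v_new = tuple(v[i] + h / 6 * (k1v[i] + 2 * k2v[i] + 2 * k3v[i] + k4v[i]) for i in range(3))
    return x_new, v_new

def angular_momentum(x, v):
    return cross(x, v)  # J = x × m v with m = 1

def axis_vector(x, v):
    jxv = cross(angular_momentum(x, v), v)
    r = norm(x)
    return tuple(jxv[i] / GM + x[i] / r for i in range(3))

def max_drift(x, v, h=1e-3, steps=4000):
    """Largest deviation of J and A from their initial values along the orbit."""
    j0, a0 = angular_momentum(x, v), axis_vector(x, v)
    drift = 0.0
    for _ in range(steps):
        x, v = rk4_step(x, v, h)
        j, a = angular_momentum(x, v), axis_vector(x, v)
        drift = max(drift, norm(shifted(j, j0, -1.0)), norm(shifted(a, a0, -1.0)))
    return drift

drift = max_drift((1.0, 0.0, 0.0), (0.0, 0.8, 0.0))
print(f"largest deviation of J and A from their initial values: {drift:.2e}")
```

The remaining deviation is the discretisation error of the integrator; it shrinks as the step size `h` is reduced.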
Now, we have all ingredients to prove Kepler’s Laws. We conclude axiomatically by assuming that planetary motion is governed by Newton’s Law of Gravitation (the differential equation given above).
Let’s start with
Theorem (Kepler’s First Law of Planetary Motion): Planets orbit in ellipses, with the Sun as one of the foci.
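Proof: Let $x$ denote the orbit of a planet around the Sun. By the $J$–$A$-Lemma, $J$ is constant. By definition of $J$ and by (2) of the $\times$-Lemma, both $x$ and $\dot x$ are orthogonal to $J$; the orbit is therefore located in a two-dimensional plane (we assume $J \neq 0$; otherwise the planet would move on a straight line through the Sun). Let us introduce polar coordinates $(r,\varphi)$ in this plane, with the Sun in the origin, and with axis $A$ (this works, as $A$ is located in the same plane as well: by the definition of $A$, it is a sum of $J\times\dot x$ and a multiple of $x$, both of which are orthogonal to $J$).
Now let $\varphi$ be the angle of $x$ and $A$, set $r := \|x\|$ and $\varepsilon := \|A\|$. This means
$$\langle A, x\rangle = \varepsilon\, r\cos\varphi.$$
By definition of $A$, we have
$$\langle A, x\rangle = \frac{1}{GMm}\,\langle J\times\dot x, x\rangle + \|x\| = -\frac{\|J\|^2}{GMm^2} + r,$$
since $\langle J\times\dot x, x\rangle = \langle J, \dot x\times x\rangle = -\frac1m\|J\|^2$.
Now, if $\varepsilon = 0$, then we have found
$$r = \frac{\|J\|^2}{GMm^2},$$
which means that the planet moves on a circular orbit.
If $\varepsilon > 0$, then we conclude
$$r = \frac{p}{1-\varepsilon\cos\varphi}, \qquad p := \frac{\|J\|^2}{GMm^2} = \varepsilon d,$$
with $d$ defined in the obvious fashion to make the last equation work.
Therefore, the planet moves on a conic section, with focus in the Sun. As planetary orbits are bounded, we must have $\varepsilon < 1$, and we have proved that the planet must follow an ellipse. Q.e.d.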
Theorem (Kepler’s Second Law of Planetary Motion): A planet sweeps out equal areas in equal times.
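Proof: We use cartesian coordinates in $\mathbb{R}^3$, such that $e_1$ is parallel to $A$ and $e_3$ is parallel to $J$, pointing in the same direction. Then the orbital plane is the $(x_1,x_2)$-plane and the Sun is in $(0,0,0)$. In particular, $x_3(t) = 0$ for all $t$ by the proof of the First Law. Then,
$$J = m\, x\times\dot x = m\,\big(0,\ 0,\ x_1\dot x_2 - x_2\dot x_1\big), \qquad\text{so}\qquad x_1\dot x_2 - x_2\dot x_1 = \frac{\|J\|}{m}.$$
By Leibniz’ sector formula, the line segment from the Sun to the planet sweeps, between times $t_1$ and $t_2$, the area
$$\frac12\int_{t_1}^{t_2}\big(x_1\dot x_2 - x_2\dot x_1\big)\,dt = \frac{\|J\|}{2m}\,(t_2 - t_1).$$
This area only depends on the difference of times, as stated. Q.e.d.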
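The Second Law can also be observed in a small simulation. The Python sketch below works in the orbital plane with assumed units $GM = 1$, $m = 1$, integrates Newton’s law with the velocity Verlet scheme (our choice of integrator), and adds up the signed triangle areas from the sector formula over consecutive time windows of equal length. The initial data are made up for illustration; with them $\|J\|/m = 0.8$, so each window of length $0.5$ should sweep an area close to $\frac{\|J\|}{2m}\cdot 0.5 = 0.2$.

```python
import math

def accel(x, y, gm=1.0):
    r3 = math.hypot(x, y) ** 3
    return -gm * x / r3, -gm * y / r3

def swept_areas(x, y, vx, vy, h=1e-3, steps_per_window=500, windows=8):
    """Area swept by the Sun-planet segment in consecutive equal time windows."""
    areas = []
    ax, ay = accel(x, y)
    for _ in range(windows):
        area = 0.0
        for _ in range(steps_per_window):
            # velocity Verlet: new position, new acceleration, then new velocity
            xn = x + h * vx + 0.5 * h * h * ax
            yn = y + h * vy + 0.5 * h * h * ay
            axn, ayn = accel(xn, yn)
            vx += 0.5 * h * (ax + axn)
            vy += 0.5 * h * (ay + ayn)
            # signed area of the triangle (origin, old point, new point)
            area += 0.5 * (x * yn - xn * y)
            x, y, ax, ay = xn, yn, axn, ayn
        areas.append(area)
    return areas

areas = swept_areas(1.0, 0.0, 0.0, 0.8)
for k, s in enumerate(areas):
    print(f"window {k}: swept area {s:.6f}")
```

The planet moves much faster near the Sun than far away from it, yet the windows report the same area, which is exactly the content of the Second Law.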
Theorem (Kepler’s Third Law of Planetary Motion): The square of the period of an orbit is proportional to the cube of its semi-major axis.
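Proof: By Leibniz’ sector formula (used similarly to the proof of the Second Law), the area contained in the planet’s entire orbit is
$$\frac{\|J\|}{2m}\,T,$$
where $T$ is the time taken for a full orbit around the Sun. By the First Law, this orbit is an ellipse, the area of which may be computed as follows: The cartesian coordinates of an ellipse are
$$\xi(s) = a\cos s, \qquad \eta(s) = b\sin s, \qquad s\in[0,2\pi],$$
with $a$ and $b$ real constants (the larger one is called the semi-major axis). This is actually an ellipse because of
$$\frac{\xi^2}{a^2} + \frac{\eta^2}{b^2} = \cos^2 s + \sin^2 s = 1.$$
From the notations about normal forms of conic sections, we find $a = \frac{p}{1-\varepsilon^2}$ and $b = \frac{p}{\sqrt{1-\varepsilon^2}}$, which implies that $b^2 = p\,a$ (as for any ellipse or hyperbola). Now, the area of the ellipse is, by Leibniz’ sector formula again,
$$\frac12\int_0^{2\pi}\big(\xi\dot\eta - \eta\dot\xi\big)\,ds = \frac12\int_0^{2\pi}\big(ab\cos^2 s + ab\sin^2 s\big)\,ds = \pi ab.$$
Both representations of the area covered by the orbit now yield
$$\frac{\|J\|}{2m}\,T = \pi ab,$$
and so, using the definition $p = \frac{\|J\|^2}{GMm^2}$ obtained in the proof of the First Law,
$$T^2 = \frac{4\pi^2 m^2}{\|J\|^2}\,a^2b^2 = \frac{4\pi^2 m^2 p}{\|J\|^2}\,a^3 = \frac{4\pi^2}{GM}\,a^3.$$
The constant $\frac{4\pi^2}{GM}$ is identical for any planet travelling around the Sun, and thus $\frac{T^2}{a^3}$ is constant. Q.e.d.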
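A quick plausibility check of the Third Law with numbers: if the semi-major axis $a$ is measured in astronomical units and the period $T$ in years, then $T^2/a^3 = 1$ for Earth, and the Third Law predicts the same ratio for every other planet. The values in the Python sketch below are rounded, approximate textbook figures for the six planets known in Kepler’s time and are meant as an illustration, not as reference data.

```python
# Rounded, approximate values: semi-major axis a in astronomical units,
# orbital period T in years.
PLANETS = {
    "Mercury": (0.387, 0.241),
    "Venus": (0.723, 0.615),
    "Earth": (1.000, 1.000),
    "Mars": (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
    "Saturn": (9.537, 29.457),
}

def kepler_ratio(a, T):
    """The quantity T^2 / a^3 that the Third Law claims is the same for all planets."""
    return T**2 / a**3

ratios = {name: kepler_ratio(a, T) for name, (a, T) in PLANETS.items()}
for name, q in ratios.items():
    print(f"{name:8s} T^2/a^3 = {q:.4f}")
```

The ratios agree to within a fraction of a percent, although the periods range from about three months to almost thirty years.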
Let us conclude with a brief remark on how beautiful and elegant those Laws are: they made my day.