The cycloid was a core object of mathematical study during the development of calculus and, before that, of geometry. It arises as a special case of curves used in astronomy, and it served as a challenge problem among competing mathematicians during the rise of the analytic method.
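Let us have a look at its definition first, and at what it describes heuristically. The cycloid is the planar curve parametrized via
$$\gamma(t) = \begin{pmatrix} x(t) \\ y(t) \end{pmatrix} = \begin{pmatrix} r(t - \sin t) \\ r(1 - \cos t) \end{pmatrix}, \qquad t \in \mathbb{R},\ r > 0.$$
It is the curve traced by a point $P$ on the periphery of a circle of radius $r$ that is rolled along the $x$-axis. The parametrization follows like this:
The circle rolls along the $x$-axis with constant speed, therefore the angle of rotation $t$ grows proportionally to time and may serve as the parameter. As the circle rolls without slipping, the distance covered by its centre $M$ equals the unrolled arc length $rt$, and we get $M = (rt, r)$. Now, the $x$-component of $P$ is
$$x(t) = rt - r\sin t = r(t - \sin t).$$
For the $y$-component of $P$, the point lies $r\cos t$ below the centre, and thus
$$y(t) = r - r\cos t = r(1 - \cos t).$$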
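The rolling construction can be checked numerically against the closed form. The following is a minimal Python sketch; the function names `rolling_point` and `cycloid` and the sample radius are illustrative assumptions, not part of the original text.

```python
import math

def rolling_point(r, t):
    """Traced point after the circle of radius r has rolled through the angle t."""
    # Rolling without slipping: the centre has moved by the unrolled arc length r*t.
    cx, cy = r * t, r
    # The point starts at the bottom of the circle and turns clockwise around
    # the centre, so it sits at the angle -pi/2 - t as seen from the centre.
    phi = -math.pi / 2 - t
    return cx + r * math.cos(phi), cy + r * math.sin(phi)

def cycloid(r, t):
    """Closed-form parametrization x = r(t - sin t), y = r(1 - cos t)."""
    return r * (t - math.sin(t)), r * (1 - math.cos(t))

if __name__ == "__main__":
    for t in (0.0, 1.0, math.pi, 5.0):
        print(t, rolling_point(2.0, t), cycloid(2.0, t))
```

Both functions should agree up to rounding for every angle, which is just the statement that the centre-plus-offset picture and the parametrization describe the same curve.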
The cycloid can be considered as a special case of the epicycloid. Epicycloids were a matter of interest in the pre-Kepler era, when astronomers tried to explain the motion of the planets in the night sky. As they considered only perfect circles as orbits (as opposed to the ellipses that they actually are), and as they postulated the Earth to be at the center of all those orbits, it was tricky to explain away the observations of different arc speeds and of the loops that the planets sometimes take. The solution was to imagine a circle around the Earth that carried the center of another circle, on which the planet moved. Thus, the planets travelled on an epicycle; and sometimes one of those wasn’t enough (“salvation of the phenomena”).
A first simplification of this theory came from Copernicus, who dropped the assumption that the Earth would be at the center of all things, but who didn’t get rid of the perfect circles. For the final resolution, the world had to wait for Kepler and Tycho Brahe. But anyway, that’s not why we’re here.
The cycloid is the curve on which a point of mass will travel the quickest between two given points if it just slides along it, drawn by gravitation only (friction is disregarded); the name “brachistochrone” for this curve comes from the Greek for “shortest time”. It is remarkable that this quickest path is not the shortest path: there are quicker ways than a straight line. There is a trade-off between gaining speed quickly and keeping the path sufficiently short. We will look at two different approaches to prove that the cycloid can do this trick.
Another remarkable property of the cycloid is that it is the “tautochrone”: if points of mass are released anywhere on this curve, they will all reach its lowest point in exactly the same amount of time. Points that start farther away will gain more speed in order to close the distance. This is a highly interesting property for building a pendulum: no matter how big the amplitude, the frequency will always be the same. This, in turn, is the core feature of an exact clock, which was a sort of holy grail for scientists of the 17th century (not just for ship navigation). This property was found by Huygens, who was not able to use calculus methods for it (his solution is hidden in quite cumbersome geometry).
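The tautochrone property can be made plausible numerically. The sketch below assumes the cycloid $x = r(\theta - \sin\theta)$, $y = r(1 - \cos\theta)$ with $y$ measured downwards (so $\theta = \pi$ is the lowest point), energy conservation in the form "speed $= \sqrt{2g \cdot \text{height fallen}}$", and sample values for $g$ and $r$; the name `time_to_bottom` is hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)
R = 1.0   # radius of the generating circle (assumed value)

def time_to_bottom(theta0, n=5000):
    """Descent time from rest at parameter theta0 to the lowest point theta = pi."""
    span = math.pi - theta0
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h                      # midpoint rule in u on [0, 1]
        theta = theta0 + span * u * u          # substitution that removes the 1/sqrt singularity at the start
        d_theta = 2 * span * u                 # d(theta)/du
        arc = 2 * R * math.sin(theta / 2)      # ds/d(theta) on the cycloid
        # height fallen, R*(cos(theta0) - cos(theta)), written as a product to avoid cancellation
        drop = 2 * R * math.sin((theta + theta0) / 2) * math.sin((theta - theta0) / 2)
        total += arc / math.sqrt(2 * G * drop) * d_theta
    return total * h

if __name__ == "__main__":
    for theta0 in (0.5, 1.5, 2.5):
        print(theta0, time_to_bottom(theta0), math.pi * math.sqrt(R / G))
```

Whatever starting parameter is chosen, the computed time should come out close to $\pi\sqrt{r/g}$, which is the constant descent time of the cycloid.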
More on this curve and some very nice experiments may be seen in this YouTube video from the highly interesting channel Vsauce. I especially love the excitement of both guys when they actually see these properties of the cycloid curve in action.
The brachistochrone problem was posed by Johann Bernoulli in a journal as a quest for the most enlightened mathematicians of the world (“acutissimis mathematicis qui toto orbe florent”). We will see his very elegant approach right below. His brother Jacob found a more general approach, but his train of thought is much more cumbersome; we will see a modernized simplification of this later. Both brothers engaged in a non-friendly competition by posing problems like this one to each other, always hoping for each other’s errors to gloat over. In retrospect, both of them advanced the applications of calculus when it was conceived; note, however, that very many of the things that are named after Bernoulli (Bernoulli numbers, Bernoulli distribution, the Law of Large Numbers) have come from Jacob, not from Johann. But the other enlightened mathematicians of the time also found the solution, particularly Leibniz and Newton, who are both said to have found it in a matter of a few hours, and both of whom appreciated the beauty of the problem.
Now, let’s see how Johann came to his solution. We will look at some physical properties first.
The Speed Lemma: Consider a point of mass $m$ that travels without friction along any sort of curve in $\mathbb{R}^2$, the only force on it being gravitation. Let $g$ be the gravitational acceleration (not to be confused with Newton’s gravitational constant $G$). Then, when it has descended by the height $h$ starting from rest, its speed is $v = \sqrt{2gh}$.
Proof: As physics tells us, the sum of kinetic and potential energy is constant. One may prove this mathematically by doing very basic integration and thinking of Newton’s second axiom (the one with force, mass and acceleration); we won’t go into this. Now, the kinetic energy is $\frac{1}{2}mv^2$ (for physicists that’s the definition, for mathematicians that’s an easy lemma), while the potential energy at height $\tilde h$ above the zero-level is $mg\tilde h$. Our zero-level for the potential energy is set such that the potential energy vanishes when the point of mass has travelled height $h$; in the beginning, the potential energy is therefore $mgh$. By our set-up, the point of mass has no speed in the beginning and hence no kinetic energy. We have found
$$mgh + 0 = 0 + \frac{1}{2}mv^2, \qquad\text{i.e.}\qquad v = \sqrt{2gh},$$
which was to be shown. q.e.d.
One might wonder if there is some problem here, in that the speed formula does not depend on the kind of curve that the point of mass moves on. Indeed, without friction there is no problem. One can argue in an entirely different way, by decomposing the gravitational force into a force directed along the (derivative of the) curve and a normal force orthogonal to it. The force along the curve is of course smaller than the full gravitation and hence brings less acceleration to our point. In turn, one can compute the time it takes the point to descend by the height $h$, and the speed that it has gained by then. As physics is consistent in itself (surprise!), we arrive at the same result that we gained via kinetic and potential energy. Not being a physicist, I can’t tell with certainty if this connection just stems from a little proof that I didn’t see, or if this is some sort of recognition that the world actually behaves responsibly and rationally. I won’t even start to question this here.
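For the simplest curve, a straight frictionless incline, the force-decomposition argument can be carried out in a few lines. The following Python sketch is only an illustration under that assumption; the function names and the value of `G` are chosen for this example.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)

def speed_on_incline(height, slope_angle):
    """Speed after descending `height` on a frictionless incline, via force decomposition."""
    accel = G * math.sin(slope_angle)         # component of gravity along the slope
    length = height / math.sin(slope_angle)   # distance travelled along the slope
    time = math.sqrt(2 * length / accel)      # from length = accel * time**2 / 2
    return accel * time

def speed_from_energy(height):
    """Speed predicted by energy conservation, v = sqrt(2 g h)."""
    return math.sqrt(2 * G * height)

if __name__ == "__main__":
    for angle in (0.2, 0.8, 1.4):
        print(angle, speed_on_incline(3.0, angle), speed_from_energy(3.0))
```

A flatter incline gives less acceleration but a longer run, and the two effects cancel exactly, so the final speed only depends on the height.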
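The Time Lemma: Consider the same setting as in the Speed Lemma. On top of that, let the curve on which our point travels be given by a differentiable function $y: [a,b] \to \mathbb{R}$, where $y(x)$ measures the height by which the point has already descended (so the $y$-axis points downwards and the point starts at rest at level $y(a) = 0$). Let the point travel from $(a, y(a))$ to $(b, y(b))$. The time it takes for this is
$$T = \int_a^b \sqrt{\frac{1 + y'(x)^2}{2g\,y(x)}}\,dx.$$
Proof (by a little hand-waving): Consider any point $P$ on the curve, with $P = (x, y(x))$. Near $P$, the point of mass covers the infinitesimal arc length $ds = \sqrt{1 + y'(x)^2}\,dx$, and by the Speed Lemma it does so with speed $v = \sqrt{2g\,y(x)}$. The infinitesimal time our point of mass spends in $P$ is
$$dt = \frac{ds}{v} = \sqrt{\frac{1 + y'(x)^2}{2g\,y(x)}}\,dx.$$
Taking a leap of faith and integrating this (which is supposed to amount to the sum of all such infinitesimal times) gives
$$T = \int dt = \int_a^b \sqrt{\frac{1 + y'(x)^2}{2g\,y(x)}}\,dx.$$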
In a post on physical interpretations of mathematics, a little physical computation can’t be too wrong now, can it? q.e.d.
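To get a feeling for the time integral $\int \sqrt{(1 + y'^2)/(2gy)}\,dx$, one can evaluate it numerically for two curves between the same endpoints. The sketch below uses the parametric form of the integrand, $\sqrt{(\dot x^2 + \dot y^2)/(2gy)}$, and a plain midpoint rule; the function name `descent_time`, the endpoints $(0,0)$ and $(\pi r, 2r)$ and the values of `G` and `R` are assumptions made for this illustration.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)
R = 1.0   # radius of the generating circle (assumed value)

def descent_time(x_dot, y, y_dot, s_end, n=2000):
    """Midpoint-rule value of the integral of sqrt((x'^2 + y'^2) / (2 G y)) over [0, s_end]."""
    h = s_end / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += math.sqrt((x_dot(s) ** 2 + y_dot(s) ** 2) / (2 * G * y(s)))
    return total * h

# Cycloid from (0, 0) to (pi*R, 2*R), parameter s in [0, pi], y measured downwards.
t_cycloid = descent_time(lambda s: R * (1 - math.cos(s)),
                         lambda s: R * (1 - math.cos(s)),
                         lambda s: R * math.sin(s),
                         math.pi)

# Straight line between the same endpoints, parametrized as (pi*R*s^2, 2*R*s^2)
# on [0, 1]; the squared parameter keeps the integrand bounded at the start.
t_line = descent_time(lambda s: 2 * math.pi * R * s,
                      lambda s: 2 * R * s * s,
                      lambda s: 4 * R * s,
                      1.0)

if __name__ == "__main__":
    print("cycloid:", t_cycloid, "straight line:", t_line)
```

With these sample values the cycloid needs roughly 1.00 s and the straight line roughly 1.19 s, in line with the closed forms $\pi\sqrt{r/g}$ and $\sqrt{(\pi^2 + 4)\,r/g}$.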
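The Reflection Principle: Consider a ray of light travelling in $\mathbb{R}^2$ from point $A = (a_1, a_2)$ to point $B = (b_1, b_2)$ (both above the $x$-axis, with $a_1 < b_1$), being reflected somewhere on the $x$-axis. The resulting angles of reflection $\alpha$ and $\beta$, measured against the normal of the $x$-axis, are equal.
Proof: The underlying physical principle is to choose the line of minimal length for the reflection. A mathematician would put this as an axiom, a physicist will consider this granted by the way that nature behaves. Let’s go with it: the length of the chosen path is, as long as the ray of light is reflected in the point $(x, 0)$,
$$\ell(x) = \sqrt{(x - a_1)^2 + a_2^2} + \sqrt{(b_1 - x)^2 + b_2^2},$$
hence we look for some $x$ with
$$0 = \ell'(x) = \frac{x - a_1}{\sqrt{(x - a_1)^2 + a_2^2}} - \frac{b_1 - x}{\sqrt{(b_1 - x)^2 + b_2^2}}.$$
As, for obvious reasons, the first quotient is $\sin\alpha$ and the second one is $\sin\beta$, this is the assertion. q.e.d.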
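The Refraction Lemma: Consider a ray of light changing the medium in which it travels. Let the speeds of light in those media be $v_1$ and $v_2$. The resulting angles of refraction $\alpha_1$ and $\alpha_2$, measured against the normal of the interface, have a constant proportion:
$$\frac{\sin\alpha_1}{\sin\alpha_2} = \frac{v_1}{v_2}.$$
Proof: Now, the speed of light gets relevant and the physical principle is to find the path of minimal time. Let the interface be the $x$-axis, let the ray start in $A = (a_1, a_2)$ in the first medium, end in $B = (b_1, b_2)$ in the second medium, and cross the interface in $(x, 0)$. By the basic laws on time and speed we get
$$T(x) = \frac{\sqrt{(x - a_1)^2 + a_2^2}}{v_1} + \frac{\sqrt{(b_1 - x)^2 + b_2^2}}{v_2},$$
and we look for some $x$ with
$$0 = T'(x) = \frac{x - a_1}{v_1\sqrt{(x - a_1)^2 + a_2^2}} - \frac{b_1 - x}{v_2\sqrt{(b_1 - x)^2 + b_2^2}} = \frac{\sin\alpha_1}{v_1} - \frac{\sin\alpha_2}{v_2}.$$
The lemma is proved. q.e.d.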
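The principle of minimal time can also be tested by brute force: minimize the travel time over the crossing point numerically and compare the sines of the resulting angles. This is a minimal Python sketch using a ternary search (valid here because the travel time is convex in the crossing point); the names `travel_time`, `fastest_crossing` and `sines` as well as the sample points and speeds are assumptions for illustration.

```python
import math

def travel_time(x, a, b, v1, v2):
    """Time from a to (x, 0) at speed v1 plus time from (x, 0) to b at speed v2."""
    return math.hypot(x - a[0], a[1]) / v1 + math.hypot(b[0] - x, b[1]) / v2

def fastest_crossing(a, b, v1, v2, iterations=100):
    """Ternary search for the crossing point on the x-axis; assumes a[0] < b[0]."""
    lo, hi = a[0], b[0]
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if travel_time(m1, a, b, v1, v2) < travel_time(m2, a, b, v1, v2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

def sines(x, a, b):
    """Sines of the angles against the normal of the interface on both sides."""
    return (x - a[0]) / math.hypot(x - a[0], a[1]), (b[0] - x) / math.hypot(b[0] - x, b[1])

if __name__ == "__main__":
    a, b, v1, v2 = (0.0, 1.0), (1.0, -1.0), 1.0, 0.75
    x = fastest_crossing(a, b, v1, v2)
    s1, s2 = sines(x, a, b)
    print(x, s1 / v1, s2 / v2)
```

The two printed quotients should agree to several digits. With equal speeds the search lands on the symmetric crossing point, which is the equal-angle situation of the Reflection Principle after mirroring $B$ at the axis.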
We now have all the ingredients to follow Johann Bernoulli’s idea for finding the brachistochrone. The basic question is: what is the quickest path for a point of mass to take, if it is to travel from one point in the plane, $A$ say, to another one, $B$? Johann’s ingenious idea was to compare this to the path that a ray of light will take; as we have postulated, the ray of light will choose the quickest path as well. The acceleration may stem from gravitation or the path may result from the change of the media, but the aim is the same; as Bernoulli wrote: “who would deny us to replace one approach by the other?”
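Hence, let us consider a “continuous” change of media, for instance by taking a limit of ever finer layers of media for the ray of light to traverse. As the Refraction Lemma showed, we will get a constant quotient of $\frac{\sin\alpha}{v}$, where $\alpha$ is the angle between the ray and the vertical. By the Speed Lemma, our point of mass has gained the speed $v = \sqrt{2gy}$, if it has arrived at level $y$ (measured downwards from its starting level) only being accelerated by gravitation.
Now, using the designations of the following picture (note that $\cot\alpha = \frac{dy}{dx} = y'$),
$$\sin\alpha = \frac{1}{\sqrt{1 + \cot^2\alpha}} = \frac{1}{\sqrt{1 + y'^2}}.$$
As $\frac{\sin\alpha}{v}$ is constant, we find the differential equation
$$\frac{1}{\sqrt{2gy}\,\sqrt{1 + y'^2}} = c.$$
By setting $k := \frac{1}{2gc^2}$ and by separation of variables (on the descending branch, where $y' > 0$),
$$y\,(1 + y'^2) = k \quad\Longrightarrow\quad \frac{dy}{dx} = \sqrt{\frac{k - y}{y}} \quad\Longrightarrow\quad x = \int \sqrt{\frac{y}{k - y}}\,dy.$$
Then, we substitute $y = k\sin^2 u$, yielding $dy = 2k\sin u\cos u\,du$,
$$x = \int \sqrt{\frac{k\sin^2 u}{k\cos^2 u}}\;2k\sin u\cos u\,du = 2k\int \sin^2 u\,du.$$
This integral can be readily solved via integration by parts:
$$\int \sin^2 u\,du = -\sin u\cos u + \int \cos^2 u\,du = -\sin u\cos u + u - \int \sin^2 u\,du, \qquad\text{i.e.}\qquad \int \sin^2 u\,du = \frac{1}{2}\,(u - \sin u\cos u).$$
Altogether we have found (note that we do not re-substitute for $u$, since we are not interested in a parametrization like $x(y)$)
$$x(u) = k\,(u - \sin u\cos u) + C = \frac{k}{2}\,(2u - \sin 2u) + C.$$
As we can set our coordinates such that $x(0) = 0$ (the point will begin its voyage in $(0,0)$), we get $C = 0$. This shows
$$x(u) = \frac{k}{2}\,(2u - \sin 2u), \qquad y(u) = k\sin^2 u = \frac{k}{2}\,(1 - \cos 2u).$$
Setting $t := 2u$ and $r := \frac{k}{2}$, we retrieve the standard parametrization of the cycloid:
$$x(t) = r\,(t - \sin t), \qquad y(t) = r\,(1 - \cos t).$$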
The brachistochrone must be a cycloid.
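Conversely, one can check numerically that the cycloid keeps Johann's optical quotient $\frac{\sin\alpha}{v}$ constant along the whole path. The sketch assumes the parametrization $x = r(t - \sin t)$, $y = r(1 - \cos t)$ with $y$ measured downwards, takes $\alpha$ as the angle between the tangent and the vertical, and uses the speed $\sqrt{2gy}$; the name `snell_quotient` and the sample values are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)
R = 1.0   # radius of the generating circle (assumed value)

def snell_quotient(t):
    """sin(alpha) / v at the parameter t of the cycloid."""
    x_dot = R * (1 - math.cos(t))
    y_dot = R * math.sin(t)
    sin_alpha = x_dot / math.hypot(x_dot, y_dot)       # horizontal share of the unit tangent
    speed = math.sqrt(2 * G * R * (1 - math.cos(t)))   # speed after falling the height y(t)
    return sin_alpha / speed

quotients = [snell_quotient(t) for t in (0.3, 1.0, 2.0, 3.0)]
expected = 1 / (2 * math.sqrt(G * R))

if __name__ == "__main__":
    print(quotients, expected)
```

All sampled quotients should coincide with $\frac{1}{2\sqrt{gr}}$, which matches the constant $c$ from $k = \frac{1}{2gc^2} = 2r$.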
But now for a completely different approach. The brachistochrone can also be found via calculus of variations, which is considerably harder, from a technical point of view, than what we did above. On the other hand, these techniques can be applied to a much broader spectrum of problems. We can only sketch many of the issues here.
Historically, the brachistochrone problem was the starting point of the calculus of variations. Jacob Bernoulli solved the problem with methods like this, much more general but much less elegant than his brother’s.
At the core is the observation that we wish to minimize a functional
$$J(y) = \int_a^b f\bigl(x, y(x), y'(x)\bigr)\,dx$$
over the set $M = \{\,y \in C^2([a,b]) : y(a) = y_a,\ y(b) = y_b\,\}$. Is there some $y^* \in M$ such that $J(y^*) \le J(y)$ for all $y \in M$?
We consider the function $f$ to be defined as $f: [a,b] \times \mathbb{R} \times \mathbb{R} \to \mathbb{R}$, $(x, y, p) \mapsto f(x, y, p)$. The inputs $y$ and $p$ will play the roles of the solution function and its derivative, respectively.
Notice that we restrict ourselves already to smooth functions $y \in C^2$. From a physical point of view, there is no reason why the brachistochrone shouldn’t be just continuous. However, tougher mathematics would be necessary to track down this one.
If the space $M$ is well-behaved, usual compactness arguments tell us that there is a minimum. But it is much harder to pinpoint.
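Theorem (Euler-Lagrange; tiny special case): A necessary condition for a $C^2$-function $y$ to be a solution to the minimization problem is
$$\frac{d}{dx}\,f_p\bigl(x, y(x), y'(x)\bigr) = f_y\bigl(x, y(x), y'(x)\bigr),$$
where $f_y$ and $f_p$ denote the partial derivatives of $f$ with respect to its second and third argument. In its expanded form, this is (dropping the arguments for reasons of better legibility)
$$f_{px} + f_{py}\,y' + f_{pp}\,y'' = f_y.$$
Proof: Let $y$ be the minimum and let $\eta \in C^2([a,b])$ with $\eta(a) = \eta(b) = 0$. We then consider
$$\varphi(\varepsilon) := J(y + \varepsilon\eta) = \int_a^b f\bigl(x,\ y + \varepsilon\eta,\ y' + \varepsilon\eta'\bigr)\,dx.$$
Since we chose everything to be well-behaved, $\varphi$ will be differentiable. As $y$ minimizes the functional $J$, $\varphi(0) \le \varphi(\varepsilon)$ for all $\varepsilon$, and hence $\varphi'(0) = 0$. Note that the derivative is a $\frac{d}{d\varepsilon}$ here. For the function $y$, the derivative means $\frac{d}{dx}$.
Now, let us compute this (calculemus!)
$$\varphi'(\varepsilon) = \int_a^b \bigl(f_y\,\eta + f_p\,\eta'\bigr)\,dx,$$
with $f_y$ and $f_p$ evaluated at $(x,\ y + \varepsilon\eta,\ y' + \varepsilon\eta')$.
Integration by parts yields, together with the fact that $\eta(a) = \eta(b) = 0$,
$$\varphi'(\varepsilon) = \Bigl[f_p\,\eta\Bigr]_a^b + \int_a^b \Bigl(f_y - \frac{d}{dx}f_p\Bigr)\,\eta\,dx = \int_a^b \Bigl(f_y - \frac{d}{dx}f_p\Bigr)\,\eta\,dx.$$
This expression must vanish at $\varepsilon = 0$, as we demand $\varphi'(0) = 0$, if $y$ is supposed to be a solution to the minimization problem. We have an arbitrary function $\eta$ involved, so the expression in brackets will have to vanish entirely. Formally, one can see this by contradiction: if in some point $x_0 \in (a,b)$, the bracket-expression did not vanish, we could choose some interval $(x_1, x_2) \ni x_0$ where this bracket-expression didn’t vanish at all and hence keeps its sign (it is continuous, after all). On this interval, we set $\eta(x) := \pm (x - x_1)^4 (x_2 - x)^4$, with the sign of the bracket-expression, and $\eta := 0$ outside; we find the integrand strictly positive there, and vanishing outside. Contradiction to $\varphi'(0) = 0$.
For $\varepsilon = 0$, the statement follows. We have thus proved the Euler-Lagrange equation in this particular case. q.e.d.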
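The function $\varphi(\varepsilon) = J(y + \varepsilon\eta)$ from the proof can be watched at work on the brachistochrone itself. The following Python sketch assumes the descent-time functional with integrand $\sqrt{(\dot x^2 + \dot y^2)/(2gy)}$ in parametric form, the cycloid $x = r(s - \sin s)$, $y = r(1 - \cos s)$ with $y$ measured downwards, and the hypothetical perturbation $\eta(s) = \sin^2 s$, which vanishes at both endpoints; the sample values of `G` and `R` are assumptions as well.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)
R = 1.0   # radius of the generating circle (assumed value)

def phi(eps, n=4000):
    """Descent time along (x(s), y(s) + eps * sin(s)**2), s in [0, pi]; needs eps > -R/2."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h                                    # midpoint rule
        x_dot = R * (1 - math.cos(s))
        y = R * (1 - math.cos(s)) + eps * math.sin(s) ** 2   # perturbed depth
        y_dot = R * math.sin(s) + eps * math.sin(2 * s)      # its derivative in s
        total += math.sqrt((x_dot ** 2 + y_dot ** 2) / (2 * G * y))
    return total * h

if __name__ == "__main__":
    for eps in (-0.2, -0.01, 0.0, 0.01, 0.2):
        print(eps, phi(eps))
```

The value at $\varepsilon = 0$ should be the smallest of the printed ones, and the central difference quotient $(\varphi(\delta) - \varphi(-\delta))/(2\delta)$ should be close to zero, as the condition $\varphi'(0) = 0$ demands. This only probes one direction $\eta$, of course; the theorem is about all of them.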
Notice that we didn’t speak about sufficient conditions. That would overstretch this text by far – let’s ignore this.
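The Simplification Lemma: In the special case, when $f$ only depends on $y$ and $y'$, and not directly on its first argument $x$, the Euler-Lagrange equation will simplify to the condition
$$f - y'\,f_p = \text{const}.$$
Proof: This follows by a straightforward computation:
$$\frac{d}{dx}\,f \overset{(\ast)}{=} \bigl(f_{px} + f_{py}\,y' + f_{pp}\,y''\bigr)\,y' + f_p\,y'' \overset{(\ast\ast)}{=} \frac{d}{dx}\bigl(y'\,f_p\bigr), \qquad\text{hence}\qquad \frac{d}{dx}\bigl(f - y'\,f_p\bigr) = 0.$$
We have used the expanded form of the Euler-Lagrange equation in $(\ast)$ together with the chain-rule $\frac{d}{dx}f = f_x + f_y\,y' + f_p\,y''$ and the feature that in the present special case $f_x = 0$, and the product- and chain-rule all by themselves in $(\ast\ast)$. All over the place, we have used that $y$ is a solution to the Euler-Lagrange equation and thus needs to be plugged into $f$. q.e.d.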
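Now that we have the ingredients, let’s try and find the brachistochrone by calculus of variations. By the Time Lemma, we want to minimize the expression
$$J(y) = \int_a^b \sqrt{\frac{1 + y'(x)^2}{2g\,y(x)}}\,dx, \qquad\text{that is,}\qquad f(x, y, p) = \sqrt{\frac{1 + p^2}{2gy}}.$$
By the Simplification Lemma, any solution will have
$$c = f - y'\,f_p = \sqrt{\frac{1 + y'^2}{2gy}} - \frac{y'^2}{\sqrt{2gy}\,\sqrt{1 + y'^2}} = \frac{1}{\sqrt{2gy\,(1 + y'^2)}}, \qquad\text{i.e.}\qquad y\,(1 + y'^2) = \frac{1}{2gc^2} =: k.$$
$y$ will be a solution depending on $x$. On the other hand, we look for a parametrization of a curve in $\mathbb{R}^2$, hence we try to find both functions $x(u)$ and $y(u)$, that are connected via $y(u) = y(x(u))$ (in a slight abuse of notation). We set, by divine insight,
$$y(u) = k\sin^2 u, \qquad\text{so that}\qquad y'\bigl(x(u)\bigr) = \sqrt{\frac{k - y}{y}} = \cot u.$$
The chain rule then says $\frac{dy}{du} = y'\bigl(x(u)\bigr)\,\frac{dx}{du}$, and hence
$$\frac{dx}{du} = \frac{2k\sin u\cos u}{\cot u} = 2k\sin^2 u.$$
We have almost integrated this one before, in the separation-of-variables step of Johann’s approach; $\int \sin^2 u\,du = \frac{1}{2}(u - \sin u\cos u)$ and the substitution $t = 2u$, $r = \frac{k}{2}$ yield
$$x = k\,(u - \sin u\cos u) + C = r\,(t - \sin t) + C, \qquad y = k\sin^2 u = r\,(1 - \cos t).$$
This shows that any solution to the minimization problem must look like
$$\gamma(t) = \begin{pmatrix} r\,(t - \sin t) + C \\ r\,(1 - \cos t) \end{pmatrix}$$
and is hence a cycloid. What we haven’t proved is that it actually is a solution to the minimization problem: we didn’t speak about the sufficient condition with Euler-Lagrange, nor about the regularity of our set $M$, and only about $C^2$-functions in the first place (I won’t even go into the physical hand-waving). But anyway, the little tricks and the big machinery of technique make both approaches really insightful and interesting. This makes it a good place to end.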