In recent months I’ve spent a rather large amount of my spare time on the What, Why and Huh? of Shellsort. Part of the problem was that I didn’t have much spare time. The other part was that Shellsort is a rather tricky thing to analyse once you take a closer look at it.
In the beginning, I was interested in the more standard sorting algorithms like Quicksort and Heapsort. But then I got hooked by the analysis of Shellsort that can be found in Knuth’s book. There are some quite intriguing ideas in it, and there’s still research going on about it – this came as a surprise to me: I had thought that sorting algorithms had been so very thoroughly researched, there wouldn’t be anything interesting left to be found out. But apparently, there’s still some work to do about the optimal parametrization of Shellsort.
Now, let’s start at the beginning. Shellsort is a generalization of Insertion Sort, which is called “bridge player’s sort” in Knuth’s book (Chapter 5.1 in Volume 3). The problem that these algorithms solve is: take the $N$ real numbers $a_1, \dots, a_N$ and sort them. Abstractly, this means: find a permutation $\sigma$ on the set $\{1, \dots, N\}$ such that $a_{\sigma(1)} \leq a_{\sigma(2)} \leq \dots \leq a_{\sigma(N)}$. Insertion Sort does this inductively: if $a_1, \dots, a_{j-1}$ have already been sorted, take $a_j$ and put it in its correct place among them by looking at $a_{j-1}$, then at $a_{j-2}$ and so on, until you have found it.
Insertion Sort is a very straightforward method; it’s pretty much what most people do when playing cards (hence Knuth’s nickname for this sorting algorithm). On the other hand, it’s not very efficient. In the average case, you’ll have about $N^2/4$ swapping operations, which is a lot. On sets of any usual size, the runtime is too long to be acceptable.
Shellsort improves on those many swaps. Obviously, when a number is “far away” from where it’s supposed to end up, you have to check it against many numbers in between. Well, then – just don’t. Let the elements make larger leaps instead of small steps. This means we will make several successive passes of Insertion Sort, but considering only a subset of elements each time. For instance, with some skip $h$, we’ll start with the set $a_1, a_{1+h}, a_{1+2h}, \dots$, then we’ll look at the set $a_2, a_{2+h}, a_{2+2h}, \dots$, and so on, until $a_h, a_{2h}, a_{3h}, \dots$. Of course, that will not, in general, sort the set and solve our problem; we have only sorted each subset that consists of the original numbers with skips of $h$ positions in between (we’ll call the set “$h$-sorted”). So, we make another pass, now with another skipping number $h' < h$. We continue until our original set is sorted. We can guarantee this by making a final pass with skip 1 (which amounts to a usual Insertion Sort).
The last sentence may make it seem that Shellsort is no use at all – in the end, you’ll always need to apply Insertion Sort to achieve the goal, and we’ve already seen that this is not efficient for practical purposes. But then again, the final pass of Insertion Sort won’t have a lot of work to do: those numbers that are “far away” and “very much out of order” will already have been put in a place very close to where they belong. The biggest issue about Insertion Sort has thus been resolved.
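To make the idea concrete, here is a minimal sketch of Shellsort in Python (my own illustration, not code from Knuth’s book); the skipping sequence is passed in as a parameter and must end with 1:

```python
def shellsort(a, gaps):
    """Shellsort: one Insertion-Sort pass per skip in `gaps`.

    Each pass h-sorts the list, i.e. it sorts every subsequence of
    elements that lie h positions apart. The last skip must be 1 so
    that the final pass is a plain Insertion Sort.
    """
    a = list(a)  # work on a copy
    for h in gaps:
        for j in range(h, len(a)):
            x, i = a[j], j - h
            # insert a[j] into the already h-sorted part, stepping by h
            while i >= 0 and a[i] > x:
                a[i + h] = a[i]
                i -= h
            a[i + h] = x
    return a

print(shellsort([155, 299, 923, 41, 152], [3, 1]))  # → [41, 152, 155, 299, 923]
```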
Let’s have a short example and let’s be real Knuthish here: We consider 16 random numbers that I have picked using R yesterday morning. First, we’ll do a 7-sorting pass on them: 155, 352 and 320 are compared and sorted; then 299, 893 and 974 are compared, and so on. After that, we have a 5-sorting pass, and then a 3-sorting pass. Finally, there’s Insertion Sort in the end and we arrive at a perfectly sorted set of 16 numbers.
original set: 155, 299, 923, 41, 152, 484, 145, 352, 893, 472, 291, 44, 664, 354, 320, 974

after 7-sorting: 155, 299, 472, 41, 44, 484, 145, 320, 893, 923, 291, 152, 664, 354, 352, 974

after 5-sorting: 155, 145, 320, 41, 44, 291, 152, 472, 354, 352, 484, 299, 664, 893, 923, 974

after 3-sorting: 41, 44, 291, 152, 145, 299, 155, 472, 320, 352, 484, 354, 664, 893, 923, 974

after 1-sorting (sorted set): 41, 44, 145, 152, 155, 291, 299, 320, 352, 354, 472, 484, 664, 893, 923, 974
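The passes above can be replayed in a few lines of Python (a sketch of my own, with the same 16 numbers and the same skips 7, 5, 3, 1):

```python
def h_sort(a, h):
    """One Insertion-Sort pass with skip h."""
    a = list(a)
    for j in range(h, len(a)):
        x, i = a[j], j - h
        while i >= 0 and a[i] > x:
            a[i + h] = a[i]
            i -= h
        a[i + h] = x
    return a

data = [155, 299, 923, 41, 152, 484, 145, 352,
        893, 472, 291, 44, 664, 354, 320, 974]
states = {}
a = data
for h in (7, 5, 3, 1):
    a = h_sort(a, h)
    states[h] = a  # snapshot after the h-sorting pass
```

After the final pass, `states[1]` is the fully sorted set.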
Now, let’s have a look at the analysis of the algorithm. The nice thing is a theorem to which Knuth has given the speaking name, well, “K”. If we have made some “$h$-sorting pass” over a file, followed by a “$k$-sorting pass”, the file will be both $h$-sorted and $k$-sorted. In particular, the subsequent passes will not render useless the work which we have done in previous passes. We improve on our set with every next pass. A final pass of “1-sorting” (i.e. Insertion Sort) concludes our algorithm. The proof of this is not really difficult, but it’s hard to write down, because there’s some muddling with indices.
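The theorem is easy to check empirically – this is just an illustration of the statement on random data, of course, not a proof:

```python
import random

def h_sort(a, h):
    """One Insertion-Sort pass with skip h."""
    a = list(a)
    for j in range(h, len(a)):
        x, i = a[j], j - h
        while i >= 0 and a[i] > x:
            a[i + h] = a[i]
            i -= h
        a[i + h] = x
    return a

def is_h_sorted(a, h):
    return all(a[i] <= a[i + h] for i in range(len(a) - h))

random.seed(1)
a = [random.randrange(1000) for _ in range(500)]
a = h_sort(a, 5)          # the file is now 5-sorted
a = h_sort(a, 3)          # a subsequent 3-sorting pass...
assert is_h_sorted(a, 3)
assert is_h_sorted(a, 5)  # ...did not destroy the 5-sortedness
```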
Then, the question must arise: what are the most efficient choices for the sorting passes? And this is where research comes in: nobody can tell what the best sequence is. The algorithm was conceived by Donald Shell (which is why it bears his name – the sorting has nothing to do with beaches or gas stations), and his choice was the sequence $\lfloor N/2 \rfloor, \lfloor N/4 \rfloor, \dots, 1$, halving the skip each time. This choice has been challenged quickly, because with such skips there will not be any interchange of numbers on odd and even places before the final pass. The nice thing about it is a divisibility condition: one skipping number is a multiple of the next one, which makes the analysis feasible (although still not easy) – Knuth devotes several pages to sequences like this one. If chosen correctly, such a sequence can achieve an average-case running time of $O(N^{3/2})$, which is much better than the $O(N^2)$ of Straight Insertion. Anything better than $N^{3/2}$ cannot be hoped for from sequences of this kind, because there will still be too many comparisons in the final 1-sorting pass.
It will be better if we drop the divisibility condition, so that the passes mix up the odd- and even-numbered places (or other places respectively, depending on the chosen skipping sequence). The downside is, we can’t say much about the average case for a given skipping sequence, and this is the problem about the analysis: there are many $h$-sorted sequences and many $k$-sorted sequences in a set of $N$ numbers, but their number need not be a divisor of $N!$, the number of possible permutations – hence, after a pass, some permutations appear more often than others.
However, already the sequence $h_k = 2^k - 1$ can do much better, allowing a worst-case complexity of $O(N^{3/2})$. This is a huge achievement, because it does not merely deal with the average case! Besides, the worst case under the divisibility condition will still be $\Theta(N^2)$.
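For concreteness, the two families of skips mentioned so far can be generated like this (a sketch of my own; the $2^k - 1$ skips are the ones due to Hibbard, and the function names are just my invention):

```python
def shell_gaps(n):
    """Shell's original skips: floor(N/2), floor(N/4), ..., 1.
    When N is a power of 2, each skip divides the previous one,
    so odd and even positions never mix before the final pass."""
    gaps, h = [], n // 2
    while h > 0:
        gaps.append(h)
        h //= 2
    return gaps

def hibbard_gaps(n):
    """Skips of the form 2^k - 1: 1, 3, 7, 15, ... (largest first).
    Neighbouring skips are coprime, which mixes up the positions."""
    gaps, h = [], 1
    while h <= n:
        gaps.append(h)
        h = 2 * h + 1
    return gaps[::-1]

print(shell_gaps(16))    # → [8, 4, 2, 1]
print(hibbard_gaps(16))  # → [15, 7, 3, 1]
```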
Let’s have a brief look at sequences with a very special property: let the sequence of $n$ numbers be 2-sorted and 3-sorted already; let’s call this a drowsy sequence (they’ll mostly appear towards the end of the Shellsort). This brings a lot of structure for the final 1-sorting pass: for instance, there will be at worst $\lfloor n/2 \rfloor$ inversions in it. You can see that by thinking about what this sequence must look like: it will start either with 1 or with 2 1 (anything else will contradict either the 2-sort or the 3-sort). Let’s use some induction now. For $n = 1$, nothing is to be done. For $n = 2$, the sequence can be 1 2 (no inversion, assertion holds) or 2 1 ($1 \leq \lfloor 2/2 \rfloor$ inversions, assertion still holds). For any other $n$, the sequence must be constructed
either from some drowsy sequence on $n-1$ numbers by putting $n$ at the very end of it. This construction gives no further inversions, so nothing changes. We had at most $\lfloor (n-1)/2 \rfloor$ inversions, and with $\lfloor (n-1)/2 \rfloor \leq \lfloor n/2 \rfloor$, the assertion still holds.
or from some drowsy sequence on $n-2$ numbers by putting $n$ and then $n-1$ at the very end of it. This adds one inversion to those that we already had, so we have at most $\lfloor (n-2)/2 \rfloor + 1$ inversions, and since $\lfloor (n-2)/2 \rfloor + 1 = \lfloor n/2 \rfloor$, the assertion holds.
With any other construction the sequence won’t be drowsy anymore. q.e.d.
This inductive construction can be used for some unexpected mathematics as well. For instance, there are only $F_{n+1}$ drowsy sequences on $n$ numbers, where $F_k$ denotes the $k$th Fibonacci number. Each drowsy sequence on $n-1$ numbers and each drowsy sequence on $n-2$ numbers gives a unique drowsy sequence on $n$ numbers, and by induction, those were $F_n$ and $F_{n-1}$ many, with $F_n + F_{n-1} = F_{n+1}$. Digging slightly deeper, one can prove similarly how many inversions all drowsy sequences on $n$ numbers have in total. A linear function of the Fibonacci numbers again.
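Both claims – the inversion bound and the Fibonacci count – can be verified by brute force for small $n$ (a quick Python check of my own):

```python
from itertools import permutations

def is_drowsy(p):
    """2-sorted and 3-sorted at the same time."""
    return all(p[i] <= p[i + h] for h in (2, 3) for i in range(len(p) - h))

def inversions(p):
    return sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

fib = [1, 1]  # F_1, F_2, ...
while len(fib) < 12:
    fib.append(fib[-1] + fib[-2])

counts = []
for n in range(1, 9):
    drowsy = [p for p in permutations(range(1, n + 1)) if is_drowsy(p)]
    counts.append(len(drowsy))
    # no drowsy sequence on n numbers has more than floor(n/2) inversions
    assert max(inversions(p) for p in drowsy) <= n // 2

assert counts == fib[1:9]  # F_{n+1} drowsy sequences on n numbers
```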
When I found this tiny gem in Knuth’s book, I was honestly surprised about this beautiful connection. Now that I’ve thought about it long enough, it doesn’t really amaze me anymore. It’s another instance of thinking about something for a couple of days and then saying “oh, of course, it’s trivial!”
The next astonishing idea is due to Vaughan Pratt: choose the numbers $2^p 3^q$ as skipping sequence and arrive at a complexity of $O(N (\log N)^2)$! Such a primitive sequence and what a huge improvement on the worst-case timing! This sequence somewhat chooses the best of both worlds, using many divisors and many coprime skips to bring “order” to the “chaos” (actually, those quotations were unintentional…). Sadly, there’s a lot of overhead in this sequence: there are many passes which will do very little and there’s much bookkeeping work to be done. Thus, we cannot hope to have any use for Pratt’s sequence in practical applications on real-world sets. But for theoretical purposes, this is a very amazing object to look at. And here’s the icing on top of all this: the proof is not really hard, not even lengthy (considering my sloppy style of writing here). Let’s see:
First, how many passes will be necessary with Pratt’s sequence? That would be the cardinality of the set $\{(p, q) \in \mathbb{Z}_{\geq 0}^2 : 2^p 3^q \leq N\}$, which is described by the inequality $p \log 2 + q \log 3 \leq \log N$. This inequality defines a set of integer points inside a triangle in the two-dimensional plane. The triangle is given by the two axes of the plane and the line defined by $x \log 2 + y \log 3 = \log N$. The points where the line intersects the axes are, of course, $\frac{\log N}{\log 2}$ and $\frac{\log N}{\log 3}$. In the picture below, the integer points of interest are marked, as well as the triangle described so far. The area of the triangle is $\frac{(\log N)^2}{2 \log 2 \log 3}$. The number of integer points inside the triangle is approximately equal to this number, because each of these integer points corresponds bijectively to a square of unit length (the integer point is the lower left point of each square). You get a little fuss about what happens on the diagonal and about the square at the intersection of the axes, but that’s what the $O$-notation is for. No longer caring about this fuss and about what kind of logarithm to use (they only differ by a constant anyway), we find that Pratt’s sequence demands $O((\log N)^2)$ passes.
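A quick numerical sanity check of this counting argument (a sketch of my own): count the pairs exactly and compare with the triangle area, which should match up to lower-order boundary terms.

```python
import math

def pratt_pass_count(n):
    """Number of pairs (p, q) with 2^p * 3^q <= n."""
    count, pow2 = 0, 1
    while pow2 <= n:
        pow23 = pow2
        while pow23 <= n:
            count += 1
            pow23 *= 3
        pow2 *= 2
    return count

N = 10**6
exact = pratt_pass_count(N)
approx = math.log(N) ** 2 / (2 * math.log(2) * math.log(3))  # triangle area
# the discrepancy comes from the cells along the diagonal: O(log N) of them
assert abs(exact - approx) < 3 * math.log(N)
```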
Finally, what cost does each pass have? In an $h$-pass, where $h = 2^p 3^q$ with suitable $p$ and $q$, the file will already be $2h$- and $3h$-sorted. If $2h$ or $3h$ is larger than $N$, this is extending the definition of “sorted” trivially; if not, this follows from what the algorithm has achieved so far and from Knuth’s “Theorem K”. Now, from this you can conclude that there won’t be too many inversions left in the considered subsequence: in one $h$-pass, a subsequence of about $N/h$ numbers is to be considered (this is repeated $h$ times during this stage of the algorithm before the next skipping number comes in), and by construction, this subsequence is already 2-sorted and 3-sorted. Thankfully, we already talked about those drowsy sequences. Each has at most $\frac{1}{2} \cdot \frac{N}{h}$ inversions in it (its length is about $N/h$ here, not $n$ as we had above) and we have to consider $h$ such sequences during this pass. Hence, the total number of inversions that may be cured by the pass is at most $h \cdot \frac{1}{2} \cdot \frac{N}{h} = \frac{N}{2}$.
So, the cost for Shellsort with Pratt’s sequence is $O((\log N)^2)$ passes of $O(N)$ each, which makes $O(N (\log N)^2)$, as stated. q.e.d.
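To watch the $O(N)$ bound per pass in action, here is a sketch (mine, not Pratt’s code) that runs Shellsort with the $2^p 3^q$ skips and counts the shifts of each pass – each shift removes exactly one inversion, so by the argument above no pass should ever need more than about $N/2$ of them:

```python
import random

def pratt_gaps(n):
    """All numbers 2^p * 3^q <= n, largest first."""
    gaps, pow2 = set(), 1
    while pow2 <= n:
        pow23 = pow2
        while pow23 <= n:
            gaps.add(pow23)
            pow23 *= 3
        pow2 *= 2
    return sorted(gaps, reverse=True)

def shellsort_pratt(a):
    a = list(a)
    per_pass = []
    for h in pratt_gaps(len(a)):
        moves = 0
        for j in range(h, len(a)):
            x, i = a[j], j - h
            while i >= 0 and a[i] > x:
                a[i + h] = a[i]   # one shift = one inversion cured
                i -= h
                moves += 1
            a[i + h] = x
        per_pass.append(moves)
    return a, per_pass

random.seed(42)
data = [random.randrange(10**6) for _ in range(1000)]
out, per_pass = shellsort_pratt(data)
assert out == sorted(data)
assert all(m <= len(data) // 2 for m in per_pass)  # O(N) work per pass
```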
Knuth then goes on to look at bounds that hinge on deeper number-theoretic relations between the skips. By choosing those cleverly, one can even get down to $O(N^{4/3})$ – even for real-world sets, this may be a suitable choice, since the constant implied by the $O$-notation is usually much smaller than with Pratt’s sequence. Pratt’s sequence is way better for astronomically large $N$, though. I admit, I shied away from the proofs of those bounds… I may return to those some time later. At first glance, they didn’t seem as appealing as those that I rephrased here. But maybe I’m just being too picky and not doing justice to those ideas.
As of now, apparently no sequence that achieves the famous $O(N \log N)$ bound is known. I would be very interested to learn if one could actually prove this to be impossible.
Now, for practical purposes, Shellsort will not be your first choice as a sorting algorithm. It mostly has the merits of being very easy to implement, being quite fast on partially sorted sets and being in-place (not using too much extra space while sorting). I have turned to Knuth’s next chapter by now, which deals with exchanging sorts (like Bubblesort and, in particular, Quicksort) – in the beginning I just wanted to skip Shellsort to get to the next chapter faster. But I have dwelled there for quite some time now, and for good reason. Shellsort allows for much deeper mathematics than I had anticipated, and I have learned a great deal about the principles that underlie the theory of algorithms. And there is still research to do. Amazing stuff.