Sorting items in a linked list
Purpose
This document describes a sorting algorithm that is very efficient and well suited for single
linked lists. It is a stable merging sorting algorithm with O(N.log(N)) worst case performance
and O(N) best case performance.
Introduction: linked lists
If you have a set of objects within a program you can store them in a few different ways.
Arrays are the most common way to store a set of objects but other data structures are also
possible, like linked lists and trees, each having their own advantages/disadvantages.
Each object needs to have a Next property that points to the next object in line.
The list can be passed on by passing the first item of the list. The second item can be found
via the first item via Next, the third via the Next
property of the second item etc etc. The end of the list is reached if the Next
property is unassigned. This is called a single linked list. If each object also has a
Previous property it is called a double linked list.
Figure 1. A sample single linked list.
Insertion in a linked list is extremely fast and no array or whatever is needed to keep the list.
Traversing the list is easy but a linked list has no fast random access: get item number 123.
If the list is a double linked list, deletions are very fast too.
Introduction: Sorting algorithms
There are numerous different sorting algorithms.
Most sorting algorithms describe sorting in an array. They tend to sort items
by comparing two items and swapping their places (or not). Some are more efficient
than others. For example the simple bubble sort algorithm tends to become four times
slower when the number of items doubles in the array. The quicksort algorithm does better,
but can exhibit the same behavior on some specific inputs.
The efficiency of a sorting algorithm is usually given by the expected and worst case
amount of compares needed to sort N items. For bubblesort the expected amount of compares is
O(N.N), for quicksort it is O(N.log(N)). Especially if N is large quicksort
becomes faster and faster compared to Bubblesort since log(N) is smaller than N if N > 1.
Quicksort, in worst case, also needs O(N.N) compares and swaps.
Most algorithms are not well suited for single linked lists. They assume that you have full random
access or forward and backward access. In a single linked list you only have access to the first
item and from there you only can traverse forward, scanning the items one by one;
no random access whatsoever.
Merge sort
A merging sort algorithm merges small sorted lists to larger sorted lists.
Merging two sorted lists is quite a simple algorithm: as long as both lists contain items,
compare the top of each list with each other and place the smallest item on the bottom of
another list. If one list is exhausted, the new list can be placed on top that one and we end
up with a single sorted list.
The algorithm starts with N 'lists' each containing a single item. Single item lists are always
sorted by nature. The first pass merges those N lists to N / 2 lists of length 2.
The next pass merges it to N / 4 lists of length 4. After log(N) merge passes we end
up with a single merged list of all N items. In each pass the algorithm must do N compares max.
So the algorithm must do O(N.log(N)) on average and worst case.
Here the process is depicted in a figure:
Figure 2. Merging four 3 item lists into one.
Keeping a stack of merged lists
If you use the algorithm as described above you need to keep track of N lists max. This means
you need quite an array for bookkeeping, or another reference in each object. There is a simple
trick that relieves this bookkeeping to log(N) + 1 lists max.
If we keep a small array of lists, the stack, with the first slot can contain only lists of
length 1. The second slot can only contain a list of length 2, the third slot can only contain
lists of length 4 etc. If we want to place a list of 1 item in the first slot, we succeed if it
was empty or, if do a merge with the list in the first slot and empty the first slot. We now
have a list of size 2 and can do the same with the second slot etc, until we reach a slot
without a list. Since each slot in the small array keeps the double amount of items of the
previous slot we only need log(N) + 1 slots in the array. A general maximum of 32
would be enough for 32 bit processors.
After all items are added, we end up with 1 or more slots being occupied by lists. We first scan
for the first list we can find and keep on scanning the whole array and propagate that list to
the next item each step, doing a merge if that slot was occupied. We end with the full sorted
list in the last slot of the array.
It makes the algorithm even faster. Since we access recently accessed items more often the
caches of modern processors can do a better job and enhance the performance even further.
Figure 3. Listsizes while merge sorting 5 items with a stack.
Using natural order
Instead of creating single item lists to start with, we can use the natural order of the items
in the unsorted list. The first item is always picked up. As long as there is an item and that
item is equal or larger that the last item found, we can append it. If not, we keep the list
found so far and start again with the smaller item found. This adds a maximum of N  1
compares to the algorithm. If we keep a special case for two items and swap those in order we
make an extra N / 2 compares compared to the situation without using natural order.
On the other hand if the list was already sorted we only need N  1 compares to verify
that and we're done. The best case performance is O(N), and the worst case still O(N.log(N)).
This improvement also works great if the unsorted list contains sorted streaks.
Some Delphi code implementing the sorting algorithm
If you can 'read' Delphi code you can read a quite optimized implementation of the sorting
algorithm below.
TBaseItemComparator = function(ItemA, ItemB: TBaseItem): Boolean;
procedure TBaseList.Sort(LessOrEqual: TBaseItemComparator;
UseNaturalOrder: Boolean);
var
i: Integer;
Item, Next, LastItem, StackItem, Sentinel: TBaseItem;
Stack: array [..] of TBaseItem;
begin
if FCount <= then Exit;
Sentinel := TBaseItem.Create;
try
for i := Low(Stack) to High(Stack) do Stack[i] := nil;
Item := FFirst;
if UseNaturalOrder then begin
repeat
Next := Item.FNext;
if Assigned(Next) then begin
if LessOrEqual(Item,Next) then begin
repeat
StackItem := Next;
Next := Next.FNext;
until (not Assigned(Next)) or (not LessOrEqual(StackItem,Next));
end else begin
StackItem := Item;
Item := Next;
Next := Next.FNext;
Item.FNext := StackItem;
end;
StackItem.FNext := nil;
end else begin
Item.FNext := nil;
end;
i := ;
while Assigned(Stack[i]) do begin
LastItem := Sentinel;
StackItem := Stack[i];
Stack[i] := nil;
repeat
if LessOrEqual(StackItem,Item) then begin
LastItem.FNext := StackItem;
LastItem := StackItem;
StackItem := StackItem.FNext;
if Assigned(StackItem) then Continue;
LastItem.FNext := Item;
Break;
end else begin
LastItem.FNext := Item;
LastItem := Item;
Item := Item.FNext;
if Assigned(Item) then Continue;
LastItem.FNext := StackItem;
Break;
end;
until False;
Item := Sentinel.FNext;
Inc(i);
end;
Stack[i] := Item;
Item := Next;
until not Assigned(Item);
end else begin
repeat
Next := Item.FNext;
Item.FNext := nil;
i := ;
while Assigned(Stack[i]) do begin
LastItem := Sentinel;
StackItem := Stack[i];
Stack[i] := nil;
repeat
if LessOrEqual(StackItem,Item) then begin
LastItem.FNext := StackItem;
LastItem := StackItem;
StackItem := StackItem.FNext;
if Assigned(StackItem) then Continue;
LastItem.FNext := Item;
Break;
end else begin
LastItem.FNext := Item;
LastItem := Item;
Item := Item.FNext;
if Assigned(Item) then Continue;
LastItem.FNext := StackItem;
Break;
end;
until False;
Item := Sentinel.FNext;
Inc(i);
end;
Stack[i] := Item;
Item := Next;
until not Assigned(Item);
end;
i := ;
while not Assigned(Stack[i]) do Inc(i);
Item := Stack[i];
Inc(i);
while i < Length(Stack) do begin
if Assigned(Stack[i]) then begin
LastItem := Sentinel;
StackItem := Stack[i];
Stack[i] := nil;
repeat
if LessOrEqual(StackItem,Item) then begin
LastItem.FNext := StackItem;
LastItem := StackItem;
StackItem := StackItem.FNext;
if Assigned(StackItem) then Continue;
LastItem.FNext := Item;
Break;
end else begin
LastItem.FNext := Item;
LastItem := Item;
Item := Item.FNext;
if Assigned(Item) then Continue;
LastItem.FNext := StackItem;
Break;
end;
until False;
Item := Sentinel.FNext;
end;
Inc(i);
end;
FFirst := Item;
LastItem := nil;
repeat
Item.FPrev := LastItem;
LastItem := Item;
Item := Item.FNext;
until not Assigned(Item);
FLast := LastItem;
finally
Sentinel.Free;
end;
end;
