What is Timsort?

Overview

Timsort is a hybrid sorting algorithm. It is used to give optimal results while dealing with real-world data.

Timsort has been derived from Insertion Sort and Binary Merge Sort. At first, the array is stored into smaller chunks of data known as runs. These runs are sorted using insertion sort and then merged using the merge sort.

Algorithm

Step 1: The array is broken down into runs.

Preferably, the size of the run (minrun) should be such that (n/minrun) is a power of 2 (since merge sort gives optimal results in that case).

Step 2: Initialize the size of the run as 32 or 64.

Step 3: Sort every run using insertion sort. If the array size is smaller than 64, then it is just sorted using insertion sort.

Step 4: Merge all the runs one by one using merge sort.

Step 5: After every iteration, double the size of the merged sub-array.

Visual representation

Here is a visual representation of the Timsort. For the sake of simplicity, the minrun is of length 4.

C++ implementation example

#include <iostream>
using namespace std;
const int RUN = 64;
int min(int x, int y)
{
if(x<y)
{
return x;
}
else
{
return y;
}
}
// insertion sort algorithm
void insertionSort(int* array, int start_index, int end_index)
{
for (int i = start_index + 1, j , temp; i <= end_index; i++)
{
temp = array[i];
j = i - 1;
while(j>=0 && temp <= array[j])
{
array[j+1] = array[j];
j = j-1;
}
array[j+1] = temp;
}
}
// merge function
void merge(int* array, int start_index, int mid, int end_index)
{
int i, j, k;
int left_size = mid - start_index + 1;
int right_size = end_index - mid;
int left_array[left_size], right_array[right_size];
for (int i = 0; i < left_size; i++)
{
left_array[i] = array[start_index + i];
}
for (int j = 0; j < right_size; j++)
{
right_array[j] = array[mid + 1 + j];
}
i = 0, j=0 , k = start_index;
for (; i < left_size && j < right_size; k++)
{
if(left_array[i] <= right_array[j])
{
array[k] = left_array[i];
i++;
}
else
{
array[k] = right_array[j];
j++;
}
}
for (; i<left_size ;i++ , k++)
{
array[k] = left_array[i];
}
for (; j<right_size; j++ , k++)
{
array[k] = right_array[j];
}
}
// timsort
void timsort(int* array, int size)
{
for (int i = 0; i < size; i=i +RUN)
{
insertionSort(array, i, min((i+RUN-1), (size-1)));
}
for (int n = RUN; n < size; n*= 2)
{
for (int start_index = 0; start_index < size; start_index += 2*n)
{
int mid = start_index + n - 1;
int end_index = min((start_index + 2*n - 1),(size-1));
if(mid < end_index)
{
merge(array, start_index, mid, end_index);
}
}
}
}
int main()
{ int* array ;
int size ;
cout << "Enter size:" << endl ;
cin >> size ;
array= new int[size] ;
cout << "Enter elements:" << endl ;
for (int i=0 ;i < size ;i++)
{
cin >> array[i] ;
}
timsort (array, size) ; // calling tim sort function
cout << "Sorted Array:" << endl ;
for (int i=0 ;i < size ;i++)
cout << array[i] << " " ;
cout << endl ;
return 0;
}

Explanation

  • Line 3: We initialize the value of RUN with a value 64.
  • Lines 97–107: We take the size and elements of an array as an input from the user and call the timsort(array, size) function.
  • Lines 17–31: We use insertion sort to sort all the runs. In case if the size is smaller than 64 then only one run is made and is sorted using insertion sort.
  • Lines 33–94: Merge all the sorted runs.

Python Implementation

Timsort is actually built into Python, so to use the Tim sort, we just have to write either of the given two lines of code:

# you can use one of the following code lines
#option 1
list.sort()
#option 2
sorted(list)

Time Complexity

Best Case

O(n)

Average Case

O(n log n)

Worst Case

O(n log n)

Space Complexity

O(n)

Advantages

  • Stable Algorithm
  • Adaptive Algorithm (handles input data effectively based on a size of the input)
  • Deals with real-world data