Sorting Algorithms
In computer science, a sorting algorithm takes a collection of items and arranges them in a specific order. This order is usually determined by comparing the items using a defined rule.
Real Life Example of Sorting
- Sorting a deck of cards
- Sorting names in alphabetical order
- Sorting a list of items, etc.
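Before looking at hand-written algorithms, note that the "defined rule" is usually expressed as a comparison or key function. As a quick illustration using Python's built-in sorted (an illustrative snippet, not one of the algorithms below):

```python
words = ["banana", "Cherry", "apple"]

# Default rule: compare strings directly; uppercase letters sort before lowercase.
print(sorted(words))                 # ['Cherry', 'apple', 'banana']

# Custom rule: compare lowercased versions, giving a case-insensitive order.
print(sorted(words, key=str.lower))  # ['apple', 'banana', 'Cherry']
```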
Some common sorting techniques:
1. Bubble Sort
Bubble sort is a basic sorting technique that repeatedly steps through a list, comparing neighboring elements and swapping them if they are out of order. While easy to understand, bubble sort becomes inefficient for large datasets because of its quadratic running time.
Algorithm Overview:
- Pass by Pass: During each pass, the algorithm iterates through the list.
- Comparing Neighbors: In each iteration, it compares adjacent elements in the list.
- Swapping for Order: If the elements are in the wrong order (typically, the first being larger than the second), it swaps their positions.
- Bubbling Up the Largest: This swapping process effectively pushes the largest element encountered in a pass towards the end of the list, like a bubble rising in water.
- Repeating Until Sorted: The algorithm continues making passes through the list until no more swaps are needed. This indicates the entire list is sorted.
Bubble Sort Code in Python
def bubble_sort(arr):
    n = len(arr)
    for i in range(n):
        for j in range(0, n - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]

arr = [5, 3, 8, 1, 2]
bubble_sort(arr)
print("Sorted array:", arr)  # Output: [1, 2, 3, 5, 8]
Example with Visualization
Let's sort the list [5, 3, 8, 1, 2] using bubble sort.

Pass 1:
- Compare 5 and 3, swap: [3, 5, 8, 1, 2]
- Compare 5 and 8, no swap: [3, 5, 8, 1, 2]
- Compare 8 and 1, swap: [3, 5, 1, 8, 2]
- Compare 8 and 2, swap: [3, 5, 1, 2, 8]
- Result: [3, 5, 1, 2, 8] (8 has bubbled to the end)

Pass 2:
- Compare 3 and 5, no swap: [3, 5, 1, 2, 8]
- Compare 5 and 1, swap: [3, 1, 5, 2, 8]
- Compare 5 and 2, swap: [3, 1, 2, 5, 8]
- Result: [3, 1, 2, 5, 8]

Pass 3:
- Compare 3 and 1, swap: [1, 3, 2, 5, 8]
- Compare 3 and 2, swap: [1, 2, 3, 5, 8]
- Result: [1, 2, 3, 5, 8]

Pass 4:
- No swaps needed; the list is sorted.
Complexity Analysis
- Worst Case: O(n^2) comparisons and swaps. This happens when the list is in reverse order and the maximum number of swaps is needed.
- Best Case: O(n) comparisons and O(1) swaps. This occurs when the list is already sorted and the algorithm stops after a single pass in which no swaps are made. (This requires the early-exit check described in "Repeating Until Sorted"; the basic code above always performs O(n^2) comparisons.)
- Average Case: O(n^2) comparisons and swaps. This is the expected number of comparisons and swaps over all possible input sequences.
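The O(n) best case relies on stopping early once a full pass makes no swaps. A minimal sketch of that optimized variant (the `swapped` flag name is just illustrative):

```python
def bubble_sort_optimized(arr):
    n = len(arr)
    for i in range(n):
        swapped = False
        for j in range(n - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        if not swapped:  # a full pass with no swaps: the list is sorted
            break

arr = [5, 3, 8, 1, 2]
bubble_sort_optimized(arr)
print(arr)  # [1, 2, 3, 5, 8]
```

On an already-sorted input, the first pass makes no swaps, so the loop exits after n - 1 comparisons.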
2. Selection Sort
Selection sort is a simple sorting algorithm that divides the input list into two parts: a sorted sublist and an unsorted sublist. The algorithm repeatedly finds the smallest (or largest, depending on sorting order) element from the unsorted sublist and moves it to the sorted sublist. It's not efficient for large datasets but performs better than bubble sort due to fewer swaps.
Algorithm Overview:
- Initial State: The entire list is considered unsorted initially.
- Selecting the Minimum: The algorithm repeatedly selects the smallest element from the unsorted sublist and moves it to the sorted sublist.
- Expanding the Sorted Sublist: As elements are moved to the sorted sublist, it expands until all elements are sorted.
- Repeating Until Sorted: The process continues until the entire list is sorted.
Example with Visualization
Let's sort the list [5, 3, 8, 1, 2] using selection sort.

Pass 1:
- Initial list: [5, 3, 8, 1, 2]
- Find the minimum of the unsorted part: 1
- Swap with the first element: [1, 3, 8, 5, 2]

Pass 2:
- Initial list: [1, 3, 8, 5, 2]
- Find the minimum of the unsorted part: 2
- Swap with the second element: [1, 2, 8, 5, 3]

Pass 3:
- Initial list: [1, 2, 8, 5, 3]
- Find the minimum of the unsorted part: 3
- Swap with the third element: [1, 2, 3, 5, 8]

Pass 4:
- Initial list: [1, 2, 3, 5, 8]
- Find the minimum of the unsorted part: 5
- Already in place; no swap needed. The list is sorted.
Selection Sort Code in Python
def selection_sort(arr):
    n = len(arr)
    for i in range(n):
        min_index = i
        for j in range(i + 1, n):
            if arr[j] < arr[min_index]:
                min_index = j
        arr[i], arr[min_index] = arr[min_index], arr[i]

arr = [5, 3, 8, 1, 2]
selection_sort(arr)
print("Sorted array:", arr)  # Output: [1, 2, 3, 5, 8]
Complexity Analysis
- Worst Case: O(n^2) comparisons and O(n) swaps. This occurs when the list is in reverse order and the maximum number of comparisons is needed.
- Best Case: O(n^2) comparisons and O(n) swaps. Even when the list is already sorted, the algorithm still scans the entire unsorted portion to find each minimum.
- Average Case: O(n^2) comparisons and O(n) swaps. This is the expected number of comparisons and swaps over all possible input sequences.
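One property worth knowing: selection sort is not stable, because the long-distance swap can reorder equal keys. A small demonstration (the key-based variant and the sample data are illustrative, not from the text above):

```python
def selection_sort_by_key(arr, key):
    # Same algorithm as above, comparing elements by key(x) instead of x.
    n = len(arr)
    for i in range(n):
        min_index = i
        for j in range(i + 1, n):
            if key(arr[j]) < key(arr[min_index]):
                min_index = j
        arr[i], arr[min_index] = arr[min_index], arr[i]

# (value, label) pairs; "a" and "b" share the key 2 and start in that order.
items = [(2, "a"), (2, "b"), (1, "c")]
selection_sort_by_key(items, key=lambda t: t[0])
print(items)  # [(1, 'c'), (2, 'b'), (2, 'a')] -- "a" and "b" swapped places
```

The first pass swaps (2, "a") to the back of the list, past its equal-keyed neighbor, which is exactly the instability.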
3. Quick Sort
Quick sort is a popular divide-and-conquer sorting algorithm known for its efficiency on average. It works by selecting a 'pivot' element from the array and partitioning the other elements into two sub-arrays according to whether they are less than or greater than the pivot. The sub-arrays are then recursively sorted.
Algorithm Overview:
- Pivot Selection: Choose a pivot element from the array. Common strategies include selecting the first, last, middle, or a randomly chosen element.
- Partitioning: Rearrange the array so that all elements less than the pivot are on its left, and all elements greater than the pivot are on its right. This step ensures that the pivot element is placed in its correct sorted position.
- Recursion: Apply the above steps recursively to the sub-arrays formed by partitioning until the base case is reached. The base case is usually when the size of the sub-array becomes 0 or 1, indicating it is already sorted.
- Base Case: If the sub-array size becomes 0 or 1, it is already sorted.
Example with Visualization
Let's sort the list [5, 3, 8, 1, 2] using quick sort.

- Initial Array: [5, 3, 8, 1, 2]
- Choose Pivot: Let's choose the last element, 2, as the pivot.
- Partitioning: Rearrange the array around the pivot 2. All elements less than 2 are placed to its left, and all elements greater than 2 to its right. After partitioning, the array becomes [1, 2, 5, 3, 8], and the pivot 2 is in its correct sorted position.
- Recursion: Now we recursively sort the sub-arrays [1] and [5, 3, 8].
  - [1] has a single element, so it is already sorted.
  - For [5, 3, 8], we choose 8 as the pivot. Partitioning leaves [5, 3, 8] with 8 in its correct sorted position, and recursively sorting the remaining sub-array [5, 3] yields [3, 5, 8].
- Concatenation: Concatenating the sorted pieces [1], [2], and [3, 5, 8], we get the final sorted array [1, 2, 3, 5, 8].
Quick Sort Code in Python (Iterative)
def partition(arr, low, high):
    pivot = arr[high]
    i = low - 1
    for j in range(low, high):
        if arr[j] < pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1

def quick_sort_iterative(arr):
    stack = [(0, len(arr) - 1)]
    while stack:
        low, high = stack.pop()
        if low < high:
            pi = partition(arr, low, high)
            stack.append((low, pi - 1))
            stack.append((pi + 1, high))

# Example usage:
arr = [38, 27, 43, 3, 9, 82, 10]
quick_sort_iterative(arr)
print("Sorted array:", arr)  # Output: [3, 9, 10, 27, 38, 43, 82]
Quick Sort Code in Python (Recursive)
def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    else:
        pivot = arr[-1]
        left = [x for x in arr[:-1] if x < pivot]
        right = [x for x in arr[:-1] if x >= pivot]
        return quick_sort(left) + [pivot] + quick_sort(right)

arr = [5, 3, 8, 1, 2]
sorted_arr = quick_sort(arr)
print("Sorted array:", sorted_arr)  # Output: [1, 2, 3, 5, 8]
Complexity Analysis
- Worst Case: O(n^2). This occurs when the pivot selection consistently results in unbalanced partitioning, such as always choosing the smallest or largest element as the pivot.
- Best Case: O(n log n). This happens when the pivot selection leads to well-balanced partitioning, halving the array size in each recursive call.
- Average Case: O(n log n). This is the expected time complexity when the pivot selection results in reasonably balanced partitioning across recursive calls.
- Space Complexity: O(log n) on average for the recursion (or explicit) stack; in the worst case of fully unbalanced partitions this grows to O(n).
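To make the O(n^2) worst case unlikely in practice, the pivot can be chosen at random. A sketch of that variant, following the same recursive scheme as the code above (`quick_sort_randomized` is an illustrative name):

```python
import random

def quick_sort_randomized(arr):
    # Random pivot: no fixed input can reliably trigger the worst case.
    if len(arr) <= 1:
        return arr
    pivot = random.choice(arr)
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]   # handles duplicate pivots
    right = [x for x in arr if x > pivot]
    return quick_sort_randomized(left) + middle + quick_sort_randomized(right)

print(quick_sort_randomized([5, 3, 8, 1, 2]))  # [1, 2, 3, 5, 8]
```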
4. Merge Sort
Merge sort is a divide-and-conquer algorithm that recursively divides the input list into smaller sublists until each sublist contains only one element. Then, it repeatedly merges adjacent sublists while maintaining the sorted order until there is only one sublist remaining, which represents the sorted list.
Algorithm Overview:
- Divide: Split the input list into smaller sublists recursively until each sublist contains only one element.
- Merge: Repeatedly merge adjacent sublists while maintaining the sorted order until there is only one sublist remaining, which represents the sorted list.
Example with Visualization
Let's sort the list [38, 27, 43, 3, 9, 82, 10] using merge sort.

Initial Division:
- Divide the list into single-element sublists: [38], [27], [43], [3], [9], [82], [10]

Merge Passes (merge adjacent sublists while maintaining sorted order):
- Pass 1: [27, 38], [3, 43], [9, 82], [10]
- Pass 2: [3, 27, 38, 43], [9, 10, 82]
- Pass 3: [3, 9, 10, 27, 38, 43, 82]

Final Sorted List: [3, 9, 10, 27, 38, 43, 82]
Merge Sort Code in Python (Iterative)
def merge_sort_iterative(arr):
    n = len(arr)
    curr_size = 1
    while curr_size < n:
        left = 0
        while left < n - 1:
            mid = min(left + curr_size - 1, n - 1)
            right = min(left + 2 * curr_size - 1, n - 1)
            merge(arr, left, mid, right)
            left += 2 * curr_size
        curr_size *= 2

def merge(arr, left, mid, right):
    n1 = mid - left + 1
    n2 = right - mid
    L = [0] * n1
    R = [0] * n2
    for i in range(n1):
        L[i] = arr[left + i]
    for j in range(n2):
        R[j] = arr[mid + 1 + j]
    i = j = 0
    k = left
    while i < n1 and j < n2:
        if L[i] <= R[j]:
            arr[k] = L[i]
            i += 1
        else:
            arr[k] = R[j]
            j += 1
        k += 1
    while i < n1:
        arr[k] = L[i]
        i += 1
        k += 1
    while j < n2:
        arr[k] = R[j]
        j += 1
        k += 1

arr = [38, 27, 43, 3, 9, 82, 10]
merge_sort_iterative(arr)
print("Sorted array:", arr)  # Output: [3, 9, 10, 27, 38, 43, 82]
Merge Sort Code in Python (Recursive)
def merge_sort(arr):
    if len(arr) > 1:
        mid = len(arr) // 2
        left_half = arr[:mid]
        right_half = arr[mid:]
        merge_sort(left_half)
        merge_sort(right_half)
        i = j = k = 0
        # Merge the two sorted halves
        while i < len(left_half) and j < len(right_half):
            if left_half[i] <= right_half[j]:  # <= keeps the sort stable
                arr[k] = left_half[i]
                i += 1
            else:
                arr[k] = right_half[j]
                j += 1
            k += 1
        # Copy any elements remaining in the left half
        while i < len(left_half):
            arr[k] = left_half[i]
            i += 1
            k += 1
        # Copy any elements remaining in the right half
        while j < len(right_half):
            arr[k] = right_half[j]
            j += 1
            k += 1

# Example usage:
arr = [38, 27, 43, 3, 9, 82, 10]
merge_sort(arr)
print("Sorted array:", arr)  # Output: [3, 9, 10, 27, 38, 43, 82]
Complexity Analysis
- Time Complexity: O(n log n) for all cases. Merge sort always divides the list into halves until each sublist contains only one element, then merges them back together, regardless of the initial order.
- Space Complexity: O(n) auxiliary space for the temporary sublists created during merging, in both the iterative and recursive versions.
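The merge step itself is available in Python's standard library as heapq.merge, which lazily combines already-sorted inputs. For instance, merging the Pass 2 sublists from the example above:

```python
import heapq

left = [3, 27, 38, 43]
right = [9, 10, 82]
# heapq.merge returns an iterator over the combined sorted sequence.
print(list(heapq.merge(left, right)))  # [3, 9, 10, 27, 38, 43, 82]
```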
5. Insertion Sort
Insertion sort is a straightforward and efficient sorting algorithm for small datasets. It builds the final sorted array one element at a time. It is much like sorting playing cards in your hands: you take one card at a time and insert it into its correct position among the already sorted cards.
Algorithm Overview:
- Start from the Second Element: Begin with the second element, assuming the first element is already sorted.
- Compare with Sorted Subarray: Take the current element and compare it with elements in the sorted subarray (the part of the array before the current element).
- Insert in Correct Position: Shift all elements in the sorted subarray that are greater than the current element to one position ahead. Insert the current element into its correct position.
- Repeat Until End: Repeat this process for all elements in the array.
Example with Visualization
Let's sort the list [5, 3, 8, 1, 2] using insertion sort.
Step-by-Step Visualization:
Initial List: [5, 3, 8, 1, 2]

Pass 1:
- Current element: 3
- Compare 3 with 5, move 5 to the right: [5, 5, 8, 1, 2]
- Insert 3 in its correct position: [3, 5, 8, 1, 2]

Pass 2:
- Current element: 8
- 8 is already in the correct position: [3, 5, 8, 1, 2]

Pass 3:
- Current element: 1
- Compare 1 with 8, move 8 to the right: [3, 5, 8, 8, 2]
- Compare 1 with 5, move 5 to the right: [3, 5, 5, 8, 2]
- Compare 1 with 3, move 3 to the right: [3, 3, 5, 8, 2]
- Insert 1 in its correct position: [1, 3, 5, 8, 2]

Pass 4:
- Current element: 2
- Compare 2 with 8, move 8 to the right: [1, 3, 5, 8, 8]
- Compare 2 with 5, move 5 to the right: [1, 3, 5, 5, 8]
- Compare 2 with 3, move 3 to the right: [1, 3, 3, 5, 8]
- Insert 2 in its correct position: [1, 2, 3, 5, 8]
Insertion Sort Code in Python
def insertion_sort(arr):
    # Traverse from 1 to len(arr)
    for i in range(1, len(arr)):
        key = arr[i]
        # Move elements of arr[0..i-1] that are greater than key
        # one position ahead of their current position
        j = i - 1
        while j >= 0 and key < arr[j]:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key

# Example usage
arr = [5, 3, 8, 1, 2]
insertion_sort(arr)
print("Sorted array:", arr)  # Output: [1, 2, 3, 5, 8]
Complexity Analysis
- Worst Case: O(n^2) comparisons and shifts. This occurs when the array is in reverse order.
- Best Case: O(n) comparisons and O(1) shifts. This happens when the array is already sorted.
- Average Case: O(n^2) comparisons and shifts. This is the expected number of comparisons and shifts over all possible input sequences.
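A common refinement is to locate each insertion point with binary search. A sketch using the standard library's bisect module (note this version builds a new list rather than shifting in place; the shifts still cost O(n) per element, so the worst case stays O(n^2), but comparisons drop to O(n log n)):

```python
import bisect

def binary_insertion_sort(arr):
    result = []
    for x in arr:
        # insort finds the insertion point by binary search and inserts there,
        # keeping `result` sorted after every step.
        bisect.insort(result, x)
    return result

print(binary_insertion_sort([5, 3, 8, 1, 2]))  # [1, 2, 3, 5, 8]
```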
6. Heap Sort
Heap Sort is an efficient comparison-based sorting algorithm that uses a binary heap data structure. It divides its input into a sorted and an unsorted region and iteratively shrinks the unsorted region by extracting the largest (or smallest) element and moving it to the sorted region.
Algorithm Overview:
- Build a Max Heap: Convert the array into a max heap, a complete binary tree where the value of each node is greater than or equal to the values of its children.
- Heapify: Ensure that the subtree rooted at each node satisfies the max heap property. This process is called heapify.
- Extract Maximum: Swap the root (the maximum element) with the last element of the heap and reduce the heap size by one. Restore the max heap property by heapifying the root.
- Repeat: Continue extracting the maximum element and heapifying until the entire array is sorted.
Example with Visualization
Let's sort the list [5, 3, 8, 1, 2] using heap sort.

Build Max Heap:
- Initial array: [5, 3, 8, 1, 2]
- Start heapifying from the last non-leaf node.
- Heapify at index 1: [5, 3, 8, 1, 2] (no change; children are already less than the parent)
- Heapify at index 0: [8, 3, 5, 1, 2] (swap 5 and 8 to make 8 the root)

Extract Maximum:
- Swap root with the last element: [2, 3, 5, 1, 8]
- Heapify at index 0: [5, 3, 2, 1, 8] (swap 2 and 5)

Repeat Extraction:
- Swap root with the second-to-last element: [1, 3, 2, 5, 8]
- Heapify at index 0: [3, 1, 2, 5, 8] (swap 1 and 3)
- Swap root with the third-to-last element: [2, 1, 3, 5, 8]
- Heapify at index 0: [2, 1, 3, 5, 8] (no change needed)
- Swap root with the fourth-to-last element: [1, 2, 3, 5, 8]

After all extractions, the array is sorted: [1, 2, 3, 5, 8].
Heap Sort Code in Python
def heapify(arr, n, i):
    largest = i  # Initialize largest as root
    left = 2 * i + 1  # left child index
    right = 2 * i + 2  # right child index
    # See if left child of root exists and is greater than root
    if left < n and arr[largest] < arr[left]:
        largest = left
    # See if right child of root exists and is greater than root
    if right < n and arr[largest] < arr[right]:
        largest = right
    # Change root, if needed
    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]  # swap
        # Heapify the root.
        heapify(arr, n, largest)

def heap_sort(arr):
    n = len(arr)
    # Build a maxheap.
    for i in range(n // 2 - 1, -1, -1):
        heapify(arr, n, i)
    # One by one extract elements
    for i in range(n - 1, 0, -1):
        arr[i], arr[0] = arr[0], arr[i]  # swap
        heapify(arr, i, 0)

# Example usage
arr = [5, 3, 8, 1, 2]
heap_sort(arr)
print("Sorted array:", arr)  # Output: [1, 2, 3, 5, 8]
Complexity Analysis
- Worst Case: O(n log n). Building the heap takes O(n) time, and each of the n element extractions takes O(log n) time.
- Best Case: O(n log n). Even if the array is already sorted, heap sort still builds the heap and performs the extractions.
- Average Case: O(n log n). As in the worst case, each extraction re-heapifies in O(log n) time, and this is performed n times.
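For comparison, Python's standard library exposes the same heap machinery via heapq. A min-heap version of heap sort can be sketched as (`heap_sort_with_heapq` is an illustrative name; unlike the in-place version above, it returns a new list):

```python
import heapq

def heap_sort_with_heapq(arr):
    heap = list(arr)
    heapq.heapify(heap)  # O(n): turn the copy into a min-heap
    # n pops, O(log n) each; pops come out in ascending order.
    return [heapq.heappop(heap) for _ in range(len(heap))]

print(heap_sort_with_heapq([5, 3, 8, 1, 2]))  # [1, 2, 3, 5, 8]
```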
7. Radix Sort
Radix Sort is a non-comparative integer sorting algorithm that sorts numbers by processing individual digits. It processes digits from the least significant digit (LSD) to the most significant digit (MSD) or vice versa. This algorithm is efficient for sorting numbers with a fixed number of digits.
Algorithm Overview:
- Digit by Digit sorting: Radix sort processes the digits of the numbers starting from either the least significant digit (LSD) or the most significant digit (MSD). Typically, LSD is used.
- Stable Sort: A stable sorting algorithm like Counting Sort or Bucket Sort is used as an intermediate sorting technique. Radix Sort relies on this stability to maintain the relative order of numbers with the same digit value.
- Multiple passes: The algorithm performs multiple passes over the numbers, one for each digit, from the least significant to the most significant.
Radix Sort Code in Python
def counting_sort(arr, exp):
    # Stable counting sort on the digit at place value `exp`.
    n = len(arr)
    output = [0] * n
    count = [0] * 10
    for i in range(n):
        index = arr[i] // exp
        count[index % 10] += 1
    for i in range(1, 10):
        count[i] += count[i - 1]
    # Walk backwards so equal digits keep their relative order (stability).
    i = n - 1
    while i >= 0:
        index = arr[i] // exp
        output[count[index % 10] - 1] = arr[i]
        count[index % 10] -= 1
        i -= 1
    for i in range(n):
        arr[i] = output[i]

def radix_sort(arr):
    # Assumes a non-empty array of non-negative integers.
    max_num = max(arr)
    exp = 1
    while max_num // exp > 0:
        counting_sort(arr, exp)
        exp *= 10

# Example usage
arr = [170, 45, 75, 90]
print("Original array:", arr)
radix_sort(arr)
print("Sorted array:", arr)
Complexity Analysis
- Time Complexity: O(d * (n + k)) for all cases, where n is the number of elements, k is the digit range (10 for decimal digits), and d is the number of digits in the largest number. Radix Sort always processes each digit of every number in the array.
- Space Complexity: O(n + k). This is due to the space required for:
- The output array used in Counting Sort, which is of size n.
- The count array used in Counting Sort, which is of size k.
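To see where the d passes come from, here is the digit extraction each pass uses, traced on the example array (the `digit` helper is illustrative):

```python
def digit(num, exp):
    # Digit of `num` at place value `exp` (1 = ones, 10 = tens, 100 = hundreds).
    return (num // exp) % 10

arr = [170, 45, 75, 90]   # d = 3 digits in the largest number, so 3 passes
print([digit(x, 1) for x in arr])    # [0, 5, 5, 0]
print([digit(x, 10) for x in arr])   # [7, 4, 7, 9]
print([digit(x, 100) for x in arr])  # [1, 0, 0, 0]
# A stable sort on each key in turn yields:
# after ones:     [170, 90, 45, 75]
# after tens:     [45, 170, 75, 90]
# after hundreds: [45, 75, 90, 170]
```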
8. Counting Sort
Counting sort is a sorting technique for keys within a specific range. It works by counting the number of objects with each distinct key value, then using arithmetic on those counts to calculate the position of each object in the output sequence.
Algorithm Overview:
- Convert the input string into a list of characters.
- Count the occurrence of each character in the list using the collections.Counter() method.
- Sort the keys of the resulting Counter object to get the unique characters in the list in sorted order.
- For each character in the sorted list of keys, create a list of repeated characters using the corresponding count from the Counter object.
- Concatenate the lists of repeated characters to form the sorted output list.
Counting Sort Code in Python using collections.Counter:
from collections import Counter

def counting_sort(arr):
    count = Counter(arr)
    output = []
    for c in sorted(count.keys()):
        output += [c] * count[c]
    return output

arr = "geeksforgeeks"
arr = list(arr)
arr = counting_sort(arr)
output = ''.join(arr)
print("Sorted character array is", output)
Counting Sort Code in Python using sorted() and reduce():
from functools import reduce
string = "geeksforgeeks"
sorted_str = reduce(lambda x, y: x+y, sorted(string))
print("Sorted string:", sorted_str)
Complexity Analysis
- Time Complexity: O(n + k) for all cases. Regardless of how the elements are arranged, the algorithm performs n + k steps: one pass over the elements and one over the key range.
- Space Complexity: O(n + k), where k is the range of key values. The larger the range of elements, the larger the space required.
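The Counter-based code above sorts characters; the classic array-based counting sort for non-negative integers, which the O(n + k) analysis describes directly, can be sketched as (`counting_sort_ints` is an illustrative name):

```python
def counting_sort_ints(arr):
    # Counting sort for non-negative integers: count occurrences,
    # then rebuild the array in key order. k = max(arr) + 1 is the key range.
    if not arr:
        return []
    k = max(arr) + 1
    count = [0] * k
    for x in arr:
        count[x] += 1
    output = []
    for value, c in enumerate(count):
        output.extend([value] * c)
    return output

print(counting_sort_ints([4, 2, 2, 8, 3, 3, 1]))  # [1, 2, 2, 3, 3, 4, 8]
```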
9. Cyclic Sort
Theory
Cyclic Sort is an in-place sorting algorithm that is useful for sorting arrays where the elements are in a known range (e.g., 1 to N). The key idea behind the algorithm is that each number should be placed at its correct index. If we find a number that is not at its correct index, we swap it with the number at its correct index. This process is repeated until every number is at its correct index.
Algorithm
- Iterate over the array from the start to the end.
- For each element, check if it is at its correct index.
- If it is not at its correct index, swap it with the element at its correct index.
- Continue this process until the element at the current index is in its correct position. Move to the next index and repeat the process until the end of the array is reached.
Steps
- Start with the first element.
- Check if it is at the correct index (i.e., if arr[i] == i + 1).
- If it is not, swap it with the element at the index arr[i] - 1.
- Repeat step 2 for the current element until it is at the correct index.
- Move to the next element and repeat the process.
Code
def cyclic_sort(nums):
    i = 0
    while i < len(nums):
        correct_index = nums[i] - 1
        if nums[i] != nums[correct_index]:
            nums[i], nums[correct_index] = nums[correct_index], nums[i]  # Swap
        else:
            i += 1
    return nums
Example
arr = [3, 1, 5, 4, 2]
sorted_arr = cyclic_sort(arr)
print(sorted_arr)
Output
[1, 2, 3, 4, 5]
Complexity Analysis
Time Complexity:
The time complexity of Cyclic Sort is O(n). This is because in each cycle, each element is either placed in its correct position or a swap is made. Since each element is swapped at most once, the total number of swaps (and hence the total number of operations) is linear in the number of elements.
Space Complexity:
The space complexity of Cyclic Sort is O(1). This is because the algorithm only requires a constant amount of additional space beyond the input array.
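The same placement idea is commonly used beyond plain sorting, for instance to find a missing number in O(n) time and O(1) extra space. A sketch, assuming the array holds distinct numbers from the range 0..n with exactly one missing (`find_missing_number` is an illustrative name):

```python
def find_missing_number(nums):
    # Cyclic placement: put each value v at index v (values equal to n
    # have no slot and are skipped), then scan for the mismatched index.
    i, n = 0, len(nums)
    while i < n:
        v = nums[i]
        if v < n and nums[v] != v:
            nums[i], nums[v] = nums[v], nums[i]
        else:
            i += 1
    for i in range(n):
        if nums[i] != i:
            return i
    return n  # every index matched, so n itself is missing

print(find_missing_number([4, 0, 3, 1]))  # 2
```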