Python Study
Insertion Sort
Mastering Insertion Sort in Python: From Implementation to Optimization
Introduction
Insertion sort is a simple yet fundamental sorting algorithm that builds the final sorted array one item at a time. While less efficient for large datasets than advanced algorithms like quicksort or merge sort, it excels in simplicity and efficiency for small or nearly sorted data. This article explores the insertion sort algorithm, its implementation in Python, and optimization strategies.
Problem Statement
Given:
- An array of integers
Objective:
- Return a sorted version of that array
Example:
- Input:
[5, 2, 8, 12, 1, 6]
- Output:
[1, 2, 5, 6, 8, 12]
Assumption:
- We'll use the insertion sort algorithm to perform the sort
Strategy and Hypothesis
Insertion sort divides the input into sorted and unsorted subarrays. The sorted subarray initially contains the first element, while the unsorted subarray contains the rest. The algorithm takes an element from the unsorted portion and inserts it into its correct position in the sorted portion.
For a visual explanation: Insertion sort in 2 minutes
Implementation
Initial Attempt
For understanding array slicing: NumPy Array Slicing For learning about list insertion: Python List
insert()
def insertionSort(array):
solution = array[0:1] # note the end index is non-inclusive
temp = array[1:] # note no end index implies to end of the array
while len(temp) > 0:
num = temp.pop()
for idx, value in enumerate(solution):
if num < value:
solution.insert(idx, num)
break
return solution
This initial approach uses additional lists, increasing space complexity.
Improved Approach
def insertionSort(array):
solution = array[0:1]
temp = array[1:]
while len(temp) > 0:
num = temp.pop()
if num >= solution[-1]: # handle append
solution.append(num)
continue
for idx, value in enumerate(solution): # handle insert
if num <= value:
solution.insert(idx, num)
break
return solution
This version handles edge cases like appending larger numbers and preserving duplicates.
Optimization Analysis
Optimizing for Space
The initial implementation used additional lists, which increased space complexity. The optimized solution below performs the sort in place, reducing space complexity to O(1).
Optimizing for Time
Insertion sort's worst-case time complexity is O(n^2). While this can't be improved for insertion sort, the optimized solution below reduces the number of operations by directly shifting elements in place rather than using slicing and the insert
method.
Optimal Space & Time
This solution handles insertion, as discussed in the "Strategy and Hypothesis" section. We'll hold the number in a variable and continually shift the values in the array rightward until the number is greater than the current value or we are at the front of the array.
def insertionSort(array):
for i in range(1, len(array)):
num = array[i]
j = i - 1
while j >= 0 and num < array[j]: # handle insert
array[j + 1] = array[j]
j -= 1
array[j + 1] = num # handle append
return array
Key Takeaways
- Insertion sort is simple to implement but inefficient for large datasets.
- In-place sorting significantly improves space efficiency.
- While worst-case time complexity remains O(n^2), optimizing for nearly sorted arrays can approach O(n) in the best case.
- Insertion sort is particularly efficient for small or nearly sorted datasets.
- Understanding basic sorting algorithms provides a foundation for more complex algorithmic concepts.
Conclusion
While not suitable for large-scale applications, insertion sort remains a valuable algorithm for its simplicity and efficiency in specific scenarios. Through this exploration, we've seen how initial implementations can be iteratively improved for both space and time efficiency. This optimization process is a crucial skill in algorithm design and software development. As you continue your programming journey, remember that even simple algorithms like insertion sort can provide deep insights into the principles of efficient code design and optimization strategies.