Binary Search
A much faster search — but the list must be sorted first.
Binary search is a clever way to find an item by repeatedly halving the part of the list you are still looking in. It is far faster than a linear search on big lists — but it only works if the list is sorted first. Take your time with this one; it rewards a careful read.
The idea: halve the problem each time
Think about finding a word in a dictionary. You do not start at page 1 and read every word. You open it near the middle, see whether your word comes before or after, and throw away the half it cannot be in. You repeat that until you land on it. Binary search does exactly this.
The 'higher or lower' guessing game is the same trick: guess the middle number, find out if the answer is higher or lower, and you have instantly ruled out half the possibilities.
First, sort the list with sorted()
Because binary search needs order, you usually sort the list first. Python's sorted() function takes a list and returns a new list with the items in order, smallest to largest.
scores = [56, 12, 91, 23, 8]
ordered = sorted(scores)
print(ordered)
print(scores)[8, 12, 23, 56, 91] [56, 12, 91, 23, 8]
sorted()returns a brand-new list with the values in ascending order.- It does not change the original —
scoresis still in its first order afterwards. - Store the result in a variable (here
ordered) so you can use the sorted version.
sorted() works on text too, putting strings into alphabetical order.
names = ["Cara", "Ava", "Ben"]
print(sorted(names))['Ava', 'Ben', 'Cara']
sorted(list) hands back a new sorted list and leaves the original alone. There is also a .sort() method that re-orders the original list in place. For now, sorted() is the clearer choice.How binary search works
Binary search keeps track of the part of the list it is still searching, using two markers: low (the first index) and high (the last index). Each pass it checks the middle item.
- Find the middle index:
mid = (low + high) // 2—//keeps it a whole number. - If the middle item equals the target, you have found it — stop and return
mid. - If the target is bigger than the middle item, it must be in the right half, so move
lowtomid + 1. - If the target is smaller, it must be in the left half, so move
hightomid - 1. - Repeat while
low <= high. Iflowever passeshigh, the item is not there.
A worked example
Let us search this sorted list of 10 numbers for the target 23. The indexes run from 0 to 9.
numbers = [2, 5, 8, 12, 16, 23, 38, 56, 72, 91]
# 0 1 2 3 4 5 6 7 8 9
target = 23Follow the markers as the search narrows. low and high close in until mid lands on the target:
| Step | low | high | mid | numbers[mid] | What happens |
|---|---|---|---|---|---|
| 1 | 0 | 9 | 4 | 16 | 16 < 23 → search the right half, set low = 5 |
| 2 | 5 | 9 | 7 | 56 | 56 > 23 → search the left half, set high = 6 |
| 3 | 5 | 6 | 5 | 23 | 23 = 23 → found at index 5 |
It found 23 in just three checks. A linear search would have checked items 0, 1, 2, 3, 4 and 5 — six checks — to reach the same place.
Binary search in Python
def binary_search(items, target):
low = 0
high = len(items) - 1
while low <= high:
mid = (low + high) // 2
if items[mid] == target:
return mid
elif items[mid] < target:
low = mid + 1
else:
high = mid - 1
return -1
numbers = [2, 5, 8, 12, 16, 23, 38, 56, 72, 91]
print(binary_search(numbers, 23))
print(binary_search(numbers, 40))5 -1
lowandhighstart at the two ends of the list.- The loop runs while there is still a section left to search (
low <= high). - Each pass works out
midand compares the middle item with the target. - Finding the target returns its index straight away;
returnends the function. - 40 is not in the list, so
loweventually passeshigh, the loop ends and it returns-1.
Sort, then search
In real use the data may not arrive in order, so sort it first, then search the sorted version.
data = [56, 12, 91, 23, 8]
data = sorted(data)
print(data)
print(binary_search(data, 23))[8, 12, 23, 56, 91] 2
sorted()puts the data in order:[8, 12, 23, 56, 91].- 23 now sits at index 2 in the sorted list, so that is what is returned.
- Always search the sorted list — the index refers to that order, not the original.
Binary search vs linear search
Both find an item in a list, but they work very differently and suit different situations.
| Linear search | Binary search | |
|---|---|---|
| Works on | Any list | Sorted lists only |
| Method | Check each item from the start | Halve the search area each time |
| Worst case (n items) | Up to n checks | About log₂n checks |
| Speed on big lists | Slow | Very fast |
The difference grows enormous as the list gets bigger. Each extra check by binary search can cover roughly double the items:
| List size | Linear (worst case) | Binary (worst case) |
|---|---|---|
| 10 items | 10 checks | 4 checks |
| 1,000 items | 1,000 checks | 10 checks |
| 1,000,000 items | 1,000,000 checks | 20 checks |
What you have learned
- Binary search repeatedly halves the search area, so it is very fast on large lists.
- It only works on a sorted list — use
sorted()to put items in order first. - It tracks the section left to search with
lowandhigh, checking the middle each time. - It returns the index if found, or
-1iflowpasseshigh. - Linear search works on any list but is slower; binary search is faster but needs sorting.