Skip to content

RESOURCES / BLOG

What are Heaps in Python and How Do You Use Them?

If you read developer threads about scheduling jobs, top-k queries, or building fast priority queues, you will see heaps come up a lot. Python ships with a lightweight heap implementation that is perfect for many everyday tasks like finding the smallest items, merging sorted streams, or implementing Dijkstra’s algorithm in a few lines.

What are heaps in Python and how do you use them? I see the heapq module mentioned for priority queues, top-k lists, and graph algorithms. How do I push and pop items, handle max-heap behavior, break ties safely, and apply heaps to real-world problems?

A heap is a tree-like data structure that maintains a partial order: in a min-heap the smallest element is always at the root. Python’s standard library provides a binary min-heap in the heapq module, implemented on top of a plain list for speed and simplicity.

import heapq

# Build a heap incrementally
h = []
heapq.heappush(h, 5)
heapq.heappush(h, 1)
heapq.heappush(h, 3)

print(heapq.heappop(h))  # 1 (smallest)
print(h)                  # remaining heap structure

# Turn an existing list into a heap in place
data = [9, 3, 7, 1, 6]
heapq.heapify(data)       # O(n)
print(heapq.heappop(data))  # 1

# Get n smallest or largest without fully sorting
nums = [8, 2, 7, 4, 9, 1, 5]
print(heapq.nsmallest(3, nums))  # [1, 2, 4]
print(heapq.nlargest(2, nums))   # [9, 8]Code language: PHP (php)
  • heappush and heappop run in O(log n)
  • heapify runs in O(n)
  • nsmallest and nlargest are optimized for small n

heapq is a min-heap. To simulate a max-heap, store negated priorities or store tuples where the first element orders as you want.

import heapq

# Max-heap via negation
maxh = []
heapq.heappush(maxh, (-10, "low latency"))
heapq.heappush(maxh, (-5, "throughput"))
priority, task = heapq.heappop(maxh)
print(-priority, task)  # 10 low latencyCode language: PHP (php)

When multiple items share the same priority, include a monotonic counter so the heap has a deterministic ordering and does not try to compare the tasks themselves.

import heapq
from itertools import count

pq = []
counter = count()

def push(pq, priority, item):
    # For a max-heap, negate priority
    entry = (priority, next(counter), item)
    heapq.heappush(pq, entry)

def pop(pq):
    priority, _, item = heapq.heappop(pq)
    return priority, item

push(pq, 2, "resize")
push(pq, 1, "ingest")
push(pq, 2, "analyze")
print(pop(pq))  # (1, 'ingest')
print(pop(pq))  # (2, 'resize') then (2, 'analyze')Code language: PHP (php)

Maintain a fixed-size min-heap of size k. Push new items and pop when size exceeds k. The heap then holds the largest k items.

import heapq

def top_k(iterable, k):
    h = []
    for x in iterable:
        if len(h) < k:
            heapq.heappush(h, x)
        else:
            # If x is bigger than the smallest of the top k, replace it
            if x > h[0]:
                heapq.heapreplace(h, x)
    return sorted(h, reverse=True)

print(top_k([5,1,9,3,7,8,2], 3))  # [9, 8, 7]Code language: PHP (php)

heapq.merge lazily merges multiple sorted iterables without loading everything into memory.

import heapq

a = [1, 4, 7]
b = [2, 5, 6]
c = [0, 3, 8]
for x in heapq.merge(a, b, c):
    print(x)Code language: JavaScript (javascript)
import heapq

def dijkstra(graph, start):
    dist = {start: 0}
    pq = [(0, start)]
    while pq:
        d, u = heapq.heappop(pq)
        if d != dist.get(u, float("inf")):
            continue
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return distCode language: JavaScript (javascript)
  • Do not modify items in place after pushing. Remove and reinsert or push a new entry and mark the old one as invalid if needed.
  • When items are non-comparable, add a counter to avoid TypeError on ties.
  • For max-heap behavior, remember to negate or invert your priority.

In media pipelines you might prioritize work like ingest, analyze, transform, and deliver. A heap-backed priority queue lets you process urgent jobs first while keeping throughput high. For example, schedule time-sensitive transforms before batch work and integrate with Python image processing steps like Python-based analysis or file handling tips from saving images in Python.

If your pipeline includes format conversion or pre-optimization during backlog spikes, you can route tasks that trigger a tool like PNG compression at a lower priority than user-facing requests. 

  • heapq implements a fast min-heap with O(log n) push and pop, O(n) heapify.
  • Use tuples (priority, counter, item) to build stable priority queues.
  • Simulate a max-heap by negating priorities or inverting comparisons.
  • Apply heaps to top-k selection, merging sorted streams, and shortest path algorithms.
  • In media workflows, heaps help prioritize processing tasks alongside Python image work and Cloudinary tooling.

Ready to streamline your media pipeline and deliver optimized assets at scale? Sign up for a free Cloudinary account and start transforming, optimizing, and delivering your content today.

Start Using Cloudinary

Sign up for our free plan and start creating stunning visual experiences in minutes.

Sign Up for Free