Skip to main content

Command Palette

Search for a command to run...

Heaps & Priority Queue in DSA

Learn Heaps & Priority Queues in DSA using Python

Published
6 min read
Heaps & Priority Queue in DSA

A Heap is a specialized tree-based data structure that satisfies the Heap Property and is mainly used to implement Priority Queues and efficient sorting algorithms like Heap Sort.

It is one of the most frequently asked topics in DSA interviews.


What is a Heap?

A heap is a Complete Binary Tree, which means:

  • All levels are completely filled

  • The last level is filled from left to right

Heaps are typically implemented using arrays, not pointer-based trees.


Heap Property

There are two types of heaps:

🔹 Min-Heap

  • Parent node is smaller than its children

  • The smallest element is always at the root

🔹 Max-Heap

  • Parent node is greater than its children

  • The largest element is always at the root

In Python, the heapq module implements a Min-Heap by default.


Array Representation of Heap

For a node at index i (0-based indexing):

  • Left child → 2*i + 1

  • Right child → 2*i + 2

  • Parent → (i - 1) // 2

This is why heaps are implemented efficiently using arrays.


Common Heap Operations & Time Complexity

Operation Time Complexity
Get Min/Max O(1)
Insert O(log n)
Remove Top O(log n)
Heapify (Build Heap) O(n)
Heap Sort O(n log n)

Using Heap in Python (heapq)


1️⃣ Build Min Heap (Heapify)

Time: O(n)
Space: O(1) (in-place)

import heapq

A = [-4, 3, 1, 0, 2, 5, 10, 8, 12, 9]
heapq.heapify(A) #O(n)

print(A)

heapify() converts a list into a valid heap in linear time.

⚡ Interview Insight:
Building a heap using repeated insertion takes O(n log n), but heapify() does it in O(n).


2️⃣ Insert into Heap

Time: O(log n)

heapq.heappush(A, 4)
print(A)

3️⃣ Extract Minimum

Time: O(log n)

min_val = heapq.heappop(A)
print(A, min_val)

4️⃣ Peek Minimum (Without Removing)

Time: O(1)

A[0]

The smallest element is always at index 0.


5️⃣ Heappush + Heappop Combined

Time: O(log n)

heapq.heappushpop(A, 99)
print(A)

More efficient than doing push then pop separately.


Heap Sort in Python

Time: O(n log n)
Space: O(n) (extra list used)

def heapsort(arr):
    heapq.heapify(arr)
    result = []

    while arr:
        result.append(heapq.heappop(arr))

    return result

print(heapsort([1, 3, 5, 7, 9, 2, 4, 6, 8, 0]))

⚠️ Note:
In-place heap sort with O(1) space is possible but more complex to implement manually.


Max Heap in Python

Python does not directly support max-heap.
We simulate it by inserting negative values.

B = [-4, 3, 1, 0, 2, 5, 10, 8, 12, 9]

# Convert to negative
B = [-x for x in B]

heapq.heapify(B)

largest = -heapq.heappop(B)
print(largest)

# Insert 7
heapq.heappush(B, -7)

⚡ Interview Tip:
This technique is very commonly used in competitive programming.


Build Heap from Scratch (Using Insert)

Time: O(n log n)

C = [-5, 4, 2, 1, 7, 0, 3]
heap = []

for x in C:
    heapq.heappush(heap, x)
    print(heap)

Repeated insertion is slower than heapify.


Using Tuples in Heap

Heaps can store tuples.

Python compares tuples element by element.

Example: Frequency-based sorting.

from collections import Counter
import heapq

D = [5, 4, 3, 5, 4, 3, 5, 5, 4]
counter = Counter(D)

heap = []

for key, value in counter.items():
    heapq.heappush(heap, (value, key))

print(heap)

This is used in:

  • Top K Frequent Elements

  • Frequency Sorting

  • Scheduling Problems


Priority Queue vs Heap

A Priority Queue is an abstract data structure.

A Heap is one way to implement it efficiently.

Priority Queue ensures:

  • Highest (or lowest) priority element is removed first.

Median of Data Stream (Two Heaps Technique)

One of the most popular heap-based interview problems is:

Design a data structure that supports adding numbers and finding the median at any time.


Problem Statement

You continuously receive numbers from a data stream.

You must support two operations:

  1. addNum(num) → Add a number

  2. findMedian() → Return the median of all inserted numbers


Why Sorting Every Time Is Not Good?

If you sort the array every time:

  • Insertion → O(1)

  • Sorting → O(n log n)

  • Finding median → O(1)

This is inefficient for large streams.

We need something better.


Optimal Approach: Two Heaps

We use:

  • 🔹 Max Heap → Stores the smaller half of numbers

  • 🔹 Min Heap → Stores the larger half of numbers

Why Two Heaps?

We divide numbers into two balanced halves:

Max Heap (left side)  |  Min Heap (right side)
Smaller numbers       |  Larger numbers

This ensures:

  • Median is either:

    • Top of one heap

    • Or average of tops of both heaps


Important Rules

  1. Size difference between heaps should not exceed 1

  2. All elements in max heap ≤ elements in min heap


Python Implementation

Since Python only provides a Min Heap,

we simulate a Max Heap using negative values.

import heapq

class MedianFinder:

    def __init__(self):
        self.small = []  # Max heap (invert values)
        self.large = []  # Min heap

    def addNum(self, num):

        # Step 1: Push to max heap (invert value)
        heapq.heappush(self.small, -num)

        # Step 2: Ensure ordering property
        if self.small and self.large and (-self.small[0] > self.large[0]):
            value = -heapq.heappop(self.small)
            heapq.heappush(self.large, value)

        # Step 3: Balance sizes
        if len(self.small) > len(self.large) + 1:
            value = -heapq.heappop(self.small)
            heapq.heappush(self.large, value)

        if len(self.large) > len(self.small):
            value = heapq.heappop(self.large)
            heapq.heappush(self.small, -value)

    def findMedian(self):

        if len(self.small) > len(self.large):
            return -self.small[0]

        return (-self.small[0] + self.large[0]) / 2

Time Complexity

Operation Time Complexity
addNum() O(log n)
findMedian() O(1)

This is optimal.


Example

Input Stream:

1 → median = 1
1, 2 → median = 1.5
1, 2, 3 → median = 2
1, 2, 3, 4 → median = 2.5

Heaps internally balance automatically.