
Data Structure Interview Problems — LRU, LFU, Median

Data structure problems, sorting algorithms reference, LRU/LFU Cache, Find Median from Data Stream, hash tables, and trees.

Data structure design problems test your ability to compose primitives (hash maps, linked lists, heaps) into efficient systems — exactly what you do when building caches, memory allocators, and schedulers in embedded firmware.

Sorting Algorithms Reference

| Sort | Best | Worst | Average | Space | Notes |
|------|------|-------|---------|-------|-------|
| Bubble Sort | O(n²) | O(n²) | O(n²) | O(1) | Each pass moves the largest remaining element into place |
| Adaptive Bubble | O(n) | O(n²) | O(n²) | O(1) | Early exit on sorted input |
| Selection Sort | O(n²) | O(n²) | O(n²) | O(1) | Good for small arrays |
| Insertion Sort | O(n) | O(n²) | O(n²) | O(1) | Good on partially sorted data |
| Bucket Sort | O(n+k) | O(n²) | O(n+k) | O(n+k) | Good for uniform distributions |
| Quick Sort | O(n log n) | O(n²) | O(n log n) | O(log n) | Fastest average; use insertion sort for small partitions |
| Merge Sort | O(n log n) | O(n log n) | O(n log n) | O(n) | Stable; preferred for linked lists |
| Heap Sort | O(n log n) | O(n log n) | O(n log n) | O(1) | In-place; fixdown/fixup O(log n), build heap O(n) |
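
The adaptive O(n) best case claimed for Insertion Sort in the table comes from its inner loop exiting immediately when each element is already in place. A minimal sketch:

```cpp
#include <cstddef>
#include <vector>

// Insertion sort: O(n^2) worst case, but O(n) on already-sorted input,
// since the inner while loop exits after one comparison per element.
void insertion_sort(std::vector<int>& a) {
    for (std::size_t i = 1; i < a.size(); ++i) {
        int key = a[i];
        std::size_t j = i;
        while (j > 0 && a[j - 1] > key) {  // shift larger elements right
            a[j] = a[j - 1];
            --j;
        }
        a[j] = key;  // drop key into its slot
    }
}
```

This is also why Quick Sort implementations commonly hand small partitions to insertion sort: those partitions are small and often nearly sorted.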

Problems

| # | Problem | Difficulty |
|---|---------|------------|
| 1 | LRU Cache | Medium |
| 2 | Hash string with collision resolution (linked list chaining, thread safety) | Medium |
| 3 | Custom malloc/free using linked lists | Medium |
| 4 | Binary tree traversal (in-order, pre-order, post-order) | Easy |
| 5 | Insert Delete GetRandom O(1) | Medium |
| 6 | Design Add and Search Words Data Structure | Medium |
| 7 | Find Median from Data Stream | Hard |
| 8 | Dictionary of strings in C (store and search) | Medium |
| 9 | Binary search tree operations (O(n) worst, O(log n) average) | Medium |
| 10 | LFU Cache | Hard |

Implementation

LRU Cache Medium

Problem: Design a Least Recently Used cache supporting get and put operations in O(1) time.


Solution — O(1) time, O(n) space

Key Insight: Combine a hash map for O(1) key lookup with a doubly linked list for O(1) recency tracking — the map finds nodes instantly, and the list maintains access order without shifting elements.

Thought Process:

  1. A naive approach uses a list and linear search — O(n) per get.
  2. Adding a hash map from key to value gives O(1) lookup, but evicting the LRU item still requires scanning for the oldest entry.
  3. A doubly linked list ordered by access time lets you move-to-back on access and evict from front in O(1) — but finding a node to move requires O(n) traversal.
  4. The optimal solution stores (key -> (value, list iterator)) in the hash map, enabling O(1) lookup AND O(1) list manipulation.

Approach: On every access, move the key to the back of the list (most recently used). On eviction, remove from the front (least recently used). The hash map stores (key -> (value, list iterator)) for O(1) list manipulation.

```cpp
class LRUCache {
    // key -> (value, iterator into the recency list)
    unordered_map<int, pair<int, list<int>::iterator>> cache;
    list<int> table;   // front = least recently used, back = most recently used
    int cap;

    void move_to_back(int key) {
        table.erase(cache[key].second);
        table.push_back(key);
        cache[key].second = prev(table.end());
    }
public:
    LRUCache(int capacity) : cap(capacity) {}

    int get(int key) {
        if (cache.find(key) == cache.end()) return -1;
        move_to_back(key);
        return cache[key].first;
    }

    void put(int key, int value) {
        if (cache.find(key) != cache.end()) {
            cache[key].first = value;
            move_to_back(key);
        } else {
            if ((int)cache.size() == cap) {
                int lru = table.front();   // evict the least recently used key
                table.pop_front();
                cache.erase(lru);
            }
            table.push_back(key);
            cache[key] = {value, prev(table.end())};
        }
    }
};
```

Time Complexity: O(1) per get/put — hash map lookup, list erase, and list push_back are all constant time given the stored iterator.

Space Complexity: O(n) where n is the cache capacity — each entry stores a key-value pair in the map plus a node in the doubly linked list.

Embedded relevance: LRU caches are used in embedded systems for DNS resolution caches, sensor data caches, and flash page mapping tables. Understanding this design helps you build bounded-memory caches without heap fragmentation.
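
One design note on move_to_back above: erasing and re-inserting the key frees one list node and allocates another on every access. std::list::splice instead relinks the existing node in O(1), and the stored iterator stays valid, so no map update is needed. A minimal sketch of the alternative (the helper name is illustrative):

```cpp
#include <list>
#include <iterator>

// Relink the node at `it` to the back of `l` in O(1): no allocation or
// deallocation, and `it` stays valid, still pointing at the same element.
void move_node_to_back(std::list<int>& l, std::list<int>::iterator it) {
    l.splice(l.end(), l, it);
}
```

Avoiding the free/alloc pair per access matters in the embedded setting above, where heap churn is exactly what a bounded-memory cache is trying to prevent.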


Insert Delete GetRandom O(1) Medium

Problem: Design a data structure that supports insert, delete, and getRandom operations, each in average O(1) time.


Solution — O(1) time, O(n) space

Key Insight: A hash map gives O(1) insert/delete but no random access; a vector gives O(1) random access but O(n) deletion. Combining both and swapping the target with the last element before popping gives O(1) for all three operations.

Thought Process:

  1. A hash set supports O(1) insert and delete, but getRandom requires converting to a list first — O(n).
  2. A vector supports O(1) getRandom via index, but delete at an arbitrary position is O(n) due to shifting.
  3. The trick: swap the element to delete with the last element, update the map, and pop the back — O(1) deletion without shifting.
  4. Combine vector (for random access) + hash map (value -> index) for O(1) on all three operations.

Approach: Use a vector for O(1) random access and a hash map for O(1) lookup by value. The trick for O(1) deletion: swap the element to delete with the last element in the vector, update the hash map, then pop the back. This avoids shifting elements.

```cpp
class RandomizedSet {
public:
    unordered_map<int, int> mp;  // value -> index in vc
    vector<int> vc;              // live elements occupy vc[0..n-1]
    int n;                       // logical size (vc may hold stale slots past n)

    RandomizedSet() {
        srand(time(NULL));
        n = 0;
    }

    bool insert(int val) {
        if (mp.find(val) == mp.end()) {
            if (n < (int)vc.size()) {
                vc[n] = val;      // reuse a stale slot
            } else {
                vc.push_back(val);
            }
            mp[val] = n;
            n++;
            return true;
        }
        return false;
    }

    bool remove(int val) {
        if (mp.find(val) != mp.end()) {
            vc[mp[val]] = vc[n - 1];   // move the last element into the hole
            mp[vc[n - 1]] = mp[val];   // record its new index
            n--;
            mp.erase(val);
            return true;
        }
        return false;
    }

    int getRandom() {
        if (n < 1) return 0;
        return vc[rand() % n];
    }
};
```

Time Complexity: O(1) average per operation — hash map operations are amortized O(1), appending to the vector is amortized O(1), and the swap-with-last deletion (overwrite the slot, decrement the size counter) is O(1).

Space Complexity: O(n) where n is the number of stored elements — one entry in both the vector and the hash map per element.
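
The swap-with-last trick from step 3 can be shown in isolation. This erase_unordered helper (illustrative, not part of the solution above) keeps a value-to-index map consistent while deleting in O(1):

```cpp
#include <vector>
#include <unordered_map>

// O(1) unordered erase: overwrite the target slot with the last element,
// fix that element's index in the map, then shrink the vector.
void erase_unordered(std::vector<int>& v,
                     std::unordered_map<int, int>& idx, int val) {
    int i = idx[val];
    v[i] = v.back();   // move the last element into the hole
    idx[v[i]] = i;     // its index changed; record it
    v.pop_back();
    idx.erase(val);    // harmless self-updates when val was already last
}
```

The price of O(1) deletion is that element order is not preserved, which is irrelevant here since getRandom treats the vector as an unordered bag.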


Find Median from Data Stream Hard

Problem: Design a data structure that supports adding integers from a stream and finding the median of all elements.


Solution — O(log n) add, O(1) median

Key Insight: Split elements into two halves — a max-heap for the lower half and a min-heap for the upper half. The median is always at the boundary between the two heaps, accessible in O(1) from their tops.

Thought Process:

  1. Brute force: keep a sorted array, insert with binary search — O(n) per add due to shifting, O(1) median.
  2. A balanced BST (like std::set) gives O(log n) insert, but finding the median requires maintaining an iterator — tricky with duplicates.
  3. Two heaps insight: if you keep the lower half in a max-heap and the upper half in a min-heap, and keep them balanced (sizes differ by at most 1), the median is always at one or both tops.
  4. Rebalancing after each insert is just one heap push/pop — O(log n).

Approach: Maintain two heaps: a max-heap for the lower half and a min-heap for the upper half. When adding a number, push to the max-heap first, then rebalance by moving the max-heap's top to the min-heap. If the min-heap becomes larger, move its top back. The median is either the max-heap's top (odd count) or the average of both tops (even count).

```cpp
class MedianFinder {
    priority_queue<int> lo;                             // max-heap (lower half)
    priority_queue<int, vector<int>, greater<int>> hi;  // min-heap (upper half)
public:
    void addNum(int num) {
        lo.push(num);        // route every number through the lower half
        hi.push(lo.top());   // push the lower half's max up to the upper half
        lo.pop();
        if (lo.size() < hi.size()) {   // keep lo at least as large as hi
            lo.push(hi.top());
            hi.pop();
        }
    }

    double findMedian() {
        return lo.size() > hi.size() ? lo.top()
                                     : ((double)lo.top() + hi.top()) * 0.5;
    }
};
```

Time Complexity: O(log n) per addNum — each call performs at most 3 heap operations (push/pop), each O(log n). O(1) per findMedian — just reads heap tops.

Space Complexity: O(n) — all elements are stored across the two heaps.

Embedded relevance: The two-heap technique is used in real-time sensor systems for computing running medians — a robust alternative to moving averages that's immune to outlier spikes from noisy ADC readings.
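
To sanity-check the balancing invariant, the same two-heap scheme can be traced on a tiny stream. This Median struct duplicates the logic above in self-contained form (the struct name is just for this sketch):

```cpp
#include <functional>
#include <queue>
#include <vector>

// Two-heap running median, same balancing rule as MedianFinder above.
struct Median {
    std::priority_queue<int> lo;  // max-heap: lower half
    std::priority_queue<int, std::vector<int>,
                        std::greater<int>> hi;  // min-heap: upper half

    void add(int x) {
        lo.push(x);          // route every element through the lower half
        hi.push(lo.top());   // push the lower half's max up
        lo.pop();
        if (lo.size() < hi.size()) {   // keep lo at least as large as hi
            lo.push(hi.top());
            hi.pop();
        }
    }

    double median() const {
        return lo.size() > hi.size() ? lo.top()
                                     : (lo.top() + hi.top()) / 2.0;
    }
};
```

Feeding the stream 5, 1, 8, 3 yields medians 5, 3, 5, 4: after each add, every element in lo is ≤ every element in hi, and the sizes never differ by more than one.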


LFU Cache Hard

Problem: Design a Least Frequently Used cache supporting get and put operations in O(1) time.


Solution — O(1) time, O(n) space

Key Insight: Use three coordinated hash maps — one for values, one for frequency counts, and one mapping each frequency to an ordered list of keys — plus a single integer tracking the minimum frequency. This gives O(1) access, frequency update, and eviction.

Thought Process:

  1. A min-heap keyed by frequency gives O(log n) eviction but O(log n) frequency updates too.
  2. A single hash map with frequency tracking can find the LFU key, but scanning for the minimum frequency is O(n).
  3. Maintaining a frequency -> key list map with a minFrequency counter avoids scanning — evict from map[minFrequency].
  4. On access, move the key from map[freq] to map[freq+1]; if map[minFrequency] becomes empty, increment minFrequency. All O(1) with list iterators.

Approach: Use three hash maps: (1) key -> value + list iterator, (2) key -> frequency count, (3) frequency -> ordered list of keys. Track the lowest frequency. On access, move the key from its current frequency list to the next. On eviction, remove the least-recently-used key from the lowest-frequency list. This triple-map design achieves O(1) for all operations.

```cpp
typedef int Key_t;
typedef int Count_t;

struct Node {
    int value;
    list<Key_t>::iterator itr;  // position in its frequency list
};

class LFUCache {
    unordered_map<Key_t, Node> m_values;               // key -> value + iterator
    unordered_map<Key_t, Count_t> m_counts;            // key -> use count
    unordered_map<Count_t, list<Key_t>> m_countKeyMap; // count -> keys (MRU at front)
    int m_lowestFrequency;
    int m_maxCapacity;
public:
    LFUCache(int capacity) {
        m_maxCapacity = capacity;
        m_lowestFrequency = 0;
    }

    int get(int key) {
        if (m_values.find(key) == m_values.end() || m_maxCapacity <= 0)
            return -1;
        put(key, m_values[key].value);  // reuse put() to bump the frequency
        return m_values[key].value;
    }

    void put(int key, int value) {
        if (m_maxCapacity <= 0) return;
        if (m_values.find(key) == m_values.end()) {
            if ((int)m_values.size() == m_maxCapacity) {
                // evict the least recently used key in the lowest-frequency list
                int keyToDelete = m_countKeyMap[m_lowestFrequency].back();
                m_countKeyMap[m_lowestFrequency].pop_back();
                if (m_countKeyMap[m_lowestFrequency].empty())
                    m_countKeyMap.erase(m_lowestFrequency);
                m_values.erase(keyToDelete);
                m_counts.erase(keyToDelete);
            }
            // insert the new key at the lowest frequency
            m_values[key].value = value;
            m_counts[key] = 0;
            m_lowestFrequency = 0;
            m_countKeyMap[0].push_front(key);
            m_values[key].itr = m_countKeyMap[0].begin();
        } else {
            // move the key from its current frequency list to the next one
            m_countKeyMap[m_counts[key]].erase(m_values[key].itr);
            if (m_countKeyMap[m_counts[key]].empty()) {
                if (m_lowestFrequency == m_counts[key]) m_lowestFrequency++;
                m_countKeyMap.erase(m_counts[key]);
            }
            m_values[key].value = value;
            m_counts[key]++;
            m_countKeyMap[m_counts[key]].push_front(key);
            m_values[key].itr = m_countKeyMap[m_counts[key]].begin();
        }
    }
};
```

Time Complexity: O(1) per get/put — all operations use hash map lookups and linked list splicing with stored iterators, each constant time.

Space Complexity: O(n) where n is the cache capacity — three hash maps and frequency lists, but total entries are bounded by capacity.
