
Data Structure Interview Problems — LRU, LFU, Median

Data structure problems, sorting algorithms reference, LRU/LFU Cache, Find Median from Data Stream, hash tables, and trees.

Data structure design problems test your ability to compose primitives (hash maps, linked lists, heaps) into efficient systems — exactly what you do when building caches, memory allocators, and schedulers in embedded firmware.

Sorting Algorithms Reference

| Sort | Best | Worst | Average | Space | Notes |
|------|------|-------|---------|-------|-------|
| Bubble Sort | O(n²) | O(n²) | O(n²) | O(1) | Each pass moves the largest remaining element into place |
| Adaptive Bubble | O(n) | O(n²) | O(n²) | O(1) | Early exit on sorted input |
| Selection Sort | O(n²) | O(n²) | O(n²) | O(1) | Good for small arrays |
| Insertion Sort | O(n) | O(n²) | O(n²) | O(1) | Good on partially sorted data |
| Bucket Sort | O(n+k) | O(n²) | O(n+k) | O(n+k) | Good for uniform distributions |
| Quick Sort | O(n log n) | O(n²) | O(n log n) | O(log n) | Fastest average; use insertion sort for small partitions |
| Merge Sort | O(n log n) | O(n log n) | O(n log n) | O(n) | Stable; preferred for linked lists |
| Heap Sort | O(n log n) | O(n log n) | O(n log n) | O(1) | In-place; fixdown/fixup O(log n), build heap O(n) |
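
The adaptive O(n) best case claimed for Insertion Sort in the table comes from its inner loop exiting immediately when each element is already in place. A minimal sketch:

```cpp
#include <cstddef>
#include <vector>

// Insertion sort: O(n^2) worst case, but O(n) on already-sorted input,
// since the inner while loop exits after one comparison per element.
void insertion_sort(std::vector<int>& a) {
    for (std::size_t i = 1; i < a.size(); ++i) {
        int key = a[i];
        std::size_t j = i;
        while (j > 0 && a[j - 1] > key) {  // shift larger elements right
            a[j] = a[j - 1];
            --j;
        }
        a[j] = key;  // drop key into its slot
    }
}
```

This is also why Quick Sort implementations commonly hand small partitions to insertion sort: those partitions are small and often nearly sorted.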

Problems

| # | Problem | Difficulty |
|---|---------|------------|
| 1 | LRU Cache | Medium |
| 2 | Hash string with collision resolution (linked list chaining, thread safety) | Medium |
| 3 | Custom malloc/free using linked lists | Medium |
| 4 | Binary tree traversal (in-order, pre-order, post-order) | Easy |
| 5 | Insert Delete GetRandom O(1) | Medium |
| 6 | Design Add and Search Words Data Structure | Medium |
| 7 | Find Median from Data Stream | Hard |
| 8 | Dictionary of strings in C (store and search) | Medium |
| 9 | Binary search tree operations (O(n) worst, O(log n) average) | Medium |
| 10 | LFU Cache | Hard |

Implementation

LRU Cache Medium

Problem: Design a Least Recently Used cache supporting get and put operations in O(1) time.


Solution — O(1) time, O(n) space

Key Insight: Combine a hash map for O(1) key lookup with a doubly linked list for O(1) recency tracking — the map finds nodes instantly, and the list maintains access order without shifting elements.

Thought Process:

  1. A naive approach uses a list and linear search — O(n) per get.
  2. Adding a hash map from key to value gives O(1) lookup, but evicting the LRU item still requires scanning for the oldest entry.
  3. A doubly linked list ordered by access time lets you move-to-back on access and evict from front in O(1) — but finding a node to move requires O(n) traversal.
  4. The optimal solution stores (key -> (value, list iterator)) in the hash map, enabling O(1) lookup AND O(1) list manipulation.

Approach: On every access, move the key to the back of the list (most recently used). On eviction, remove from the front (least recently used). The hash map stores (key -> (value, list iterator)) for O(1) list manipulation.

```cpp
class LRUCache {
    // key -> (value, iterator into the recency list)
    unordered_map<int, pair<int, list<int>::iterator>> cache;
    list<int> table;   // front = least recently used, back = most recently used
    int cap;

    void move_to_back(int key) {
        table.erase(cache[key].second);
        table.push_back(key);
        cache[key].second = prev(table.end());
    }
public:
    LRUCache(int capacity) : cap(capacity) {}

    int get(int key) {
        if (cache.find(key) == cache.end()) return -1;
        move_to_back(key);
        return cache[key].first;
    }

    void put(int key, int value) {
        if (cache.find(key) != cache.end()) {
            cache[key].first = value;
            move_to_back(key);
        } else {
            if ((int)cache.size() == cap) {
                int lru = table.front();   // evict the least recently used key
                table.pop_front();
                cache.erase(lru);
            }
            table.push_back(key);
            cache[key] = {value, prev(table.end())};
        }
    }
};
```

Time Complexity: O(1) per get/put — hash map lookup, list erase, and list push_back are all constant time given the stored iterator.

Space Complexity: O(n) where n is the cache capacity — each entry stores a key-value pair in the map plus a node in the doubly linked list.

Embedded relevance: LRU caches are used in embedded systems for DNS resolution caches, sensor data caches, and flash page mapping tables. Understanding this design helps you build bounded-memory caches without heap fragmentation.
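
One design note on move_to_back above: erasing and re-inserting the key frees one list node and allocates another on every access. std::list::splice instead relinks the existing node in O(1), and the stored iterator stays valid, so no map update is needed. A minimal sketch of the alternative (the helper name is illustrative):

```cpp
#include <list>
#include <iterator>

// Relink the node at `it` to the back of `l` in O(1): no allocation or
// deallocation, and `it` stays valid, still pointing at the same element.
void move_node_to_back(std::list<int>& l, std::list<int>::iterator it) {
    l.splice(l.end(), l, it);
}
```

Avoiding the free/alloc pair per access matters in the embedded setting above, where heap churn is exactly what a bounded-memory cache is trying to prevent.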


Insert Delete GetRandom O(1) Medium

Problem: Design a data structure that supports insert, delete, and getRandom operations, each in average O(1) time.


Solution — O(1) time, O(n) space

Key Insight: A hash map gives O(1) insert/delete but no random access; a vector gives O(1) random access but O(n) deletion. Combining both and swapping the target with the last element before popping gives O(1) for all three operations.

Thought Process:

  1. A hash set supports O(1) insert and delete, but getRandom requires converting to a list first — O(n).
  2. A vector supports O(1) getRandom via index, but delete at an arbitrary position is O(n) due to shifting.
  3. The trick: swap the element to delete with the last element, update the map, and pop the back — O(1) deletion without shifting.
  4. Combine vector (for random access) + hash map (value -> index) for O(1) on all three operations.

Approach: Use a vector for O(1) random access and a hash map for O(1) lookup by value. The trick for O(1) deletion: swap the element to delete with the last element in the vector, update the hash map, then pop the back. This avoids shifting elements.

```cpp
class RandomizedSet {
public:
    unordered_map<int, int> mp;  // value -> index in vc
    vector<int> vc;              // live elements occupy vc[0..n-1]
    int n;                       // logical size (vc may hold stale slots past n)

    RandomizedSet() {
        srand(time(NULL));
        n = 0;
    }

    bool insert(int val) {
        if (mp.find(val) == mp.end()) {
            if (n < (int)vc.size()) {
                vc[n] = val;      // reuse a stale slot
            } else {
                vc.push_back(val);
            }
            mp[val] = n;
            n++;
            return true;
        }
        return false;
    }

    bool remove(int val) {
        if (mp.find(val) != mp.end()) {
            vc[mp[val]] = vc[n - 1];   // move the last element into the hole
            mp[vc[n - 1]] = mp[val];   // record its new index
            n--;
            mp.erase(val);
            return true;
        }
        return false;
    }

    int getRandom() {
        if (n < 1) return 0;
        return vc[rand() % n];
    }
};
```

Time Complexity: O(1) average per operation — hash map operations are amortized O(1), appending to the vector is amortized O(1), and the swap-with-last deletion (overwrite the slot, decrement the size counter) is O(1).

Space Complexity: O(n) where n is the number of stored elements — one entry in both the vector and the hash map per element.
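
The swap-with-last trick from step 3 can be shown in isolation. This erase_unordered helper (illustrative, not part of the solution above) keeps a value-to-index map consistent while deleting in O(1):

```cpp
#include <vector>
#include <unordered_map>

// O(1) unordered erase: overwrite the target slot with the last element,
// fix that element's index in the map, then shrink the vector.
void erase_unordered(std::vector<int>& v,
                     std::unordered_map<int, int>& idx, int val) {
    int i = idx[val];
    v[i] = v.back();   // move the last element into the hole
    idx[v[i]] = i;     // its index changed; record it
    v.pop_back();
    idx.erase(val);    // harmless self-updates when val was already last
}
```

The price of O(1) deletion is that element order is not preserved, which is irrelevant here since getRandom treats the vector as an unordered bag.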


Find Median from Data Stream Hard

Problem: Design a data structure that supports adding integers from a stream and finding the median of all elements.


Solution — O(log n) add, O(1) median

Key Insight: Split elements into two halves — a max-heap for the lower half and a min-heap for the upper half. The median is always at the boundary between the two heaps, accessible in O(1) from their tops.

Thought Process:

  1. Brute force: keep a sorted array, insert with binary search — O(n) per add due to shifting, O(1) median.
  2. A balanced BST (like std::set) gives O(log n) insert, but finding the median requires maintaining an iterator — tricky with duplicates.
  3. Two heaps insight: if you keep the lower half in a max-heap and the upper half in a min-heap, and keep them balanced (sizes differ by at most 1), the median is always at one or both tops.
  4. Rebalancing after each insert is just one heap push/pop — O(log n).

Approach: Maintain two heaps: a max-heap for the lower half and a min-heap for the upper half. When adding a number, push to the max-heap first, then rebalance by moving the max-heap's top to the min-heap. If the min-heap becomes larger, move its top back. The median is either the max-heap's top (odd count) or the average of both tops (even count).

```cpp
class MedianFinder {
    priority_queue<int> lo;                             // max-heap (lower half)
    priority_queue<int, vector<int>, greater<int>> hi;  // min-heap (upper half)
public:
    void addNum(int num) {
        lo.push(num);        // route every number through the lower half
        hi.push(lo.top());   // push the lower half's max up to the upper half
        lo.pop();
        if (lo.size() < hi.size()) {   // keep lo at least as large as hi
            lo.push(hi.top());
            hi.pop();
        }
    }

    double findMedian() {
        return lo.size() > hi.size() ? lo.top()
                                     : ((double)lo.top() + hi.top()) * 0.5;
    }
};
```

Time Complexity: O(log n) per addNum — each call performs at most 3 heap operations (push/pop), each O(log n). O(1) per findMedian — just reads heap tops.

Space Complexity: O(n) — all elements are stored across the two heaps.

Embedded relevance: The two-heap technique is used in real-time sensor systems for computing running medians — a robust alternative to moving averages that's immune to outlier spikes from noisy ADC readings.
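
To sanity-check the balancing invariant, the same two-heap scheme can be traced on a tiny stream. This Median struct duplicates the logic above in self-contained form (the struct name is just for this sketch):

```cpp
#include <functional>
#include <queue>
#include <vector>

// Two-heap running median, same balancing rule as MedianFinder above.
struct Median {
    std::priority_queue<int> lo;  // max-heap: lower half
    std::priority_queue<int, std::vector<int>,
                        std::greater<int>> hi;  // min-heap: upper half

    void add(int x) {
        lo.push(x);          // route every element through the lower half
        hi.push(lo.top());   // push the lower half's max up
        lo.pop();
        if (lo.size() < hi.size()) {   // keep lo at least as large as hi
            lo.push(hi.top());
            hi.pop();
        }
    }

    double median() const {
        return lo.size() > hi.size() ? lo.top()
                                     : (lo.top() + hi.top()) / 2.0;
    }
};
```

Feeding the stream 5, 1, 8, 3 yields medians 5, 3, 5, 4: after each add, every element in lo is ≤ every element in hi, and the sizes never differ by more than one.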


LFU Cache Hard

Problem: Design a Least Frequently Used cache supporting get and put operations in O(1) time.


Solution — O(1) time, O(n) space

Key Insight: Use three coordinated hash maps — one for values, one for frequency counts, and one mapping each frequency to an ordered list of keys — plus a single integer tracking the minimum frequency. This gives O(1) access, frequency update, and eviction.

Thought Process:

  1. A min-heap keyed by frequency gives O(log n) eviction but O(log n) frequency updates too.
  2. A single hash map with frequency tracking can find the LFU key, but scanning for the minimum frequency is O(n).
  3. Maintaining a frequency -> key list map with a minFrequency counter avoids scanning — evict from map[minFrequency].
  4. On access, move the key from map[freq] to map[freq+1]; if map[minFrequency] becomes empty, increment minFrequency. All O(1) with list iterators.

Approach: Use three hash maps: (1) key -> value + list iterator, (2) key -> frequency count, (3) frequency -> ordered list of keys. Track the lowest frequency. On access, move the key from its current frequency list to the next. On eviction, remove the least-recently-used key from the lowest-frequency list. This triple-map design achieves O(1) for all operations.

```cpp
typedef int Key_t;
typedef int Count_t;

struct Node {
    int value;
    list<Key_t>::iterator itr;  // position in its frequency list
};

class LFUCache {
    unordered_map<Key_t, Node> m_values;               // key -> value + iterator
    unordered_map<Key_t, Count_t> m_counts;            // key -> use count
    unordered_map<Count_t, list<Key_t>> m_countKeyMap; // count -> keys (MRU at front)
    int m_lowestFrequency;
    int m_maxCapacity;
public:
    LFUCache(int capacity) {
        m_maxCapacity = capacity;
        m_lowestFrequency = 0;
    }

    int get(int key) {
        if (m_values.find(key) == m_values.end() || m_maxCapacity <= 0)
            return -1;
        put(key, m_values[key].value);  // reuse put() to bump the frequency
        return m_values[key].value;
    }

    void put(int key, int value) {
        if (m_maxCapacity <= 0) return;
        if (m_values.find(key) == m_values.end()) {
            if ((int)m_values.size() == m_maxCapacity) {
                // evict the least recently used key in the lowest-frequency list
                int keyToDelete = m_countKeyMap[m_lowestFrequency].back();
                m_countKeyMap[m_lowestFrequency].pop_back();
                if (m_countKeyMap[m_lowestFrequency].empty())
                    m_countKeyMap.erase(m_lowestFrequency);
                m_values.erase(keyToDelete);
                m_counts.erase(keyToDelete);
            }
            // insert the new key at the lowest frequency
            m_values[key].value = value;
            m_counts[key] = 0;
            m_lowestFrequency = 0;
            m_countKeyMap[0].push_front(key);
            m_values[key].itr = m_countKeyMap[0].begin();
        } else {
            // move the key from its current frequency list to the next one
            m_countKeyMap[m_counts[key]].erase(m_values[key].itr);
            if (m_countKeyMap[m_counts[key]].empty()) {
                if (m_lowestFrequency == m_counts[key]) m_lowestFrequency++;
                m_countKeyMap.erase(m_counts[key]);
            }
            m_values[key].value = value;
            m_counts[key]++;
            m_countKeyMap[m_counts[key]].push_front(key);
            m_values[key].itr = m_countKeyMap[m_counts[key]].begin();
        }
    }
};
```

Time Complexity: O(1) per get/put — all operations use hash map lookups and linked list splicing with stored iterators, each constant time.

Space Complexity: O(n) where n is the cache capacity — three hash maps and frequency lists, but total entries are bounded by capacity.
