Implement a Least Recently Used (LRU) cache and execute a given sequence of SET(key, value) and GET(key) operations. Return the results of all the GET operations from the sequence.
Initially the cache is empty. The cache capacity is part of the input. Keys and values are all positive. GET(key) returns either the cached value or -1 (cache miss). SET(key, value) adds or updates the value for the key. If the cache is at capacity and a new value needs to be added, remove the least recently used (LRU) value first.
Example
Input:
capacity = 2
query_type = [1, 1, 0, 1, 0, 1, 0]
key = [5, 10, 5, 15, 10, 5, 5]
value = [11, 22, 1, 33, 1, 55, 1]
Output: [11, -1, 55]
query_type of 0 means GET and query_type of 1 means SET.
Here are the operations from the input along with the return values and side effects of their execution:
Operation   | Cache after (MRU -> LRU) | Ret. val. | Side effect
SET(5, 11)  | [5->11]                  |           | 5->11 added to cache
SET(10, 22) | [10->22, 5->11]          |           | 5->11 became LRU
GET(5)      | [5->11, 10->22]          | 11        | 10->22 became LRU
SET(15, 33) | [15->33, 5->11]          |           | 10->22 got evicted
GET(10)     | [15->33, 5->11]          | -1        | Access order unchanged
SET(5, 55)  | [5->55, 15->33]          |           | Key 5 updated, became the most recently used (MRU)
GET(5)      | [5->55, 15->33]          | 55        | No change; key 5 already was the MRU
The function accepts four arguments:
1. The cache capacity,
2. query_type array with 0 for GET and 1 for SET operation,
3. key array with the keys for all the operations,
4. value array with the values for SET operations (value to be ignored for GETs).
The three input arrays all have the same length n; together they represent n operations.
Output: Return an array with return values of all the GET operations.
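In C++ (the language the provided solutions use), the interface might look like the declaration below; the function and parameter names are our assumption, as the exact signature is not specified above.

```cpp
#include <vector>

// Hypothetical signature matching the description above. Returns the
// results of all GET operations, in order of execution.
std::vector<int> execute_queries(int capacity,
                                 const std::vector<int>& query_type,
                                 const std::vector<int>& key,
                                 const std::vector<int>& value);
```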
To solve this problem we need to come up with a data structure that allows for efficient implementations of 1) looking up a value by key, 2) adding/updating the value for a key, and 3) evicting the LRU element when the cache is at capacity and an element has to be added.
To evict elements correctly, we always need to know which element is the least recently used (LRU); for that we need to store the complete order in which the elements were accessed (by GET or SET). When an element is looked up, added, or updated, it becomes the most recently used (MRU) and moves to the front end of the order. When we need to evict, we evict the one at the back end.
We either need one structure that efficiently does both "looking up" and "ordering," or one structure for looking up and another for ordering.
1. Dynamic arrays (vector, ArrayList, etc.) take O(n) time to look up and O(n) time to change the order (to move an element from the middle to the front end, O(n) elements have to be shifted by one position). While O(n) is inefficient asymptotically, real-world performance can be good for small enough n due to the compactness of the data structure in memory, which results in CPU cache friendliness (covered in the Search class).
2. A linked list takes O(n) time to look up, but changing the order can be implemented efficiently if we don't have to look up the element first: given a pointer to a node in the middle of a doubly linked list (as well as pointers to its head and tail), we can move that node to the front end of the list in constant time by rearranging pointers (see the sketch right after this list).
3. A min heap, no matter how we build it, takes O(n) to look up an element. But if we use the "last use timestamp" (and not the cache key) as the key to build the heap, then changing the order takes O(log n) and the LRU element is always at the root of the heap, easily accessible for eviction when necessary. The space efficiency and cache friendliness (covered in the Search class) of the array-based heap implementation can make its real-world performance comparable with that of a linked list or other less cache-friendly structures, despite the asymptotic complexity being O(log n) vs. O(1).
4. A hashmap takes O(1) to look up but doesn't help with maintaining the order.
5. A tree-based (ordered) map/dictionary may seem promising at first: one such map can do efficient O(log n) lookups while maintaining ordering. However, in order to use such a map for cache value lookups, it must map the cache key->value, and ordering according to that key is meaningless to us; we only care about the "last use timestamp" ordering. We could use one ordered map for mapping (key->value) and another one for ordering (timestamp->key); then each operation would take O(log n) time. But then why not just use a hashmap and a linked list instead, making each operation O(1)?
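To illustrate the constant-time reordering from point 2, here is a minimal sketch using C++ std::list (a doubly linked list); the function name is ours. std::list::splice relinks the node in place without copying the element or invalidating iterators to it.

```cpp
#include <list>
#include <utility>

// Move the node referenced by `node` to the front of `order` in O(1)
// by relinking pointers; no elements are copied or moved in memory.
void move_to_front(std::list<std::pair<int, int>>& order,
                   std::list<std::pair<int, int>>::iterator node) {
    order.splice(order.begin(), order, node);
}
```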
We provided three solutions. Two use a hashmap+linked list (with two different implementations of the linked list) and one uses a hashmap+heap.
The two linked list based solutions use a hashmap for efficient cache lookups and a doubly linked list to keep track of the access order. One implements the doubly linked list from scratch and the other uses the C++ std::list implementation. In both implementations the linked list simply stores the {key, value} pairs ordered from the MRU to the LRU, and the hashmap maps the cache keys to pointers to the linked list nodes.
SET(key, value) operation:
● Look up the key in the hashmap.
● If found, we have a pointer to the linked list node.
○ Update the value in that node.
○ Then move the node from its current position in the list to the front, so it becomes the MRU.
● If not found:
○ If (and only if) the cache is at capacity, access the tail of the linked list (the LRU element) and delete it both from the list and from the hashmap.
○ Then add the {key, value} to the front of the list and to the hashmap.
GET(key) operation:
● Look up the key in the hashmap.
● If not found, return -1.
● If found, move the linked list node to the front (the hashmap still has the pointer to that node object; nothing needs to change there). Return the value found in the node.
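Here is a minimal sketch of this design, using std::list and std::unordered_map; the class and member names are our own, and the driver implements the hypothetical execute_queries signature from above. The provided solutions may differ in details.

```cpp
#include <list>
#include <unordered_map>
#include <utility>
#include <vector>

class LruCache {
public:
    explicit LruCache(int capacity) : capacity_(capacity) {}

    int get(int key) {
        auto it = index_.find(key);
        if (it == index_.end()) return -1;                  // cache miss
        order_.splice(order_.begin(), order_, it->second);  // node becomes MRU, O(1)
        return it->second->second;
    }

    void set(int key, int value) {
        auto it = index_.find(key);
        if (it != index_.end()) {                           // key already cached
            it->second->second = value;                     // update the value
            order_.splice(order_.begin(), order_, it->second);  // node becomes MRU
            return;
        }
        if ((int)index_.size() == capacity_) {              // at capacity: evict LRU
            index_.erase(order_.back().first);              // tail of the list is LRU
            order_.pop_back();
        }
        order_.emplace_front(key, value);                   // new entry becomes MRU
        index_[key] = order_.begin();
    }

private:
    int capacity_;
    std::list<std::pair<int, int>> order_;  // {key, value}, MRU at the front
    std::unordered_map<int, std::list<std::pair<int, int>>::iterator> index_;
};

std::vector<int> execute_queries(int capacity,
                                 const std::vector<int>& query_type,
                                 const std::vector<int>& key,
                                 const std::vector<int>& value) {
    LruCache cache(capacity);
    std::vector<int> results;
    for (size_t i = 0; i < query_type.size(); ++i) {
        if (query_type[i] == 0) results.push_back(cache.get(key[i]));
        else cache.set(key[i], value[i]);
    }
    return results;
}
```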
Time complexity: O(n).
Each operation, GET or SET, takes constant time, and there are n operations.
Auxiliary space used: O(min(capacity, number of SET operations)) = O(min(capacity, n)).
Each SET operation increases the size of the hashmap and the list, until we are at capacity. On the other hand, if capacity > n, the capacity won't ever be reached, as we only allocate space for actually added cache entries.
Total space complexity: O(n).
Input takes O(n) space.
Auxiliary space used is O(min(capacity, n)).
Output takes O(number of GET operations) = O(n) space.
Summing up the three we get the total space complexity:
O(n) + O(min(capacity, n)) + O(n) = O(n).
This solution also uses a hashmap for efficient cache lookups. For keeping track of the access order we use a simple integer counter as a timestamp; we build and maintain a heap with that timestamp as the key. The C++ STL heap implementation (std::push_heap, std::pop_heap) is a max heap by default, and we need the LRU element to be at the root of the heap; to achieve that, we decrement the timestamp on every GET or SET operation instead of incrementing it. That way the LRU element has the greatest timestamp of all, so it ends up at the root of the max heap.
The heap stores {timestamp, key} pairs and the hashmap stores (key -> {timestamp, value}).
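Here is a minimal sketch of this design, again with names of our own choosing. A GET or an update of an existing key only refreshes the timestamp in the hashmap, leaving the heap entry stale; the eviction loop detects stale entries by comparing the popped timestamp with the one in the hashmap and re-pushes them with the fresh timestamp until the true LRU surfaces at the root.

```cpp
#include <algorithm>
#include <unordered_map>
#include <utility>
#include <vector>

class LruCacheHeap {
public:
    explicit LruCacheHeap(int capacity) : capacity_(capacity) {}

    int get(int key) {
        auto it = map_.find(key);
        if (it == map_.end()) return -1;  // cache miss
        it->second.first = --timestamp_;  // refresh in the map; heap entry goes stale
        return it->second.second;
    }

    void set(int key, int value) {
        auto it = map_.find(key);
        if (it != map_.end()) {           // update in place; heap entry goes stale
            it->second = {--timestamp_, value};
            return;
        }
        if ((int)map_.size() == capacity_) evict_lru();
        map_[key] = {--timestamp_, value};
        heap_.emplace_back(timestamp_, key);
        std::push_heap(heap_.begin(), heap_.end());
    }

private:
    void evict_lru() {
        while (true) {
            std::pop_heap(heap_.begin(), heap_.end());  // move the root to the back
            auto [ts, key] = heap_.back();
            heap_.pop_back();
            if (map_[key].first == ts) {  // timestamps match: the true LRU
                map_.erase(key);
                return;
            }
            // Stale entry: re-push it with the fresh timestamp from the map.
            heap_.emplace_back(map_[key].first, key);
            std::push_heap(heap_.begin(), heap_.end());
        }
    }

    int capacity_;
    long long timestamp_ = 0;                      // decremented on every operation
    std::vector<std::pair<long long, int>> heap_;  // {timestamp, key}, max heap
    std::unordered_map<int, std::pair<long long, int>> map_;  // key -> {ts, value}
};
```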
Time complexity: O(n * log(capacity)).
Per GET/SET operation, this solution executes a constant amortized number of insert/delete operations on the max heap (O(log(capacity)) each; the eviction loop's extra iterations are paid for by the earlier operations that made those heap entries stale) and a constant number of lookup/insert/update/delete operations on the hashmap (constant time each). Given that we have a total of n GET/SET operations, the overall time complexity is n * O(log(capacity)) = O(n * log(capacity)).
Here is one example that may help in understanding this solution and its time complexity. It is the worst case for n = 15:
capacity = 7
query_type = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
key = [1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 8]
value = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
First 7 SET operations add 7 key-value pairs to the cache. After that the cache is at capacity.
Next 7 GET operations access every cache entry, so the timestamps in the hashmap (but not in the heap) change.
The last operation, SET(8, 15), adds a new key-value pair and needs to evict the LRU element from the cache. The while(true) loop (see the sketch above) will run 7 times, each run taking O(log(capacity)) = O(log 7) time.
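For reference, this worst case can be traced through the sketch above with a small driver like the following (hypothetical, for illustration only):

```cpp
#include <iostream>

int main() {
    LruCacheHeap cache(7);
    for (int k = 1; k <= 7; ++k) cache.set(k, k);  // fill the cache to capacity
    for (int k = 1; k <= 7; ++k)
        std::cout << cache.get(k) << ' ';          // prints 1..7; heap goes stale
    cache.set(8, 15);  // evicts key 1, cycling through 7 stale heap entries
    return 0;
}
```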
Auxiliary space used: O(min(capacity, number of SET operations)) = O(min(capacity, n)).
Each SET operation increases the size of the hashmap and the heap, until we are at capacity. On the other hand, if capacity > n, the capacity won't ever be reached, as we only allocate space for actually added cache entries.
Total space complexity: O(n).
Input takes O(n) space.
Auxiliary space used is O(min(capacity, n)).
Output takes O(number of GET operations) = O(n) space.
Summing up the three we get the total space complexity:
O(n) + O(min(capacity, n)) + O(n) = O(n).
As mentioned above, the heap based solution will have a running time comparable to that of the linked list based solution on modern CPUs, despite having the asymptotic time complexity of O(n * log(capacity)) vs. O(n). That is especially true for small enough capacity and n, such as when the whole data structure fits in the CPU caches.