Why is fast lookup possible for dict.items()?

Question

Why is fast lookup possible for dict.items()?

74 Views Asked by Rajdeep Sindhu At 17 March 2024 at 16:53

Suppose we have a dictionary d defined as follows

d = {i:i for i in range(1,1000001)}

And then we store d.items() in x. So, x is a collection of tuples that contain key-value pairs for each element of d. I made a list with the same tuples, defined as follows :

l = [(i, i) for i in range(1,1000001)]

Now, I compared the time taken to execute (1000001, 1000001) in x and (1000001, 1000001) in l. The time difference was very significant. I assumed that this was because x had a set-like behavior and implemented hash tables on tuples, as tuples are hashable as well.

However, tuples that contain lists are non hashable, so I tried the same thing with this dictionary :

d = {i:[i,i+1] for i in range(1,1000001)}

Unlike what I expected, it still enabled fast lookup.

How does this work?

PS : Here's my code for time comparison

import time

def for_dict_items(n):
    #d = {i:i for i in range(1, n+1)}
    d = {i:[i, i+1] for i in range(1, n+1)}
    di = d.items()
    st = time.time()
    x = (n, [n, n+1]) in di
    et = time.time()
    return (et - st)


def for_tuples_list(n):
    #l = [(i, i) for i in range(1, n+1)]
    l = [(i,[i, i+1]) for i in range(1, n+1)]
    st = time.time()
    x = (n, [n, n+1]) in l
    et = time.time()
    return (et - st)


k = 1000000

t1 = for_dict_items(k)
t2 = for_tuples_list(k)

print(t1, t2, t2/t1, sep = "\n")

Original Q&A

There are 1 best solutions below

**STerliakov** · Accepted Answer · 2024-03-17T17:21:22.503000

Since dict is one of the most important core objects, it's implemented in C, as well as dict_values, dict_keys and dict_items. Here's the CPython implementation of dictitems_contains:

static int
dictitems_contains(PyObject *self, PyObject *obj)
{
    _PyDictViewObject *dv = (_PyDictViewObject *)self;
    int result;
    PyObject *key, *value, *found;
    if (dv->dv_dict == NULL)
        return 0;
    if (!PyTuple_Check(obj) || PyTuple_GET_SIZE(obj) != 2)
        return 0;
    key = PyTuple_GET_ITEM(obj, 0);
    value = PyTuple_GET_ITEM(obj, 1);
    result = PyDict_GetItemRef((PyObject *)dv->dv_dict, key, &found);
    if (result == 1) {
        result = PyObject_RichCompareBool(found, value, Py_EQ);
        Py_DECREF(found);
    }
    return result;
}

If you don't speak C, here's what it does:

Check that view is correct - namely that it has a reference to the underlying dictionary (and return False if not, for some reason)
Check the input: if it's not a 2-tuple, return False.
Look up a key in the underlying dictionary.
If found, check if value matches. If it does, return True.
Otherwise, return False.

The core of this is the fact that dict.items returns not a list, but a special dict_items view. The term view here means that it does not represent a copy, but only a proxy to retrieve underlying items in some specific way. The same applies to dict_keys and dict_values, you can read more in the excellent documentation page on this topic.

Note that this is specific to CPython implementation and may be wrong for other implementations.

Why is fast lookup possible for dict.items()?

There are 1 best solutions below

Related Questions in PYTHON

Related Questions in DICTIONARY

Related Questions in HASHTABLE

Trending Questions

Popular # Hahtags

Popular Questions