We all know that python dict.get() has O(1) time complexity because it need to get a hash of the key and get value, corresponding to it's key. The question is, what magic can actually access the value, corresponding to the hash? Looking for the answer I went to do_lookup method in CPython and found out the interpreter iterates over all buckets to find the needed hash.
do_lookup(PyDictObject *mp, PyDictKeysObject *dk, PyObject *key, Py_hash_t hash,
Py_ssize_t (*check_lookup)(PyDictObject *, PyDictKeysObject *, void *, Py_ssize_t ix, PyObject *key, Py_hash_t))
{
// ...
for (;;) {
ix = dictkeys_get_index(dk, i);
if (ix >= 0) {
Py_ssize_t cmp = check_lookup(mp, dk, ep0, ix, key, hash);
if (cmp < 0) {
return cmp;
} else if (cmp) {
return ix;
}
}
else if (ix == DKIX_EMPTY) {
return DKIX_EMPTY;
}
perturb >>= PERTURB_SHIFT;
i = mask & (i*5 + perturb + 1);
// ...
}
}
so due to this and also keep in mind hash collision resolver it looks like this:
O(1) only if first hash + no collision - best case
O(N) for worst case + no collision
O(N) + O(len(bucket)) for worst case + collision resolver
am I missing something or what is wrong with this logic?