Why does df.at raise a KeyError with a MultiIndex while df.loc does not?


I'm having issues with one of my functions after switching from a regular Index to a MultiIndex, and I'm not sure how to address this. Let me take the DataFrame from the pandas documentation for pandas.DataFrame.at to illustrate the problem:

>>> df = pd.DataFrame([[0, 2, 3], [0, 4, 1], [10, 20, 30]],
...                   index=[4, 5, 6], columns=['A', 'B', 'C'])
>>> df
    A   B   C
4   0   2   3
5   0   4   1
6  10  20  30
>>> df.at[4, 'B']
2

If you now convert this into a MultiIndex, the same call will fail and raise a KeyError:

>>> df = df.set_index("A", append=True)
>>> df
       B   C
  A
4 0    2   3
5 0    4   1
6 10  20  30
>>> df.at[4, 'B']
Traceback (most recent call last):
  File "<input>", line 1, in <module>
    df.at[4, "B"]
     ~~~~~^^^^^^^^
  File "/.../pandas/core/indexing.py", line 2419, in __getitem__
    return super().__getitem__(key)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.../pandas/core/indexing.py", line 2371, in __getitem__
    return self.obj._get_value(*key, takeable=self._takeable)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.../pandas/core/frame.py", line 3882, in _get_value
    loc = engine.get_loc(index)
          ^^^^^^^^^^^^^^^^^^^^^
  File "pandas/_libs/index.pyx", line 822, in pandas._libs.index.BaseMultiIndexCodesEngine.get_loc
KeyError: 4

This behavior would be fine if loc behaved the same way, but it doesn't:

>>> df.loc[4, 'B']
A
0    2
Name: B, dtype: int64

You can get around this by specifying all levels of the index of course...

>>> df.at[(4, 0), 'B']
2

but given that I have quite a number of MultiIndex levels, that does not seem like a feasible solution. And using loc and then appending .iloc[0] doesn't feel very pythonic either... Does anybody know how to make .at work without specifying more than the first level?

mozway On BEST ANSWER

at is designed to select a single value in a DataFrame.

From the documentation: "Access a single value for a row/column label pair."

Thus you must provide all indexers.

As you showed in your example, loc with an incomplete indexer yields a Series, not a value:

df.loc[4, 'B']

A
0    2
Name: B, dtype: int64

This wouldn't be compatible with at's behavior of selecting a single value.
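
If the first level alone is expected to identify exactly one row, one possible workaround is to go through loc and then demand a single element. The sketch below assumes the DataFrame from the question; scalar_by_first_level is a hypothetical helper name, not part of pandas. It relies on Series.item(), which raises a ValueError unless the Series holds exactly one element.

```python
import pandas as pd

# DataFrame from the question, with "A" appended as a second index level.
df = pd.DataFrame([[0, 2, 3], [0, 4, 1], [10, 20, 30]],
                  index=[4, 5, 6], columns=['A', 'B', 'C'])
df = df.set_index("A", append=True)


def scalar_by_first_level(frame, key, column):
    """Hypothetical helper: look up a scalar using only the first index level.

    loc with a partial key returns a Series with one entry per matching row;
    .item() returns its only element and raises ValueError if the Series
    does not contain exactly one element.
    """
    return frame.loc[key, column].item()


value = scalar_by_first_level(df, 4, 'B')
```

Unlike .iloc[0], this fails loudly when the partial key matches several rows instead of silently returning the first one.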

The KeyError comes from the lookup itself: for a MultiIndex, at goes through the index engine, which only accepts complete keys. See the code in pandas/core/frame.py:

        # For MultiIndex going through engine effectively restricts us to
        #  same-length tuples; see test_get_set_value_no_partial_indexing
        loc = engine.get_loc(index)
        return series._values[loc]
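
The two steps in that snippet (resolve the key to a position, then read the value at that position) can be approximated with public API. This is only a sketch of the idea, not the internal code path: it uses MultiIndex.get_loc, which for a complete key on a unique index returns an integer position, and assumes the DataFrame from the question.

```python
import pandas as pd

# DataFrame from the question, with "A" appended as a second index level.
df = pd.DataFrame([[0, 2, 3], [0, 4, 1], [10, 20, 30]],
                  index=[4, 5, 6], columns=['A', 'B', 'C'])
df = df.set_index("A", append=True)

# Step 1: a complete key (one entry per index level) resolves to a single
# integer position because this index is unique.
pos = df.index.get_loc((4, 0))

# Step 2: read the value at that position from the column.
value = df['B'].iat[pos]

# The same lookup via at, which requires the complete key.
same_value = df.at[(4, 0), 'B']
```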