Can't use object "aaa" outside of the cell where it's defined


I use the Datalore kernel at datalore.jetbrains.com. My notebook contains the following 3 cells (a minimal working example that reproduces the error):

#%%
class MyClass:
    def __getattribute__(self, name):
        return 123
#%%
aaa = MyClass()
#%%
aaa

When I try to execute the third cell, I get the error: Can't use object "aaa" outside of the cell where it's defined. The message implies that the variable aaa can only be used inside the second cell. Why does the Datalore kernel have such a limitation?

Answer by hsestupin

The short answer is that the Datalore kernel saves the runtime environment to disk after executing each cell.

Why does the Datalore kernel need to do that? Here is the long answer. To understand the root cause of the issue, we need to know how the Datalore kernel executes cells.

This is easier to grasp if we set aside everything we know about the Jupyter kernel. The Datalore kernel differs drastically from the Jupyter kernel because it is reproducible and incremental.

Reproducibility

Have you ever needed to re-run all the cells in a notebook from the very beginning because you lost track of the order in which they were executed? Have you ever shared a notebook with somebody along with notes describing the cell execution order? With the Datalore kernel you don't need to do anything like that. It ensures that cells are always evaluated in exactly the same order, i.e. the order in which they are defined in the notebook. Whenever you execute the N-th cell, all the previous cells are automatically evaluated by the Datalore kernel first. You might think this must be extremely slow, but it's not. This brings us to the second key property of the kernel.

Incrementality

The Datalore kernel saves the result of every cell execution to disk. The result is simply the runtime environment, which is in fact just a dictionary mapping names to objects. Because the result is persisted on disk, the Datalore kernel doesn't need to recalculate unchanged cells: their result is already known. So in the typical real-world situation, when you work on one cell and re-run it from time to time, the previous cells are calculated only the first time and are not recalculated afterwards. This property naturally imposes the following restriction: if you want to use an object in several cells, it has to be serializable. Otherwise you can use the object only within the cell where it is defined.
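To make the two properties concrete, here is a toy sketch of the idea. It is not Datalore's actual implementation: the class name ToyIncrementalKernel, the use of pickle, and the in-memory dictionary standing in for the disk are all assumptions made for illustration.

```python
import hashlib
import pickle


class ToyIncrementalKernel:
    """Hypothetical illustration only; not how Datalore is implemented."""

    def __init__(self):
        self.cache = {}     # key -> pickled environment (stands in for the disk)
        self.executed = []  # sources of the cells that were actually run

    def run_up_to(self, cells, n):
        """Evaluate cells[0..n] in notebook order, reusing cached environments."""
        env = {}
        chain = hashlib.sha256()
        for source in cells[: n + 1]:
            # The key depends on this cell and on every cell before it,
            # so editing a cell invalidates it and everything after it.
            chain.update(source.encode("utf-8") + b"\x00")
            key = chain.hexdigest()
            if key in self.cache:
                # Unchanged prefix: restore the saved environment, run nothing.
                env = pickle.loads(self.cache[key])
            else:
                exec(source, env)
                env.pop("__builtins__", None)  # added by exec, not user data
                # This step fails if the cell created a non-serializable object.
                self.cache[key] = pickle.dumps(env)
                self.executed.append(source)
        return env


cells = ["x = 2", "y = x * 10", "z = y + 1"]
kernel = ToyIncrementalKernel()
kernel.run_up_to(cells, 2)        # first run: all three cells are executed
cells[2] = "z = y + 2"            # edit only the last cell
env = kernel.run_up_to(cells, 2)  # cells 1 and 2 come from the cache
print(env["z"], len(kernel.executed))
```

In this sketch the environment after each cell must survive pickle.dumps, which mirrors the restriction described above: an object that cannot be serialized cannot be handed over to the next cell.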

P.S. In this particular example the issue is caused by the incorrect implementation of the __getattribute__ method. With this implementation, every call to getattr(aaa, attr_name, None) returns 123, including lookups of special attributes, which obviously doesn't work well in every case. As a result, an error occurred when the kernel attempted to serialize the object aaa, so it was not saved to disk.
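The effect can be reproduced outside Datalore. The sketch below uses the standard pickle module as a stand-in serializer, which is an assumption: the answer does not say which serializer the kernel uses. The helper try_pickle is a hypothetical name introduced for this example.

```python
import pickle


class MyClass:
    def __getattribute__(self, name):
        return 123


def try_pickle(obj):
    """Return None if obj can be pickled, otherwise the exception raised."""
    try:
        pickle.dumps(obj)
        return None
    except Exception as exc:
        return exc


aaa = MyClass()

# Every attribute lookup is intercepted, including the special ones
# that a serializer relies on.
print(aaa.__class__)      # prints 123 instead of the class
print(aaa.__reduce_ex__)  # prints 123 instead of a bound method

error = try_pickle(aaa)
print(type(error).__name__, error)
```

Here pickling fails because the serializer asks the object for its reduction method and receives the integer 123 instead of something callable. Defining __getattr__ instead, which is consulted only when normal lookup fails, and raising AttributeError for names the class does not want to handle is one common way to avoid hijacking these special lookups.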