Caching Python class instances


I have a memory-heavy class, say a type representing a high-resolution resource (e.g. media, models, data), that can be instantiated multiple times with identical parameters, for example when the same resource filename is loaded more than once.

I'd like to implement some sort of unbounded caching on object creation so that instances constructed with the same parameter values are reused in memory. I don't mind that mutating one instance affects every other reference to it. What is the easiest Pythonic way to achieve this?

Note that singletons, object pools, factory methods, and field properties do not meet my use case.


Answer by jprebys

You could use a factory function with functools.cache:

import functools

@functools.cache
def make_myclass(*args, **kwargs):
    return MyClass(*args, **kwargs)
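
A minimal usage sketch of the factory approach, assuming a hypothetical `MyClass` that takes a filename (the class body and the filenames are placeholders, not part of the question). `functools.cache` needs Python 3.9 or later, and every argument must be hashable because the arguments form the cache key:

```python
import functools


class MyClass:
    """Hypothetical stand-in for the memory-heavy resource class."""

    def __init__(self, filename):
        self.filename = filename


@functools.cache  # equivalent to functools.lru_cache(maxsize=None)
def make_myclass(*args, **kwargs):
    return MyClass(*args, **kwargs)


first = make_myclass("texture.png")
second = make_myclass("texture.png")
other = make_myclass("model.obj")

print(first is second)  # True: same arguments, same cached instance
print(first is other)   # False: different arguments

# Arguments are used as cache keys, so they must be hashable.
try:
    make_myclass(["texture.png"])
except TypeError as exc:
    print("unhashable argument:", exc)
```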

EDIT: Apparently you can decorate your class directly to get the same effect:

@functools.cache
class Foo:
    def __init__(self, a):
        print("Creating new instance")
        self.a = a

>>> Foo(1)
Creating new instance
<__main__.Foo object at 0x0000021D7D61FFA0>
>>> Foo(1)
<__main__.Foo object at 0x0000021D7D61FFA0>
>>> Foo(2)
Creating new instance
<__main__.Foo object at 0x0000021D7D61F250> 

Note the same memory address both times Foo(1) is called.
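
One caveat that is easy to check: `functools.cache` keys on the call exactly as it was written, so positional, keyword, and explicit-default forms of the same call each get their own cache entry. The sketch below assumes a variant of `Foo` with a default parameter `b=3`, added here purely for illustration:

```python
import functools


@functools.cache
class Foo:
    def __init__(self, a, b=3):
        self.a = a
        self.b = b


print(Foo(1) is Foo(1))     # True: identical call, cached instance
print(Foo(1) is Foo(a=1))   # False: the keyword call is a separate entry
print(Foo(1) is Foo(1, 3))  # False: spelling out the default is another key
```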

Edit 2: After some experimenting, you can get an instance cache that respects default arguments (so that Foo(1), Foo(1, 3), and Foo(b=3, a=1) all share one instance) by overriding __new__ and doing all of the caching and instantiation there:

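class Foo:
    _cached = {}  # maps (a, b) to the instance created for those values

    def __new__(cls, a, b=3):
        # Python binds positional, keyword, and default arguments before
        # this body runs, so every call style produces the same key.
        attrs = a, b
        if attrs in cls._cached:
            return cls._cached[attrs]

        print(f"Creating new instance Foo({a}, {b})")

        new_foo = super().__new__(cls)
        new_foo.a = a
        new_foo.b = b
        cls._cached[attrs] = new_foo
        return new_foo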
     
a = Foo(1)
b = Foo(1, 3)
c = Foo(b=3, a=1)
d = Foo(4)

print(a is b)
print(b is c)
print(c is d)

output:

Creating new instance Foo(1, 3)
Creating new instance Foo(4, 3)
True
True
False

If the class defines __init__, it will still be called after __new__ on every call, including cache hits, so you will want to do your expensive initialization (or all of it) in __new__ after the cache check.
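
To make the __init__ behaviour concrete, here is a small sketch using a hypothetical `Resource` class (not from the question) that caches in `__new__` but also defines `__init__`. The counter shows that `__init__` runs again even when the cached instance is returned:

```python
class Resource:
    """Hypothetical class that caches in __new__ and also has __init__."""

    _cached = {}
    init_calls = 0

    def __new__(cls, name):
        if name in cls._cached:
            return cls._cached[name]
        instance = super().__new__(cls)
        cls._cached[name] = instance
        return instance

    def __init__(self, name):
        # Runs on every Resource(...) call, including cache hits.
        Resource.init_calls += 1
        self.name = name


first = Resource("a.png")
second = Resource("a.png")

print(first is second)      # True: the cached instance is returned
print(Resource.init_calls)  # 2: __init__ ran again on the cache hit
```

Anything expensive placed in `__init__` would therefore be repeated on each call, which is why the answer keeps the attribute assignments inside `__new__`.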

Answer by MEE

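Applying functools.cache directly to the class, as in the answer above, replaces the class with a wrapper object. That breaks inheritance, isinstance, and issubclass, and it does not work with dataclasses. The following alternative keeps the class itself intact and does not suffer from these shortcomings: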

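import functools


def cached_class(cls):
    @functools.wraps(cls.__new__)
    def __new__(cls, *args, **kwargs):
        # Keyword arguments are converted to a tuple so they are hashable.
        return cls.__cache__(cls, args, tuple(kwargs.items()))

    @functools.cache
    def __cache__(cls, args, kwargs):
        if cls.__orig_new__ is object.__new__:
            # object.__new__ rejects extra arguments once __new__ has been
            # overridden, so create the bare instance and let __init__
            # (hand-written or dataclass-generated) consume the arguments.
            it = object.__new__(cls)
        else:
            it = cls.__orig_new__(cls, *args, **dict(kwargs))
        return it

    cls.__cache__ = __cache__
    cls.__orig_new__ = cls.__new__
    cls.__new__ = __new__
    return cls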

Use as:

@cached_class
class Foo:
   ...


>>> x = Foo()
>>> y = Foo()
>>> x is y
True
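
For comparison, the breakage attributed to decorating the class directly with `functools.cache` can be reproduced with a short sketch. The class `Foo` here is a throwaway example, and the exact error messages may differ between Python versions, so they are printed rather than asserted:

```python
import functools
import inspect


@functools.cache
class Foo:
    def __init__(self, a):
        self.a = a


foo = Foo(1)

print(inspect.isclass(Foo))              # False: Foo is now a cache wrapper
print(isinstance(foo, Foo.__wrapped__))  # True: the real class is still there

try:
    isinstance(foo, Foo)
except TypeError as exc:
    print("isinstance failed:", exc)

try:
    class Bar(Foo):  # subclassing the wrapper is rejected
        pass
except TypeError as exc:
    print("subclassing failed:", exc)
```

Because `cached_class` patches `__new__` and returns the original class, neither of these failures should occur with it.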