Can I preallocate a list with some size and then use the append method to populate it?

66 Views Asked by At

Often when I write some code I would have a rough idea of how long the list will be. I guess if I try really hard I can calculate the proper size of it at the beginning, but that's just a lot of things I need to take care of (like making sure there is no extra None at the end that messes things up or making sure I don't index out of the range later).

It seems like I would have to choose from pre-allocate some memory and index into it. Or just create an empty space and use the list.append() method to populate my list. Like either this:

A = [None]*1000
for i in range(1000):
    A[i] = 1

or this:

B = []
for i in range(1000):
    B.append(1)

My question is are there some intermediate methods? Can I do something like pre-allocate around 1000 units of memory to a list, but when I append it add after the last cell with the actual item instead of after the cell I set to be None?

The idea is something like this (but this doesn't actually work):

A = [None]*1000
for i in range(1000):
    A.append(1)

Is this possible with the Python list? Or do I have to use some other data structure? Or it's just too random a thought?

1

There are 1 best solutions below

2
NotAName On

I don't really understand why you'd want to do this. You're unlikely to achieve any gains in terms of performance by pre-allocating a 1000 elements, but you can create a user defined class along the lines of:

from typing import Any

class FixedSizeList:
    def __init__(self, size: int, default_val: Any):
        self._val = [default_val for _ in range(size)]
        self._non_default_idx = 0
        
    def append(self, value: Any):
        self._val[self._non_default_idx] = value
        self._non_default_idx += 1
    
    def extend(self, value: list[Any]):
        self._val[self._non_default_idx: self._non_default_idx + len(value)] = value
        self._non_default_idx += len(value)
        
    def __str__(self):
        return self._val.__str__()

And then you can use it like this:

f = FixedSizeList(10, None)
print(f)         # [None, None, None, None, None, None, None, None, None, None]

f.append(1)
print(f)         # [1, None, None, None, None, None, None, None, None, None]

f.extend([1, 2, 3])
print(f)         # [1, 1, 2, 3, None, None, None, None, None, None]

The end result is likely to be much slower since the base list uses C under the hood and any changes to the normal behaviour will actually reduce the performance:

from timeit import timeit

t1 = """
f1 = FixedSizeList(1000, None)
for i in range(1000):
    f1.append(i)
"""

t2 = """
f2 = []
for i in range(1000):
    f2.append(i)
"""

print(timeit(t1, globals=globals(), number=10000))    # 2.07
print(timeit(t2, globals=globals(), number=10000))    # 0.50