How to represent the numbers 0 to 3, taking up 2 bits of memory in Python


I'm writing code for a Deep Q Network in Python. My computer has 32GB of memory, but I run into significant issues as training goes on because the replay buffer maxes out the RAM.

I'm looking through the replay buffer code to see where I can reduce the RAM requirements. The replay buffer stores two NumPy arrays of 1 million elements with a dtype of numpy.int8.

However, only values 0, 1, 2, 3 are possible in one of the arrays, and only -1, 0, 1 in the other. Either way, it should only need 2 bits to represent each array element.
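
For scale, here is a back-of-the-envelope calculation of what each of these arrays occupies at 8 bits per entry versus 2 bits per entry. It is only a sketch; the variable names are illustrative and the figures ignore any per-object overhead.

```python
n_entries = 1_000_000

# One numpy.int8 element takes 8 bits, i.e. 1 byte.
bytes_int8 = n_entries * 8 // 8

# With 2 bits per element, four elements share one byte.
bytes_2bit = n_entries * 2 // 8

print(bytes_int8, bytes_2bit)
```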

How can I create an array, where each entry takes up only 2 bits of memory as opposed to 8? I don't mind doing some degree of hardcoding, for example something like:

if bitarray[i][0] == 0 and bitarray[i][1] == 0:
    numberAtPositionI = -1

There are 2 best solutions below

AudioBubble On

Since your actions take only 2 bits you can encode 4 actions in 8 bits.

This will save memory at the expense of more computation effort to encode/decode the actions.

For example:

actions = [0b11, 0b00, 0b10, 0b01]   # 3, 0, 2, 1

# encode actions: action i goes into bits 2*i and 2*i+1
encoded_actions = 0
for i, action in enumerate(actions):
    encoded_actions |= action << 2*i
print("{0:b}".format(encoded_actions))

1100011

# decode actions: shift the pair of bits down, then mask with 0b11
actions_decoded = []
for i in range(4):
    actions_decoded.append((encoded_actions >> 2*i) & 0b11)
print(actions_decoded)

[3, 0, 2, 1]
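
The same packing can probably be applied to a whole NumPy array at once instead of looping in Python. The following is a minimal sketch, not part of the original answer: the function names `pack_2bit` and `unpack_2bit` are made up for illustration, and it assumes the number of values is a multiple of 4 and that every value is already in the range 0 to 3.

```python
import numpy as np

# Bit offsets of the four 2-bit slots inside one byte.
SHIFTS = np.array([0, 2, 4, 6], dtype=np.uint8)

def pack_2bit(values):
    """Pack integers in 0..3 into a uint8 array, four values per byte."""
    values = np.asarray(values, dtype=np.uint8)
    if values.size % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    groups = values.reshape(-1, 4)
    # Value i of each group lands in bits 2*i and 2*i+1, as in the loop above.
    return np.bitwise_or.reduce(groups << SHIFTS, axis=1)

def unpack_2bit(packed):
    """Inverse of pack_2bit: recover the 2-bit values as a flat uint8 array."""
    packed = np.asarray(packed, dtype=np.uint8)
    return ((packed[:, None] >> SHIFTS) & 0b11).reshape(-1)

actions = np.array([3, 0, 2, 1, 1, 1, 1, 1], dtype=np.int8)
packed = pack_2bit(actions)
restored = unpack_2bit(packed)
print(packed, restored)
```

The packed array uses one byte per four actions, at the cost of an unpack step whenever a batch is sampled.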

Now you can map these 4 values to any arbitrary 4 values you need using a dictionary:

mapping = {0: 5, 1: 123, 2: -50, 3: 666}

So you can retrieve the mapping by doing:

mapping[0]  # will return 5
mapping[1]  # will return 123
mapping[2]  # will return -50
mapping[3]  # will return 666
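
For the rewards array from the question, which only holds -1, 0 and 1, the same kind of mapping could look like the sketch below. The table names `CODE_TO_REWARD` and `REWARD_TO_CODE` are hypothetical; the choice of which code stands for which reward is arbitrary.

```python
# Hypothetical tables: 2-bit code -> reward, and the inverse for encoding.
CODE_TO_REWARD = {0: -1, 1: 0, 2: 1}
REWARD_TO_CODE = {reward: code for code, reward in CODE_TO_REWARD.items()}

rewards = [-1, 0, 1, 1]

# Each code is in 0..2, so it fits in 2 bits and can be packed like an action.
codes = [REWARD_TO_CODE[r] for r in rewards]

# Decoding goes back through the first table.
restored_rewards = [CODE_TO_REWARD[c] for c in codes]
print(codes, restored_rewards)
```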
Bryan Carty On

So, it turns out it wasn't the replay buffer that was causing my RAM to be maxed out. But I still made some changes to the replay buffer to better utilise memory.

@Sembei Norimaki had a good answer, but I found it to be overly complex. The solution I came up with was to use the bitarray package:

from bitarray import bitarray

actions = bitarray(2000000)  # 2 bits per entry, 1,000,000 entries
rewards = bitarray(2000000)

Two entries in the bitarray represent a single action/reward:

writeIndex = self.next_index_to_write_to * 2  # next_index_to_write_to goes from 0 to 999,999
valOne = False
valTwo = False
match action:
    case 1:
        valOne = False
        valTwo = True
    case 2:
        valOne = True
        valTwo = False
    case 3:
        valOne = True
        valTwo = True
self.actions[writeIndex] = valOne
self.actions[writeIndex+1] = valTwo

Similarly, to retrieve a value I did:

action = 0
readIndex = index*2
valOne = self.actions[readIndex]
valTwo = self.actions[readIndex+1]
if valOne:
    if valTwo:
        action = 3
    else:
        action = 2
else:
    if valTwo:
        action = 1
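
The match/if branches above could likely be replaced by a bit of integer arithmetic. This is a sketch under assumptions, not the code I actually ran: the helper names `write_2bit` and `read_2bit` are made up, and a plain list stands in for the bitarray so the snippet runs without the package. A bitarray should behave the same way here, since it supports integer indexing and item assignment with 0/1 values.

```python
def write_2bit(bits, index, value):
    """Store value (0..3) in the two slots starting at 2*index."""
    if not 0 <= value <= 3:
        raise ValueError("value must be in 0..3")
    bits[2 * index] = (value >> 1) & 1   # high bit, valOne above
    bits[2 * index + 1] = value & 1      # low bit, valTwo above

def read_2bit(bits, index):
    """Rebuild the value from its high and low bit."""
    return (int(bits[2 * index]) << 1) | int(bits[2 * index + 1])

bits = [0] * 8  # stand-in for a bitarray of length 8, all zeros
for i, action in enumerate([3, 0, 2, 1]):
    write_2bit(bits, i, action)

decoded = [read_2bit(bits, i) for i in range(4)]
print(bits, decoded)
```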