Virtual memory increases when importing a module from multiple threads


I have a module named "script". Importing a class from it in the main thread increases memory usage by roughly 450 MB of virtual memory (VMS) and 100 MB of resident memory (RSS).

When the same import runs inside threads, however, the virtual memory usage is abnormally high: about 13 GB.

My question is: why does the virtual memory shoot up in this case, and how can I manage it?

In [1]: import psutil
   ...: def imp():
   ...:     from script import S1
   ...: 
   ...: mem = psutil.Process().memory_info()
   ...: imp()
   ...: (psutil.Process().memory_info().vms - mem.vms), (psutil.Process().memory_info().rss - mem.rss)
   ...: 
Out[1]: (484839424, 108421120)

In [2]: ByteCount(484839424)
Out[2]: ByteCount(484839424) # 462 MiB

In [3]: ByteCount(108421120)
Out[3]: ByteCount(108421120) # 103 MiB
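
The `ByteCount` helper used above is not defined in the post. As a rough stand-in, the conversion from raw byte counts to the MiB/GiB figures in the comments can be sketched as follows. The name `format_bytes` is hypothetical and is not part of `psutil` or of the original session.

```python
def format_bytes(n: int) -> str:
    """Render a byte count using binary units (1 KiB = 1024 B)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    value = float(n)
    for unit in units[:-1]:
        # Stop at the first unit in which the value drops below 1024.
        if abs(value) < 1024:
            return f"{value:.0f} {unit}"
        value /= 1024
    # Anything larger is reported in the biggest unit we know about.
    return f"{value:.0f} {units[-1]}"


# The two deltas reported by the single-threaded import.
print(format_bytes(484839424))  # 462 MiB
print(format_bytes(108421120))  # 103 MiB
```

The units are binary, so "462 MiB" here corresponds to the "~450 MB" quoted in the prose only approximately.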

If I run the same import from 200 threads, the virtual memory usage shoots up abnormally:

In [1]: import threading
   ...: import psutil
   ...: def imp():
   ...:     from script import S1
   ...: 
   ...: mem = psutil.Process().memory_info()
   ...: threads = [threading.Thread(target=imp) for _ in range(200)]
   ...: for thread in threads: thread.start()
   ...: for thread in threads: thread.join()
   ...: (psutil.Process().memory_info().vms - mem.vms), (psutil.Process().memory_info().rss - mem.rss)
Out[1]: (13912137728, 113606656)

In [2]: ByteCount(13912137728)
Out[2]: ByteCount(13912137728) # 13 GiB

In [3]: ByteCount(113606656)
Out[3]: ByteCount(113606656) # 108 MiB
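
The measurement pattern repeated in these sessions (snapshot, run the work in N threads, snapshot again) can be factored into a small helper so that each experiment takes exactly one reading before and one after. This is only a sketch: `memory_delta`, `run_in_threads`, and `psutil_reader` are hypothetical names, and the default reader assumes `psutil` is installed.

```python
import threading
from typing import Callable, Tuple

# A reader returns (vms, rss) in bytes for the current process.
MemoryReader = Callable[[], Tuple[int, int]]


def psutil_reader() -> Tuple[int, int]:
    """Read VMS and RSS via psutil (third-party, imported lazily)."""
    import psutil

    info = psutil.Process().memory_info()
    return info.vms, info.rss


def run_in_threads(target: Callable[[], None], n_threads: int) -> None:
    """Start n_threads threads running target, then wait for all of them."""
    threads = [threading.Thread(target=target) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


def memory_delta(
    work: Callable[[], None],
    read_memory: MemoryReader = psutil_reader,
) -> Tuple[int, int]:
    """Return (vms_delta, rss_delta) in bytes across one call to work()."""
    vms_before, rss_before = read_memory()
    work()
    vms_after, rss_after = read_memory()
    return vms_after - vms_before, rss_after - rss_before
```

With the `imp` function from the session above, the threaded experiment would then be `memory_delta(lambda: run_in_threads(imp, 200))`, and the single-threaded one `memory_delta(imp)`. Varying the thread count in that call is one way to check whether the virtual memory delta scales with the number of threads.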

I added a threading.Lock to prevent any race condition and make the import thread-safe, but the issue was still there:

In [1]: import threading
   ...: import psutil
   ...: lock = threading.Lock()
   ...: def imp():
   ...:   with lock:
   ...:     from script import S1
   ...: 
   ...: mem = psutil.Process().memory_info()
   ...: threads = [threading.Thread(target=imp) for _ in range(200)]
   ...: for thread in threads: thread.start()
   ...: for thread in threads: thread.join()
   ...: (psutil.Process().memory_info().vms - mem.vms), (psutil.Process().memory_info().rss - mem.rss)
Out[1]: (13912137728, 113274880)

In [2]: ByteCount(13912137728)
Out[2]: ByteCount(13912137728) # 13 GiB

In [3]: ByteCount(113274880)
Out[3]: ByteCount(113274880) # 108 MiB

I then added a check to skip the import if the module is already loaded. "Importing" is printed only once, but I still faced the same problem:

In [1]: import sys
   ...: import threading
   ...: import psutil
   ...: lock = threading.Lock()
   ...: def imp():
   ...:   with lock:
   ...:       if "script" not in sys.modules:
   ...:           print("Importing")
   ...:           from script import S1
   ...: 
   ...: mem = psutil.Process().memory_info()
   ...: threads = [threading.Thread(target=imp) for _ in range(200)]
   ...: for thread in threads: thread.start()
   ...: for thread in threads: thread.join()
   ...: (psutil.Process().memory_info().vms - mem.vms), (psutil.Process().memory_info().rss - mem.rss)
Importing
Out[1]: (13912137728, 112840704)

In [2]: ByteCount(13912137728)
Out[2]: ByteCount(13912137728) # 13 GiB

In [3]: ByteCount(112840704)
Out[3]: ByteCount(112840704) # 108 MiB
