I have a module "script". Importing a class from it consumes some memory: ~450 MB virtual (VMS) and ~100 MB resident (RSS).
But when the same import runs inside threads, the virtual memory usage is abnormally high: about 13 GB.
My question is: why does the virtual memory shoot up in this case, and how can I manage it?
In [1]: import psutil
...:
...: def imp():
...:     from script import S1
...:
...: mem = psutil.Process().memory_info()
...: imp()
...: (psutil.Process().memory_info().vms - mem.vms), (psutil.Process().memory_info().rss - mem.rss)
...:
Out[1]: (484839424, 108421120)
In [2]: ByteCount(484839424)
Out[2]: ByteCount(484839424) # 462 MiB
In [3]: ByteCount(108421120)
Out[3]: ByteCount(108421120) # 103 MiB
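For anyone reproducing this without ByteCount: the conversion is just repeated division by 1024. Here is a minimal sketch, assuming binary units (MiB, GiB); format_bytes is a hypothetical stand-in, not the helper used in my session.

```python
def format_bytes(n: int) -> str:
    """Render a byte count using binary units (KiB, MiB, GiB, ...)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    value = float(n)
    for unit in units:
        # Stop at the first unit where the value drops below 1024,
        # or at the largest unit in the list.
        if abs(value) < 1024 or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= 1024


print(format_bytes(484839424))  # VMS growth from the single import
print(format_bytes(108421120))  # RSS growth from the single import
```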
If I do the same import from 200 threads, the virtual memory usage shoots up abnormally:
In [1]: import threading
...: import psutil
...:
...: def imp():
...:     from script import S1
...:
...: mem = psutil.Process().memory_info()
...: threads = [threading.Thread(target=imp) for _ in range(200)]
...: for thread in threads: thread.start()
...: for thread in threads: thread.join()
...: (psutil.Process().memory_info().vms - mem.vms), (psutil.Process().memory_info().rss - mem.rss)
Out[1]: (13912137728, 113606656)
In [2]: ByteCount(13912137728)
Out[2]: ByteCount(13912137728) # 13 GiB
In [3]: ByteCount(113606656)
Out[3]: ByteCount(113606656) # 108 MiB
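Dividing the extra virtual memory by the number of threads gives a rough per-thread figure, which might help narrow down where the reservation comes from. Here is a quick sketch of that arithmetic with the numbers above. per_thread_growth is a hypothetical helper, and subtracting the single-import baseline is my own assumption about how to split the total.

```python
MIB = 1024 * 1024


def per_thread_growth(threaded_vms: int, baseline_vms: int, n_threads: int) -> float:
    """Average VMS growth per thread, in bytes, after removing the
    growth already seen for a single non-threaded import."""
    return (threaded_vms - baseline_vms) / n_threads


growth = per_thread_growth(
    threaded_vms=13912137728,  # VMS delta measured with 200 threads
    baseline_vms=484839424,    # VMS delta measured for one plain import
    n_threads=200,
)
print(f"{growth / MIB:.1f} MiB per thread")
```

With these numbers that works out to roughly 64 MiB per thread, while RSS barely changes (108 MiB versus 103 MiB), so whatever is being reserved is apparently never touched.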
I added a threading.Lock to rule out any race condition and make the import thread-safe, but the issue was still there:
In [1]: import threading
...: import psutil
...:
...: lock = threading.Lock()
...:
...: def imp():
...:     with lock:
...:         from script import S1
...:
...: mem = psutil.Process().memory_info()
...: threads = [threading.Thread(target=imp) for _ in range(200)]
...: for thread in threads: thread.start()
...: for thread in threads: thread.join()
...: (psutil.Process().memory_info().vms - mem.vms), (psutil.Process().memory_info().rss - mem.rss)
Out[1]: (13912137728, 113274880)
In [2]: ByteCount(13912137728)
Out[2]: ByteCount(13912137728) # 13 GiB
In [3]: ByteCount(113274880)
Out[3]: ByteCount(113274880) # 108 MiB
I then added a check to skip the import if the module is already loaded. "Importing" is printed only once, but I still see the same problem:
In [1]: import sys
...: import threading
...: import psutil
...:
...: lock = threading.Lock()
...:
...: def imp():
...:     with lock:
...:         if "script" not in sys.modules:
...:             print("Importing")
...:             from script import S1
...:
...: mem = psutil.Process().memory_info()
...: threads = [threading.Thread(target=imp) for _ in range(200)]
...: for thread in threads: thread.start()
...: for thread in threads: thread.join()
...: (psutil.Process().memory_info().vms - mem.vms), (psutil.Process().memory_info().rss - mem.rss)
Importing
Out[1]: (13912137728, 112840704)
In [2]: ByteCount(13912137728)
Out[2]: ByteCount(13912137728) # 13 GiB
In [3]: ByteCount(112840704)
Out[3]: ByteCount(112840704) # 108 MiB
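All three threaded variants share the same start/join scaffolding and differ only in the body of imp. Here is a sketch of factoring that out, so each variant can be rerun from a fresh interpreter with an identical harness. run_in_threads is a hypothetical name, and the counting target below is only a stand-in for imp so the snippet runs without my script module.

```python
import threading


def run_in_threads(target, n_threads=200):
    """Start n_threads threads running target, then wait for all of them."""
    threads = [threading.Thread(target=target) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return len(threads)


# Stand-in target (assumption): counts how many threads actually ran.
calls = 0
calls_lock = threading.Lock()


def count_call():
    global calls
    with calls_lock:
        calls += 1


started = run_in_threads(count_call, n_threads=200)
print(started, calls)
```

Swapping count_call for any of the imp variants, and taking the memory_info() readings before and after the run_in_threads call, should reproduce the measurements above.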