Background:
I have created a project in which I capture a .pcapng file via Wireshark and then run a set of analyses on it in Python utilizing the Pyshark package before sending the results to a Postgres db. I would love to convert the architecture to continuously sniff packets and perform the analyses on the fly, instead of capturing via Wireshark and then processing.
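Roughly, the loop I have in mind looks like the sketch below (analyze_packet and insert_row are just placeholders for my existing analysis and Postgres code, not real functions):

import pyshark

def analyze_packet(packet):
    # Placeholder for my existing Pyshark-based analysis code.
    return {'timestamp': packet.sniff_timestamp, 'length': packet.length}

def insert_row(row):
    # Placeholder for my existing Postgres insert (e.g., via psycopg2).
    pass

capture = pyshark.LiveCapture(interface='Ethernet')

# Each iteration blocks until the next packet is available; if the loop
# body is slow, packets presumably back up somewhere upstream -- which
# is exactly what my question below is about.
for packet in capture.sniff_continuously():
    insert_row(analyze_packet(packet))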
Question:
In the event that my (spaghetti) analysis code cannot keep up with the influx of packets, what are the consequences in Pyshark? I do not know how the generator implementation in sniff_continuously() works (nor do I fully understand generators generally). Where does the backlog of captured packets get collected (in memory, on disk, passed along / dropped, etc.)? And if these packets are stored in memory, what are the consequences of accumulating beyond available memory (or what else should I watch out for)?
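For context on my confusion: my loose mental model of a generator is the toy sketch below, where values are produced lazily, one per next() call. What I can't tell is whether sniff_continuously() works this way (pulling packets only on demand) or whether something upstream keeps capturing regardless of how fast I iterate:

def counter(limit):
    # A generator: nothing runs until the consumer asks for a value.
    n = 0
    while n < limit:
        yield n  # pause here until the next iteration resumes us
        n += 1

gen = counter(3)
print(next(gen))  # 0 -- the body runs only up to the first yield
print(next(gen))  # 1
for n in gen:     # consumes the remainder: prints 2
    print(n)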
Testing:
I ran a quick script using Pyshark that continuously sniffed all packets coming in on Ethernet and printed the current machine time, the packet sniff time, and memory utilized (via this SO post), with a time delay in the loop to allow packets to "accumulate".
From what I can tell, Pyshark is not dropping packets (the packet sniff time increments ever so slightly while machine time marches forward with the delay), which leads me to believe that the sniffed packets are being "stored" somewhere. My rolling average memory usage is also increasing slightly - is that where they are going? (graph here)
Please see below the code I used for testing:
import psutil, pyshark, time
from datetime import datetime

process = psutil.Process()
capture = pyshark.LiveCapture(interface='Ethernet')

print('Current Machine Time, Packet Sniff Time, Memory Utilized')
for packet in capture.sniff_continuously():
    # Compare wall-clock time to the packet's capture timestamp; a growing
    # gap means packets are queuing up somewhere rather than being dropped.
    print(f'{time.time()}, {datetime.fromtimestamp(float(packet.sniff_timestamp))}, {process.memory_info().rss}')  # RSS in bytes
    time.sleep(5)  # artificial delay so packets "accumulate"
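If it turns out the backlog does live in my process's memory, my tentative plan is to decouple capture from analysis with a bounded queue so the slow part can't grow memory without limit. A minimal sketch of that idea follows (the queue size, worker structure, and drop policy are my own assumptions, not anything Pyshark provides):

import queue
import threading

import pyshark

packets = queue.Queue(maxsize=1000)  # bounded buffer: caps memory growth

def worker():
    # The slow analysis runs here, off the capture loop.
    while True:
        packet = packets.get()
        ...  # analysis + Postgres insert would go here
        packets.task_done()

threading.Thread(target=worker, daemon=True).start()

capture = pyshark.LiveCapture(interface='Ethernet')
for packet in capture.sniff_continuously():
    try:
        packets.put_nowait(packet)
    except queue.Full:
        pass  # deliberately drop under load instead of growing unbounded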
Thank you