I have the following setup: a few dozen Python modules collect data and send it to an aggregator, which is a Go module, over a Unix socket connection (a single socket file shared by all of them).
The aggregator crunches numbers and sends the aggregated data back to the Python modules on the same socket connection.
It works. However, there is an ever-increasing delay between the aggregator sending a packet and the Python module receiving it. Within half an hour the delay can grow to a full minute.
I logged everything and saw that there are no delays on the sending side (the aggregator). On the receiving side, however, the packets arrive in the Python modules with an ever-increasing delay. This happens even if the Python module uses the socket connection only for listening and does nothing else but wait for incoming data.
The setup is quite complex, but here are the relevant portions of the code:
# creation of the socket
import socket
import asyncio
#...
loop = asyncio.get_event_loop()
sock = socket.socket(socket.AF_UNIX)
sock.setblocking(False)
await loop.sock_connect(sock, '/tmp/mysock')
# receiving the packets
from datetime import datetime
loop = asyncio.get_event_loop()
while True:
    # ...
    print("DBG: before socket read @", datetime.now().timestamp())
    async with asyncio.timeout(Timeout):  # asyncio.timeout requires Python 3.11+
        newFrag = await loop.sock_recv(sock, BUFFSIZE)
    print("DBG: after socket read @", datetime.now().timestamp())
    # ...
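To make the problem easier to reproduce outside the full setup, here is a minimal, self-contained sketch of the same receive loop that computes the lag of each packet. It is not my real code: `socket.socketpair()` stands in for the connection to `/tmp/mysock`, the newline-delimited JSON framing is an assumption made only for this illustration, the `BUFFSIZE` and `TIMEOUT` values are placeholders, and `receive_lags` is a hypothetical helper name. It uses `asyncio.wait_for` instead of `asyncio.timeout` so that it also runs on Python versions older than 3.11.

```python
import asyncio
import json
import socket
import time

BUFFSIZE = 4096  # placeholder; the real BUFFSIZE is not shown in the question
TIMEOUT = 5.0    # placeholder; the real Timeout is not shown in the question


async def receive_lags(sock, count):
    """Read `count` packets and return (receive time - SendTs) for each one."""
    loop = asyncio.get_running_loop()
    buf = b""
    lags = []
    while len(lags) < count:
        chunk = await asyncio.wait_for(loop.sock_recv(sock, BUFFSIZE), TIMEOUT)
        if not chunk:  # peer closed the connection
            break
        buf += chunk
        # Assumed framing: one JSON object per line. A single recv may
        # return several packets, so drain every complete line in the buffer.
        while b"\n" in buf:
            line, buf = buf.split(b"\n", 1)
            packet = json.loads(line)
            lags.append(time.time() - packet["SendTs"])
    return lags


async def main():
    # socketpair() replaces the real Unix socket file for this sketch.
    reader, writer = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    reader.setblocking(False)
    writer.setblocking(False)
    loop = asyncio.get_running_loop()
    for i in range(3):
        msg = json.dumps({"Ts": i, "SendTs": time.time()}).encode() + b"\n"
        await loop.sock_sendall(writer, msg)
    result = await receive_lags(reader, 3)
    reader.close()
    writer.close()
    return result


lags = asyncio.run(main())
print(lags)
```

In this sketch the sender and the receiver share one process and one clock, so the lags should be close to zero. In the real setup the `SendTs` values come from the aggregator.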
For example, here are some lines from the logs where two packets were sent only 1 millisecond apart but arrived 0.65 seconds apart:
...
DBG: after socket read @ 1695743800.925668 [1]
Received secData: (False, {'Ts': 1695743796, 'Main': 354.76, 'Secondary': 33678.9, 'SendTs': 1695743797.713 [2] })
DBG: before socket read @ 1695743800.926059 [3]
DBG: after socket read @ 1695743801.57975 [4]
Received secData: (False, {'Ts': 1695743797, 'Main': 354.76, 'Secondary': 33678.8, 'SendTs': 1695743797.714 [5]})
DBG: before socket read @ 1695743801.585009
...
Timestamps [1] and [2] show that there is already a delay of 3.2 seconds. The difference between [1] and [4] shows that 0.65 seconds passed between receiving the two packets. The difference between [2] and [5] shows that the two packets were sent only 1 ms apart.
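The arithmetic behind those numbers can be checked with a few lines of Python. This is only a sketch that reuses the four timestamps from the log excerpt. The variable names and the `latency` helper are made up for the illustration.

```python
# Timestamps copied from the log excerpt (seconds since the epoch).
recv_1 = 1695743800.925668  # [1] read of the first packet completes
send_1 = 1695743797.713     # [2] SendTs of the first packet
recv_2 = 1695743801.57975   # [4] read of the second packet completes
send_2 = 1695743797.714     # [5] SendTs of the second packet


def latency(recv_ts, send_ts):
    """End-to-end delay of one packet: receive time minus send time."""
    return recv_ts - send_ts


lag_1 = latency(recv_1, send_1)  # delay of the first packet
lag_2 = latency(recv_2, send_2)  # delay of the second packet
send_gap = send_2 - send_1       # time between the two sends
recv_gap = recv_2 - recv_1       # time between the two receipts

print(f"lag of packet 1: {lag_1:.3f} s")
print(f"lag of packet 2: {lag_2:.3f} s")
print(f"gap between sends: {send_gap:.3f} s")
print(f"gap between receipts: {recv_gap:.3f} s")
print(f"lag growth between the two packets: {lag_2 - lag_1:.3f} s")
```

The growth in lag from one packet to the next is the part I cannot explain.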
The log excerpt above is from a Python script that did nothing but listen for incoming packets.
I'm out of ideas as to why this happens. I could understand a delay of a few hundred milliseconds here and there, but not delays that keep accumulating. Any clues about the underlying cause would be appreciated.