Python UDP socket not sending all packets, or not all packets are received on the other end


I'm working on an implementation of a logical data diode, which means that data can only flow in one direction. No ACKs are allowed. Therefore, I have chosen UDP. This is the gist of the protocol:

  1. Split the payload into small chunks
  2. Give each chunk a sequence number
  3. Transmit the chunk to the receiver using UDP
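The three steps above can be sketched as a standalone chunking function. This is a minimal illustration, not the actual implementation: `BUFFER_SIZE` is assumed (the question does not state its value), `make_chunks` is a hypothetical helper name, and the header layout (16-digit ASCII sequence number plus the little-endian session UUID) is taken from the code shown later in the question.

```python
import uuid

BUFFER_SIZE = 1024  # assumed datagram size; the question does not state it

def make_chunks(payload: bytes, session_uuid: uuid.UUID):
    """Split payload into datagrams: 16-byte seq + 16-byte UUID + data."""
    header_len = 16 + 16
    room = BUFFER_SIZE - header_len
    seq = 0
    for off in range(0, len(payload), room):
        header = str(seq).zfill(16).encode() + session_uuid.bytes_le
        yield header + payload[off : off + room]
        seq += 1

chunks = list(make_chunks(b"x" * 3000, uuid.uuid4()))
# Each datagram is at most BUFFER_SIZE bytes and carries its own sequence number.
```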

For small payloads, this works flawlessly. For larger payloads, however, at around 600 datagrams sent, the packets seem to mysteriously disappear in transit. All packets are logged as sent, but on the receiving end, the packets just stop arriving.

Here is a code snippet:

        for i in range(redundancy + 1):
            seq = 0
            sent_bytes = 0
            bytes_to_send = len(session.encrypted_data)
            while sent_bytes < bytes_to_send:
                logger.info("Sending payload chunk %d", seq)
                header = concat_bytes(str(seq).zfill(16).encode(), session.session_uuid.bytes_le)
                remaining_room = BUFFER_SIZE - len(header)
                data = session.encrypted_data[sent_bytes : sent_bytes + remaining_room]
                payload = concat_bytes(header, data)
                self._transmit_bytes(payload)
                sent_bytes += len(data)
                seq += 1
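For context, a receiver matching this header layout might parse each datagram like the hypothetical sketch below. `parse_datagram` and `receive_loop` are illustrative names, not part of the question's code; the only assumptions carried over are the 16-digit ASCII sequence number, the little-endian UUID, and a `BUFFER_SIZE` matching the sender's.

```python
import socket
import uuid

BUFFER_SIZE = 1024  # assumed; must match the sender's value

def parse_datagram(datagram: bytes):
    """Split a datagram into (seq, session UUID, data) per the sender's header."""
    seq = int(datagram[:16].decode())
    session = uuid.UUID(bytes_le=datagram[16:32])
    data = datagram[32:]
    return seq, session, data

def receive_loop(sock: socket.socket):
    chunks: dict[int, bytes] = {}
    while True:
        datagram, _addr = sock.recvfrom(BUFFER_SIZE)
        seq, _session, data = parse_datagram(datagram)
        chunks[seq] = data  # duplicates from redundancy passes simply overwrite
```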

_transmit_bytes:

    def _transmit_bytes(self, message: bytes):
        self.server_socket.sendto(message, self.addr)
        time.sleep(MESSAGE_DELAY)

The initialization of server_socket:

        self.server_socket: LDDSocket = LDDSocket(listen=False)
        so_linger_options = struct.pack("ii", 1, CLOSE_TIMEOUT)
        send_buffer_size = 1024 * 1024 * 100  # 100 MB
        self.server_socket.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, so_linger_options)
        self.server_socket.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, send_buffer_size)

LDDSocket is an extremely simple wrapper for socket.socket. There is nothing you need to know about it (listen=False just makes it not bind).

As you can see, I have tried some tips that I found online, including:

  • Setting SO_LINGER options (CLOSE_TIMEOUT is 10 seconds)
  • Setting SO_SNDBUF options
  • Before closing the socket, I'm waiting 10 seconds. I expected that I would still see packets being received during these 10 seconds, but the receiving stops well before that timeout elapses and the subsequent socket.close() call happens.

The last point is done in the __exit__ method of the "transmitter class", which is used as a context manager:

    def __exit__(self, exc_type, exc_val, exc_tb):
        cleanup_grace_period = 10  # seconds
        logger.info("Waiting %d seconds for cleanup...", cleanup_grace_period)
        time.sleep(cleanup_grace_period)
        logger.info("Cleanup presumed to be complete; closing socket.")
        self.close()
        logger.info("Socket closed.")

Again, the receiving end stops receiving messages well before this __exit__ method is called.

What could cause a socket to stop sending data after a certain amount of data has been sent? (It always ends roughly around the same sequence number, but not exactly.)

Answer by 404usernamenotfound:

My problem appears to be related to buffer sizes.

An investigation revealed:

  • Not all packets were sent by the transmitter (confirmed with Wireshark).
  • Not all packets were received by the receiver (Wireshark).
  • Increasing MESSAGE_DELAY to 100 ms (i.e., waiting 100 ms after sending each packet before sending the next) caused all packets to be sent and received, but this is an unacceptable delay.
  • Gradually decreasing MESSAGE_DELAY increased the chance of the last packets being lost (never any in the middle).
  • Setting a send buffer size (SO_SNDBUF) caused all packets to be sent (Wireshark), but not all were received.
  • Setting a receive buffer size (SO_RCVBUF) on the receiver caused all packets to be received.
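The receiver-side fix described in the last bullet can be sketched as below. The 100 MB value mirrors the sender's SO_SNDBUF from the question and is an assumption, not a tested minimum; note that the kernel may cap the requested size (on Linux, via net.core.rmem_max), so it is worth reading the value back.

```python
import socket

recv_buffer_size = 1024 * 1024 * 100  # 100 MB, mirroring the sender's SO_SNDBUF
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, recv_buffer_size)

# The kernel may silently clamp the request; check what was actually granted.
actual = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
```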

Finally, I was able to reduce MESSAGE_DELAY and still receive all packets.