strace and tcpdump - data sent in strace shows 000000174 but data in tcpdump shows 00000174 (one zero missing)


I am trying to debug an issue with an application that intermittently has problems. The applications communicate with one another via web sockets.

The client normally sends control code 000000174.A.EXAMP to the server end.

I can see that the client and server both have the same control code when running strace. Here is the strace snippet from the client, which shows the correct code being sent.

3651  23:13:51 sendto(117, "0", 1, 0, NULL, 0) = 1
3651  23:13:51 sendto(117, "0", 1, 0, NULL, 0) = 1
3651  23:13:51 sendto(117, "0", 1, 0, NULL, 0) = 1
3651  23:13:51 sendto(117, "0", 1, 0, NULL, 0) = 1
3651  23:13:51 sendto(117, "0", 1, 0, NULL, 0) = 1
3651  23:13:51 sendto(117, "0", 1, 0, NULL, 0) = 1
3651  23:13:51 sendto(117, "1", 1, 0, NULL, 0) = 1
3651  23:13:51 sendto(117, "7", 1, 0, NULL, 0) = 1
3651  23:13:51 sendto(117, "4", 1, 0, NULL, 0) = 1

Here is the strace output on the server end:

00:16:36 recvfrom(11, "00000017", 8, 0, NULL, NULL) = 8

When the problem occurs, the server receives the code 00000174.A.EXAMP, which is one zero short. At the moment, there is only a script on the server that logs the strace output to a file. It writes 5-minute logs and rotates them, keeping a maximum of three. The script stops when it detects an error on the server (when the server has made no requests for some duration).

My question is about the packet trace I am getting from tcpdump. Nearly all the captured packets have the data as 00000174.A.EXAMP, which is one zero short.

Why does the strace output show six zeroes while the packet capture shows mostly five?

Out of the several thousand packets exchanged, several have missing characters when I do a string search. Here are the different contents of the data packets captured with tcpdump:

00000174.A
00000174.A.
00000174.A.E
00000174.A.EX
00000174.A.EXA
00000174.A.EXAM
00000174.A.EXAMP (most come out like this, with one zero missing)
0000174.A.EXAMP
000174.A.EXAMP
00174.EXAMP

IP packets are 1500 bytes in size, so I wouldn't expect the data to be split into multiple packets, since the control data is very small. Why are the data packets missing characters?

The Java code that sends the data is d3sOut.writeBytes(slen + outBuffer). At the other end is a C web socket.


There are 2 best solutions below

5
SKi On

Because the client seems to use one sendto() call for each byte of 000000174.A.EXAMP, it is typical that the 1st byte is transferred in its own IP packet, and the rest of the bytes are transferred after that.

This is a result of TCP's "Nagle's algorithm":

When the 1st sendto() call is made, TCP can send the 1st byte immediately, because there is no unacknowledged data in flight.

When the remaining sendto() calls are made, there is unacknowledged data in flight (the 1st byte), so the TCP stack queues the data in its buffer for a while before sending several buffered bytes in a single IP packet.

Please double-check that you didn't miss or ignore the IP packet with the 1st byte.

To avoid inefficient small IP packets, try to use one sendto() call for the whole message instead of one sendto() per byte.
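
On the Java side, a likely source of the per-byte sendto() calls is writeBytes itself. This assumes d3sOut is a java.io.DataOutputStream sitting directly on the socket's output stream, which the question does not show. In the OpenJDK implementation, DataOutputStream.writeBytes passes the string to the underlying stream one byte at a time, so an unbuffered socket stream sees one write per character. The sketch below uses a counting stream as a stand-in for the socket (WriteCountDemo and CountingStream are made-up names for illustration) and compares the unbuffered case with a BufferedOutputStream wrapper:

```java
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

final class WriteCountDemo {
    /** Stand-in for a socket stream: counts how many write calls reach it. */
    static final class CountingStream extends OutputStream {
        int calls = 0;

        @Override
        public void write(int b) {
            calls++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            calls++;
        }
    }

    /** writeBytes directly on the sink, as the question's code may be doing. */
    static int unbufferedCalls(String msg) {
        CountingStream sink = new CountingStream();
        try {
            DataOutputStream out = new DataOutputStream(sink);
            out.writeBytes(msg);
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.calls;
    }

    /** Same call, but with a BufferedOutputStream between writer and sink. */
    static int bufferedCalls(String msg) {
        CountingStream sink = new CountingStream();
        try {
            DataOutputStream out =
                    new DataOutputStream(new BufferedOutputStream(sink));
            out.writeBytes(msg);
            out.flush(); // pushes the whole buffered message down in one write
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.calls;
    }

    public static void main(String[] args) {
        String msg = "000000174.A.EXAMP";
        System.out.println("unbuffered writes: " + unbufferedCalls(msg));
        System.out.println("buffered writes:   " + bufferedCalls(msg));
    }
}
```

If that assumption holds for your client, wrapping the socket stream in a BufferedOutputStream and calling flush() once per message should hand the whole message to the kernel in one call. It does not change how the receiver must read the data.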

0
Stephen C On

As @SKi explains, it is likely that the TCP packets are not preserving your application's implied message structure. (We can't be sure, because you didn't show us any code, and your evidence is incomplete.)

Why? Well, the TCP protocol and the stream socket abstraction don't make any guarantees that bytes will be assembled into network packets as you are apparently expecting. Indeed, since you are apparently performing multiple sendto calls to write a single application message, it is possible that the sender side is launching a TCP packet before it has been given a complete message. (It will depend on the sender's TCP stack's implementation, configuration parameters and also on network conditions.)

But the fix is more complicated than simply writing complete messages with a single sendto. That will help, but it is not a guaranteed fix. TCP doesn't know anything about the application message structure, and it doesn't guarantee to preserve it.

The real solution is to design your application protocol so that its message structure is inherent in the data. For example:

  • A message could consist of a fixed number of bytes.

  • A message could start with a specific start byte value, and end with another byte value.

  • A message could start with a message length encoded as one or more bytes, and then be followed by that precise number of bytes.

But in all cases, the receiving application code cannot assume that recvfrom is delivering a single complete message. Rather, it is responsible for correctly splitting the incoming stream of bytes into messages. Yes ... it is more complicated ... but you can't "punt" the problem to the TCP stack. TCP cannot do it for you.
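
As a rough illustration of the third option (a length prefix), here is a minimal Java sketch. The wire format is an assumption made for the example, a 4-byte big-endian length followed by an ASCII payload, and it is not your application's actual protocol. LengthPrefixedFraming and oneByteAtATime are hypothetical names. The point is that decode keeps reading until it has the whole message, so it gives the same result no matter how the bytes were split across packets or read calls:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class LengthPrefixedFraming {
    /** Encodes one message as a 4-byte big-endian length plus the payload. */
    static byte[] encode(String message) {
        byte[] payload = message.getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try {
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(payload.length);
            out.write(payload);
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** Reads exactly one message, however the incoming bytes are chunked. */
    static String decode(InputStream in) {
        try {
            DataInputStream data = new DataInputStream(in);
            int length = data.readInt();   // blocks until 4 bytes arrive
            byte[] payload = new byte[length];
            data.readFully(payload);       // loops over partial reads
            return new String(payload, StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Simulates a socket that returns at most one byte per bulk read. */
    static InputStream oneByteAtATime(byte[] bytes) {
        return new ByteArrayInputStream(bytes) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, 1));
            }
        };
    }

    public static void main(String[] args) {
        InputStream in = oneByteAtATime(encode("000000174.A.EXAMP"));
        System.out.println(decode(in));
    }
}
```

A receiver written in C needs the same kind of loop: keep calling recv until the full header has arrived, then keep calling it until the full body has arrived, instead of treating each return value as one message.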