TCPClient Reading Stream


I have a Netty game server that sends JSON messages with a null delimiter. I had an AS3 client communicating correctly with that server, but our new Unity client can't communicate properly, probably due to the order of the messages being sent. Sometimes I get exceptions when parsing JSON strings, especially with long messages. I have tried different buffer sizes but nothing changed.

try {
    _tcpClient = new TcpClient();
    await _tcpClient.ConnectAsync(_host, _port);
    _isListening = true;

    Debug.Log("Connected...");
    UnityMainThreadDispatcher.Instance().Enqueue(DispatchConnected());

    Byte[] bytes = new Byte[BufferSize];
    StringBuilder partialMessage = new();
    while (_isListening && !_stopRequested) {
        if (_tcpClient != null && _tcpClient.Connected) {
            using (NetworkStream stream = _tcpClient.GetStream()) {
                if (stream.CanRead) {
                    try {
                        int bytesRead;
                        while ((bytesRead = stream.Read(bytes, 0, BufferSize)) > 0) {
                            string bufferMessage = Encoding.UTF8.GetString(bytes, 0, bytesRead);

                            // Append the buffer to the existing partial message
                            partialMessage.Append(bufferMessage);

                            // Check if the partial message contains the termination character
                            int terminateIndex;
                            while ((terminateIndex = partialMessage.ToString().IndexOf(TerminateCharacter)) != -1) {
                                string completeMessage = partialMessage.ToString(0, terminateIndex);
                                Debug.Log("R: " + completeMessage);
                                UnityMainThreadDispatcher.Instance().Enqueue(DispatchServerMessage(completeMessage, true)); // <-- This is where I convert to JSON

                                // Remove the processed portion from the partial message
                                partialMessage.Remove(0, terminateIndex + 1);
                            }  
                        }
                    }
                    catch (IOException ioException) {
                        Debug.LogError($"IOException: {ioException.Message}");
                    }
                    catch (Exception exception) {
                        Debug.LogError(exception);
                    }
                }

                
            }
        }
        else {
            Debug.Log("TCP Client is not connected!");
            ClientDisconnected();
            break; // Break out of the loop when the client is not connected
        }
    }

    // Process any remaining partial message after the loop
    if (partialMessage.Length > 0) {
        UnityMainThreadDispatcher.Instance().Enqueue(DispatchServerMessage(partialMessage.ToString(), true));
        partialMessage.Clear();
    }
}
catch (SocketException socketException) {
    if (socketException.ErrorCode == 10061) {
        // Debug.LogError("Connection refused!!!");
        UnityMainThreadDispatcher.Instance().Enqueue(DispatchConnectionRefused());
    }
    else {
        UnityMainThreadDispatcher.Instance().Enqueue(DispatchConnectionInterrupted());
    }
}
catch (IOException ioException) {
    UnityMainThreadDispatcher.Instance().Enqueue(DispatchConnectionInterrupted());
    // UnityMainThreadDispatcher.Instance().Enqueue(DispatchConnectionRefused());
}
finally
{
    _stopRequested = true; // Ensure the thread stops even if an exception occurs
    _tcpClient?.Close();
    _clientReceiveThread = null;
}

When I look at the errors, I can see that SOME of the messages arrive out of order. For example, let's say the server sends three messages with a null delimiter, Hello\0Socket\0World; my client receives Socket -> Hello -> World.

Is there a better way to handle JSON messages? What could be wrong here? If it was a server issue, the AS3 client would also have errors.

Below is the netty initializer code.

ChannelPipeline pipeline = socketChannel.pipeline();
pipeline.addLast("timeout", new IdleStateHandler(ServerSettings.MAX_IDLE_TIME_IN_SECONDS, 0, ServerSettings.MAX_IDLE_TIME_IN_SECONDS));
pipeline.addLast(new DelimiterBasedFrameDecoder(1024 * 1024, Delimiters.nulDelimiter()));
pipeline.addLast(new StringDecoder(CharsetUtil.UTF_8));// (2)
pipeline.addLast(new StringEncoder(CharsetUtil.UTF_8)); // (1)
pipeline.addLast(new SimpleTCPHandler()); // (3)

Thanks

VonC On BEST ANSWER

Based on the comments, you need to improve:

  1. Proper multi-byte character handling:

    Decoding issues can come from a multi-byte character being split across buffer boundaries. Make sure the buffer does not inadvertently split a character, which could corrupt the message and complicate finding the terminator. That might involve inspecting the last few bytes of the buffer to make sure they do not start a multi-byte character without finishing it.

    // Assuming UTF-8 encoding
    bool IsUtf8ContinuationByte(byte b) {
        // Continuation bytes have the bit pattern 10xxxxxx; ASCII bytes and lead bytes do not
        return (b & 0xC0) == 0x80;
    }
    
    // Returns true when the last byte read is a continuation byte, meaning the buffer ends
    // inside (or exactly at the end of) a multi-byte character
    bool MightSplitMultiByteCharacter(byte[] bytes, int bytesRead) {
        if (bytesRead == 0) return false;
        return IsUtf8ContinuationByte(bytes[bytesRead - 1]);
    }
    
    // You might need to adjust your reading logic to account for this possibility
    
  2. Efficient message parsing: The strategy of reading into a buffer and then parsing for message terminators needs refinement to make sure it handles edge cases, like partial messages or messages that span multiple buffers.
    If messages are consistently larger than the current buffer, consider adjusting the buffer size dynamically or making sure that your logic can handle messages that span multiple reads.

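    // Assuming BufferSize is appropriately sized and TerminateCharacter is defined
    int carried = 0; // bytes of an incomplete character held over from the previous read
    while (_isListening && !_stopRequested && stream.CanRead) {
        int bytesRead = stream.Read(bytes, carried, BufferSize - carried);
        if (bytesRead <= 0) break; // 0 means the remote side closed the connection
        int total = carried + bytesRead;
    
        // Walk back over trailing continuation bytes to the first byte of the last character
        int lead = total - 1;
        while (lead > 0 && MightSplitMultiByteCharacter(bytes, lead + 1)) {
            lead--;
        }
    
        // The lead byte announces the character length: 110xxxxx = 2, 1110xxxx = 3, 11110xxx = 4
        byte b = bytes[lead];
        int needed = 1;
        if ((b & 0xE0) == 0xC0) needed = 2;
        else if ((b & 0xF0) == 0xE0) needed = 3;
        else if ((b & 0xF8) == 0xF0) needed = 4;
    
        // Decode only whole characters; keep an incomplete tail for the next read instead of dropping it
        int endIndex = (total - lead < needed) ? lead : total;
        partialMessage.Append(Encoding.UTF8.GetString(bytes, 0, endIndex));
        carried = total - endIndex;
        Array.Copy(bytes, endIndex, bytes, 0, carried);
    
        ProcessMessages(partialMessage); // Process complete messages within partialMessage
    }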
    
    // This function handles every complete message and removes it from the StringBuilder
    void ProcessMessages(StringBuilder partialMessage) {
        int terminateIndex;
        while ((terminateIndex = partialMessage.ToString().IndexOf(TerminateCharacter)) != -1) {
            string completeMessage = partialMessage.ToString(0, terminateIndex);
            Debug.Log("Received: " + completeMessage);
            // Dispatch the message for further processing
            partialMessage.Remove(0, terminateIndex + 1);
        }
    }
    

That aligns with this answer, which suggests a DIY solution.
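
As a complementary sketch of the two points above (not code from the original post): because the byte 0x00 never occurs inside a multi-byte UTF-8 sequence, one option is to look for the terminator in the raw bytes and decode only whole messages. A character split across two reads then cannot be corrupted, since decoding happens after the message is complete. The class name NullDelimitedFramer and its Feed method are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: frames a null-delimited byte stream into UTF-8 strings.
public sealed class NullDelimitedFramer
{
    // Bytes of the message currently being assembled (may span several reads).
    private readonly List<byte> _pending = new List<byte>();

    // Pass the buffer and byte count from one Read call.
    // Returns every message that those bytes completed, in arrival order.
    public List<string> Feed(byte[] buffer, int count)
    {
        var messages = new List<string>();
        for (int i = 0; i < count; i++)
        {
            if (buffer[i] == 0)
            {
                // Terminator reached: the pending bytes are one whole message,
                // so decoding cannot cut a multi-byte character in half.
                messages.Add(Encoding.UTF8.GetString(_pending.ToArray()));
                _pending.Clear();
            }
            else
            {
                _pending.Add(buffer[i]);
            }
        }
        return messages;
    }
}
```

In the receive loop this would be called as framer.Feed(bytes, bytesRead) after each stream.Read, with each returned string dispatched to the main thread. Bytes that follow the last terminator stay inside the framer until a later read completes them.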


My TerminateCharacter is \0. If the sent buffer ends with an extra character, 'a' for instance, the last bytes will be \0\u0061.
What will happen to the extra 'a' character in this solution? Won't that character be removed from the rest of the buffer?

I am also confused about the return (b & 0xC0) == 0x80; part.
Will that work with my terminate character?

Regarding handling the extra character 'a' after the termination character:
When the buffer ends with a termination character followed by an extra character a (e.g., the bytes are \0 followed by the ASCII representation of a), the a character is preserved for the next message processing cycle. The ProcessMessages function processes up to the termination character, removes the processed message including the termination character from partialMessage, and any subsequent characters (like a) remain in partialMessage for processing during the next cycle.
That means the extra a character will not be removed or lost; instead, it will be the starting point of the next message to be processed.
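
The behaviour described above can be checked with a small standalone sketch. It mirrors the ProcessMessages loop but returns the messages instead of dispatching them; the names TerminatorDemo and ExtractMessages are hypothetical, and the terminator is assumed to be the char '\0'.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class TerminatorDemo
{
    // Same loop shape as ProcessMessages, returning the complete messages.
    public static List<string> ExtractMessages(StringBuilder partialMessage)
    {
        var messages = new List<string>();
        int terminateIndex;
        // IndexOf(char) performs an ordinal search for the terminator.
        while ((terminateIndex = partialMessage.ToString().IndexOf('\0')) != -1)
        {
            messages.Add(partialMessage.ToString(0, terminateIndex));
            // Remove the message and its terminator only; later characters stay.
            partialMessage.Remove(0, terminateIndex + 1);
        }
        return messages;
    }
}
```

Starting from a StringBuilder holding "Hello\0a", ExtractMessages returns the single message "Hello" and leaves "a" in the builder. If the next read appends "bc\0", a second call returns "abc".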

The expression return (b & 0xC0) == 0x80 is used to identify continuation bytes in UTF-8 encoded characters. UTF-8 characters can range from 1 to 4 bytes, where the first byte indicates the number of bytes in the character, and subsequent bytes (continuation bytes) follow the pattern 10xxxxxx. That method checks if a byte is a continuation byte by masking it with 0xC0 (which isolates the two most significant bits) and comparing the result with 0x80. If the result is true, the byte is a continuation byte, meaning it is part of a multi-byte character that started in previous bytes.

That check is important for making sure that you do not split a multi-byte character across reads from the buffer. However, it does not directly impact the handling of your termination character \0. The termination character \0 (null character) is a single-byte character in UTF-8, not a continuation byte, so this specific check ((b & 0xC0) == 0x80) is not used to identify or handle the termination character itself. Instead, it helps prevent the accidental splitting of multi-byte characters that could corrupt the parsing of your UTF-8 encoded messages.
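
To make the bit test concrete, here is a minimal sketch that classifies individual bytes. Utf8ByteKinds and Describe are hypothetical names; the continuation check is the same expression as in the answer, and the lead-byte check relies on the fact that bytes of the form 11xxxxxx start a multi-byte sequence in well-formed UTF-8.

```csharp
using System;
using System.Text;

public static class Utf8ByteKinds
{
    // 10xxxxxx: a byte in the middle or at the end of a multi-byte character.
    public static bool IsContinuationByte(byte b) => (b & 0xC0) == 0x80;

    // Labels a byte by its role in a UTF-8 stream.
    public static string Describe(byte b)
    {
        if (b < 0x80) return "single-byte";          // 0xxxxxxx, includes the terminator 0x00
        if (IsContinuationByte(b)) return "continuation";
        return "lead";                               // 11xxxxxx, starts a multi-byte character
    }
}
```

For the string "é\0a", whose UTF-8 bytes are C3 A9 00 61, Describe yields lead, continuation, single-byte, single-byte. The terminator therefore never matches the continuation check.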