We have a need to greatly reduce the bandwidth our back-end services use when pulling and pushing data to sql. The TDS Stream used by SqlClient is fairly bloated. For years, people have requested a compression option when pulling from sql, but Microsoft has not added it.
I wanted to see if anyone had any thought on the best way to handle this. Here is what I've tried so far:
I modified https://github.com/MindFlavor/TDSBridge to add compression and decompression right at the socket layer. Because the payload is SSL encrypted, it didn't make much difference.
Next I took the IDataReader to Protobuf library found: https://github.com/dotarj/protobuf-net-data and TCP framework found at https://github.com/jchristn/WatsonTcp in an attempt to create a client server proxy of sorts to stream the IDataReader over the wire by converting it to protobuf, then compressing this stream, and doing the opposite on the other end.
I got a proof of concept to work here, and actually got between 84% and 98% reduction in raw bytes over the network vs plain TDS Streams. The downside is the WatsonTcp wants you to pass in the content length when assigning the stream. But there is no way to know that until you create the whole protobuf stream. We sometimes transfer hundreds of gigs in one swoop, so that won't work.
I didn't see how protobuf-net-data could stream over grpc, and even if it could, I fear the granular nature of the records in an IAsyncEnumerable may slow down a large transfer.
No doubt I can sit and write a fully custom raw socket compressed protobuf over TCP streaming implementation with the surface area for the clients being close to the SqlCommand, I just know this is notoriously difficult to get right.
Any time saving ideas? If not maybe I'll make an open source project out of it.
Here's a pattern you can use to take a large query and transmit it as a series of batches, where each batch is a compressed, binary-serialized DataTable. After transmission and deserialization each DataTable can be directly used by SqlBulk Copy. The same pattern can work with other formats, but would require an additional converter before passing to SqlBulkCopy.