I have a stored procedure which gives me a result set consisting of a single column containing millions of unprocessed rows. I need to transfer this data to another server using SqlBulkCopy, but the problem is that I can't simply do the following:
using (var con = new SqlConnection(sqlConnectionStringSource))
{
    using (var cmd = new SqlCommand("usp_GetUnprocessedData", con))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        con.Open();
        using (var reader = cmd.ExecuteReader())
        {
            using (var sqlBulk = new SqlBulkCopy(sqlConnectionStringDestination))
            {
                sqlBulk.DestinationTableName = "BulkCopy";
                sqlBulk.BulkCopyTimeout = 0;
                sqlBulk.BatchSize = 200000;
                sqlBulk.WriteToServer(reader);
            }
        }
    }
}
because the data wouldn't be processed at all.
In my case, the nth row of the result set looks like this:
value1_n,value2_n,value3_n
where n is just a subscript I've introduced to distinguish between the various rows.
In the destination table, which I've named BulkCopy, I'd like to have:
╔══════════╦══════════╦══════════╗
║ Field1 ║ Field2 ║ Field3 ║
╠══════════╬══════════╬══════════╣
║ Value1_1 ║ Value2_1 ║ Value3_1 ║
║ Value1_2 ║ Value2_2 ║ Value3_2 ║
║ ... ║ ... ║ ... ║
║ Value1_n ║ Value2_n ║ Value3_n ║
╚══════════╩══════════╩══════════╝
I was told to use a custom DataReader, i.e. an implementation of the IDataReader interface, in order to process the data row by row before SqlBulkCopy copies it, with the EnableStreaming property set to true to ensure that only a small amount of data is held in memory, but I have no idea where to start.
Can you help me, please?
Let's reverse the problem. Instead of finding a generic solution, create one specific to this problem. Having spent days creating an IDataReader wrapper, I know it's not that trivial.
We know how many fields there are, and we don't care about any other fields in the results. Instead of trying to correctly implement an IDataReader wrapper, we can create an iterator method that splits the data and returns the records one by one, in a streaming fashion. FastMember's ObjectReader can wrap an IDataReader interface over any IEnumerable:
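A minimal sketch (MyDTO, ReaderToStream and the fixed three-field split are my assumptions based on the sample rows in the question):

using System.Collections.Generic;
using System.Data;

class MyDTO
{
    public string Field1 { get; set; }
    public string Field2 { get; set; }
    public string Field3 { get; set; }
}

static IEnumerable<MyDTO> ReaderToStream(IDataReader reader)
{
    while (reader.Read())
    {
        // Each source row is a single CSV column: "value1,value2,value3"
        var fields = reader.GetString(0).Split(',');
        yield return new MyDTO
        {
            Field1 = fields[0],
            Field2 = fields[1],
            Field3 = fields[2]
        };
    }
}

The importing method could change to:

using (var con = new SqlConnection(sqlConnectionStringSource))
{
    using (var cmd = new SqlCommand("usp_GetUnprocessedData", con))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        con.Open();
        using (var reader = cmd.ExecuteReader())
        // ObjectReader comes from the FastMember NuGet package
        using (var rd = ObjectReader.Create(ReaderToStream(reader),
                                            "Field1", "Field2", "Field3"))
        using (var sqlBulk = new SqlBulkCopy(sqlConnectionStringDestination))
        {
            sqlBulk.DestinationTableName = "BulkCopy";
            sqlBulk.BulkCopyTimeout = 0;
            sqlBulk.BatchSize = 200000;
            sqlBulk.EnableStreaming = true;
            sqlBulk.WriteToServer(rd);
        }
    }
}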
The iterator calls Read() only when SqlBulkCopy requests a new record, so we don't end up loading everything in memory.

And the IDataReader wrapper
ReSharper and Visual Studio 2019 offer to implement an interface by delegating calls to a wrapped class. In Visual Studio 2019 this refactoring is called Implement interface through 'field_name'. Starting with this code:
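A minimal starting point; the class and field names are placeholders (this doesn't compile yet, which is exactly what triggers the refactoring offer):

class ReaderWrapper : IDataReader
{
    private readonly IDataReader _inner;

    public ReaderWrapper(IDataReader inner) => _inner = inner;
}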
Applying the refactoring gives:
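The generated class, abbreviated here because the delegating members all follow the same pattern:

class ReaderWrapper : IDataReader
{
    private readonly IDataReader _inner;

    public ReaderWrapper(IDataReader inner) => _inner = inner;

    public object this[int i] => _inner[i];
    public object this[string name] => _inner[name];
    public int Depth => _inner.Depth;
    public bool IsClosed => _inner.IsClosed;
    public int RecordsAffected => _inner.RecordsAffected;
    public int FieldCount => _inner.FieldCount;
    public void Close() => _inner.Close();
    public void Dispose() => _inner.Dispose();
    public bool NextResult() => _inner.NextResult();
    public bool Read() => _inner.Read();
    public string GetName(int i) => _inner.GetName(i);
    public int GetOrdinal(string name) => _inner.GetOrdinal(name);
    public object GetValue(int i) => _inner.GetValue(i);
    public string GetString(int i) => _inner.GetString(i);
    public bool IsDBNull(int i) => _inner.IsDBNull(i);
    public DataTable GetSchemaTable() => _inner.GetSchemaTable();
    // ... the remaining IDataRecord members (GetInt32, GetDateTime,
    // GetValues, etc.) delegate to _inner in exactly the same way.
}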
To create a splitting wrapper, we need to replace Read() with our own version:
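A sketch of the replacement, keeping the split values in a field:

private string[] _values;

public bool Read()
{
    var hasRow = _inner.Read();
    if (hasRow)
    {
        // Split the single CSV column into its individual values
        _values = _inner.GetString(0).Split(',');
    }
    return hasRow;
}

This splits the CSV values and stores them in a string array. It also shows why implementing the wrapper is a bit of a bother - we need to handle quite a few things and decide what to do in unexpected situations like nulls, empty strings, etc.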
After that, we need to add our own implementations for the members called by SqlBulkCopy. GetValue() is definitely called, and so is FieldCount:
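Assuming exactly three fields per record, as in the question:

public int FieldCount => 3;

public object GetValue(int i) => _values[i];

Other members are called based on the column mapping type, by name or by ordinal.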
And now the "funny" parts. What about GetName()? Probably:
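Assuming the destination columns are named Field1 through Field3, as in the BulkCopy table above:

public string GetName(int i) => $"Field{i + 1}";

GetOrdinal? It may be called when mapping by name. Getting tricky:

// "Field1" -> 0, "Field2" -> 1, ... by parsing the digits after "Field"
public int GetOrdinal(string name) => int.Parse(name.Substring(5)) - 1;

Let's hope this works.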
We also need to override the indexers:
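Both indexers should now read from the split values instead of delegating to _inner:

public object this[int i] => _values[i];

public object this[string name] => _values[GetOrdinal(name)];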
What did I forget? ... We still need to handle arbitrary numbers of values, and nulls. There's no GetSchemaTable, which probably means the column mappings will have to be specified explicitly, probably by ordinal. A quick & dirty IsDBNull implementation could be:
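Treating a missing array or a null slot as NULL (an assumption; the source data may need different semantics):

public bool IsDBNull(int i) => _values == null || _values[i] == null;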
GetSchemaTable is harder, because we don't really know how many values are in each record. The table has 20+ columns, so I'd rather not write that code until I see that it's needed. Leave it as an exercise to the reader, as they say.

PPS: Default interface implementations, because why not
All this is probably a nice, if convoluted, case where C# 8's default interface methods could be used to create a wrapped-reader trait: by default, defer to the wrapped inner reader. This would eliminate all the delegating calls in the implementation.
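A sketch of what such a trait might look like; given the caveat below, treat it as untested:

interface IReaderWrapper : IDataReader
{
    // The only member an implementation has to provide
    IDataReader Inner { get; }

    // Default implementations defer to the wrapped reader
    bool IDataReader.Read() => Inner.Read();
    int IDataRecord.FieldCount => Inner.FieldCount;
    object IDataRecord.GetValue(int i) => Inner.GetValue(i);
    string IDataRecord.GetName(int i) => Inner.GetName(i);
    // ... and so on for the remaining members
}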
This feature doesn't work in the C# 8 compiler that came with VS 2019, and it somehow crashes Sharplab.io, so I have no idea whether it will compile or whether the overrides are really needed.