C# performance much faster when using anonymous types over named types


I have a file with 280,000 lines of data that I want to read, load into a DataTable, and eventually save. It works, but while tweaking the finished code I found a noticeable performance improvement when using an anonymous type instead of a named type. The small sample below demonstrates the difference I am seeing:

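            // Assumption: 'stop' is a Stopwatch started before the work begins (its declaration is not shown in the sample)
            var stop = System.Diagnostics.Stopwatch.StartNew();

            DataTable dataTable = new DataTable();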

            // One column for each property
            foreach (PropertyDescriptor property in TypeDescriptor.GetProperties(typeof(IISLog)))
                dataTable.Columns.Add(property.Name, Nullable.GetUnderlyingType(property.PropertyType) ?? property.PropertyType);

            var items = rows.Select(data => new //IISLog
            {
                Date = Convert.ToDateTime(data[0]),
                Time = data[1],
                ServerIPAddress = data[2],
                Method = data[3],
                URIStem = data[4],
                URIQuery = data[5],
                ServerPort = int.Parse(data[6]),
                UserName = data[7],
                ClientIPAddress = data[8],
                UserAgent = data[9],
                Referrer = data[10],
                ProtocolStatus = int.Parse(data[11]),
                ProtocolSubStatus = int.Parse(data[12]),
                Win32Status = long.Parse(data[13]),
                TimeTaken = int.Parse(data[14])
            }).ToList();

            Console.WriteLine("Elapsed: " + stop.ElapsedMilliseconds);

            foreach (var item in items)
            {
                dataTable.Rows.Add(item.Date, item.Time, item.ServerIPAddress, item.Method, item.URIStem,
                    item.URIQuery, item.ServerPort, item.UserName, item.ClientIPAddress, item.UserAgent,
                    item.Referrer, item.ProtocolStatus, item.ProtocolSubStatus, item.Win32Status, item.TimeTaken);
            }

            stop.Stop();
            Console.WriteLine("Elapsed: " + stop.ElapsedMilliseconds);

And here is the IISLog class:

    public class IISLog
    {
        public DateTime Date { get; set; }
        public string Time { get; set; }
        public string ServerIPAddress { get; set; }
        public string Method { get; set; }
        public string URIStem { get; set; }
        public string URIQuery { get; set; }
        public int ServerPort { get; set; }
        public string UserName { get; set; }
        public string ClientIPAddress { get; set; }
        public string UserAgent { get; set; }
        public string Referrer { get; set; }
        public int ProtocolStatus { get; set; }
        public int ProtocolSubStatus { get; set; }
        public long Win32Status { get; set; }
        public int TimeTaken { get; set; }
    }

If I run the code as is, the console prints an elapsed time of 1811ms at the first print and 2203ms when it finishes. If I uncomment "//IISLog" so that the projection creates the named type, the console prints 11807ms and 21939ms. The file contains IIS log data, so it has the following format:

    #Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken
    2024-01-05 04:27:03 1.1.1.1 GET / - 443 - 2.2.2.2 Mozilla/5.0+(compatible;+NetcraftSurveyAgent/1.0;+info@netcraft.com) - 302 0 0 5925
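The sample does not show how `rows` is produced. A minimal sketch, assuming each record is a single space-separated line with exactly 15 fields and that lines starting with `#` are directives to skip, could look like this. `IisLogReader` and `ParseRows` are hypothetical names, and the user agent in the sample line is shortened for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IisLogReader
{
    // Hypothetical helper: turns raw log lines into the string[] rows
    // that the Select projection indexes into (data[0] .. data[14]).
    public static IEnumerable<string[]> ParseRows(IEnumerable<string> lines)
    {
        return lines
            .Where(line => line.Length > 0 && line[0] != '#')  // skip blanks and directives such as #Fields:
            .Select(line => line.Split(' '))                   // IIS encodes spaces inside a field as '+'
            .Where(fields => fields.Length == 15);             // keep only complete records
    }
}

public static class Program
{
    public static void Main()
    {
        var lines = new[]
        {
            "#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status time-taken",
            "2024-01-05 04:27:03 1.1.1.1 GET / - 443 - 2.2.2.2 Mozilla/5.0+(compatible;+NetcraftSurveyAgent/1.0) - 302 0 0 5925"
        };

        List<string[]> rows = IisLogReader.ParseRows(lines).ToList();

        Console.WriteLine(rows.Count);   // 1
        Console.WriteLine(rows[0][6]);   // 443
        Console.WriteLine(rows[0][14]);  // 5925
    }
}
```

For a real file, `File.ReadLines(path)` could be passed in place of the in-memory array so the lines are streamed rather than loaded all at once.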

Again, this is only sample code to highlight what I'm seeing, but it does reproduce the performance issue. I'm curious why the anonymous types are so much faster to create, and why iterating over them when adding the values to the DataTable is also dramatically faster.

EDIT: Removed nullable int types from class

EDIT 2: I have tested the code in isolation in a new .NET Framework 4.8 project, and both versions perform the same. This points to something in the original project that is causing the issue.
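An isolated comparison like the one described in EDIT 2 can be sketched as follows. This is only an illustration under stated assumptions: `IISLogSample` is a trimmed, hypothetical stand-in for `IISLog` with three of the fifteen properties, the rows are synthetic rather than read from a file, and `ProjectionBenchmark` and its methods are made-up names. Each projection runs once untimed so that JIT compilation is not included in the measurement.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

// Trimmed, hypothetical stand-in for the IISLog class.
public class IISLogSample
{
    public DateTime Date { get; set; }
    public string Method { get; set; }
    public int ServerPort { get; set; }
}

public static class ProjectionBenchmark
{
    // Synthetic input so the timing does not depend on file I/O.
    public static string[][] MakeRows(int count)
    {
        return Enumerable.Range(0, count)
            .Select(i => new[] { "2024-01-05", "GET", "443" })
            .ToArray();
    }

    // Projects into an anonymous type and returns the number of items created.
    public static int ProjectAnonymous(string[][] rows)
    {
        return rows.Select(d => new
        {
            Date = Convert.ToDateTime(d[0]),
            Method = d[1],
            ServerPort = int.Parse(d[2])
        }).ToList().Count;
    }

    // Same projection, but into the named type.
    public static int ProjectNamed(string[][] rows)
    {
        return rows.Select(d => new IISLogSample
        {
            Date = Convert.ToDateTime(d[0]),
            Method = d[1],
            ServerPort = int.Parse(d[2])
        }).ToList().Count;
    }

    // Runs the action once untimed (warm-up), then times a second run.
    public static long TimeMs(Action action)
    {
        action();
        Stopwatch sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        string[][] rows = MakeRows(280000);
        Console.WriteLine("Anonymous: " + TimeMs(() => ProjectAnonymous(rows)) + " ms");
        Console.WriteLine("Named:     " + TimeMs(() => ProjectNamed(rows)) + " ms");
    }
}
```

If the two timings are close here but diverge in the original project, that is consistent with the difference coming from the project or its environment rather than from the types themselves.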
