I tried to reproduce the thread wait times from Context Switch events with TraceProcessing. The most simplistic approach would be to sum up the wait times of all threads in all processes. Normally I want this for a specific thread, but to show the issue, here is the simplest code:
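using Microsoft.Windows.EventTracing;
using Microsoft.Windows.EventTracing.Cpu;

using ITraceProcessor processor = TraceProcessor.Create(myEtlFile, new TraceProcessorSettings
{
    AllowLostEvents = true,
});
IPendingResult<IContextSwitchDataSource> myContextSwitchData = processor.UseContextSwitchData();
processor.Process();

// Sum the wait time of every switch-in event across all threads and processes.
double WaitDurationInMs = 0; // must be initialized before it is used with +=
foreach (IContextSwitch cSwitch in myContextSwitchData.Result.ContextSwitches)
{
    IContextSwitchIn switchin = cSwitch.SwitchIn;
    if (switchin.WaitTime.HasValue)
    {
        // Explicit cast keeps the sum a double whatever numeric type the library returns.
        WaitDurationInMs += (double)switchin.WaitTime.Value.TotalMilliseconds;
    }
}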
But the summed switchin.WaitTime value is nowhere near any thread wait time shown by WPA. For a given thread, how can I get its
- Wait Time (Thread was blocked)
- Ready Time (Thread was waiting to run in ready queue)
- CPU Time (Thread was running on one CPU)
An example of how to do that would be nice. Also, the Context Switch event counts in WPA and TraceProcessor seem to be quite a bit off. I guess I need to know some internals about how the events need to be correlated.

I have found the corresponding entry. You need to call the extension method .UseCpuSchedulingData() on an ITraceProcessor instance. Its ThreadActivity then gives you the CPU, ready, and wait data along with call stacks, just like in WPA.
I have prepared a little sample which reads an ETL file and prints, per process ID, the CPU and wait time summed across all threads. The result is nicely in sync with the output of WPA.
Below is the source code. You need to add a reference to the NuGet package Microsoft.Windows.EventTracing.Processing.All, which does not show up itself but resolves to a bunch of dependent NuGet packages.
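A minimal sketch of such a sample might look like the following. It assumes that each ICpuThreadActivity entry exposes Duration, ReadyDuration, and WaitingDuration (the latter two nullable) and that IProcess.Id is an int. The class and method names (SchedulingSummary, PrintPerProcess) are made up for illustration, so check the property names against the package version you reference.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.Windows.EventTracing;
using Microsoft.Windows.EventTracing.Cpu;

// Hypothetical helper class; only the TraceProcessor calls come from the library.
public static class SchedulingSummary
{
    public static void PrintPerProcess(string etlFile)
    {
        using ITraceProcessor processor = TraceProcessor.Create(etlFile, new TraceProcessorSettings
        {
            AllowLostEvents = true,
        });
        IPendingResult<ICpuSchedulingDataSource> scheduling = processor.UseCpuSchedulingData();
        processor.Process();

        // Process ID -> summed CPU, ready, and wait time in milliseconds.
        var totals = new Dictionary<int, (double CpuMs, double ReadyMs, double WaitMs)>();

        foreach (ICpuThreadActivity slice in scheduling.Result.ThreadActivity)
        {
            if (slice.Process == null)
            {
                continue; // no owning process known for this slice
            }

            // Assumed property names; explicit casts keep the sums in double.
            double cpuMs = (double)slice.Duration.TotalMilliseconds;
            double readyMs = slice.ReadyDuration.HasValue
                ? (double)slice.ReadyDuration.Value.TotalMilliseconds
                : 0.0;
            double waitMs = slice.WaitingDuration.HasValue
                ? (double)slice.WaitingDuration.Value.TotalMilliseconds
                : 0.0;

            // A missing key yields the default tuple (0, 0, 0).
            totals.TryGetValue(slice.Process.Id, out var sum);
            totals[slice.Process.Id] =
                (sum.CpuMs + cpuMs, sum.ReadyMs + readyMs, sum.WaitMs + waitMs);
        }

        foreach (var entry in totals.OrderBy(e => e.Key))
        {
            Console.WriteLine(
                $"PID {entry.Key,6}: CPU {entry.Value.CpuMs:F1} ms, " +
                $"Ready {entry.Value.ReadyMs:F1} ms, Wait {entry.Value.WaitMs:F1} ms");
        }
    }
}
```

Calling SchedulingSummary.PrintPerProcess with the path to an ETL file prints one line per process ID. To get the numbers for a single thread instead, filter the slices on slice.Thread before summing.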
This works perfectly and it parses things very efficiently. Behind the scenes it uses the ContextSwitch data source, but you need to know quite a bit of internals to understand it.
A great trick of this library is that it groups all Context Switch events by processor. A Context Switch always happens on one processor, so you can process all Context Switch events sorted by time and grouped by CPU. Internally it uses a clever combination built around a MinHeap, which contains the groupings.
Once a new processor is reached, the MinHeap enumerates all events and creates a tree of data, which is then consumed sorted by time until no events are left. Then the next processor grouping is "expanded".
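The library's own MinHeap is not reproduced here. As a generic illustration of how a min-heap can consume per-processor groupings in time order, the following sketch does a k-way merge with .NET's PriorityQueue (.NET 6 or later). SwitchEvent, PerCpuMerge, and MergeByTime are hypothetical names, and this is a different, simpler technique than whatever the library does internally.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical event type for illustration; not a TraceProcessor type.
public readonly record struct SwitchEvent(int Cpu, long TimestampNs);

public static class PerCpuMerge
{
    // Merges per-CPU event lists (each already sorted by time)
    // into a single stream ordered by timestamp.
    public static IEnumerable<SwitchEvent> MergeByTime(
        IReadOnlyList<IReadOnlyList<SwitchEvent>> perCpu)
    {
        // One cursor per CPU, keyed by the timestamp of that CPU's next event.
        var heap = new PriorityQueue<(int List, int Index), long>();
        for (int i = 0; i < perCpu.Count; i++)
        {
            if (perCpu[i].Count > 0)
            {
                heap.Enqueue((i, 0), perCpu[i][0].TimestampNs);
            }
        }

        // The heap always hands out the cursor with the smallest timestamp.
        while (heap.TryDequeue(out var cursor, out _))
        {
            IReadOnlyList<SwitchEvent> list = perCpu[cursor.List];
            yield return list[cursor.Index];

            // Advance the cursor of the CPU whose event was just emitted.
            int next = cursor.Index + 1;
            if (next < list.Count)
            {
                heap.Enqueue((cursor.List, next), list[next].TimestampNs);
            }
        }
    }
}
```

The heap never holds more than one entry per processor, so each emitted event costs O(log P) for P processors, no matter how many events the trace contains.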
Besides that, I really like the clean design of the time stamps, which convert directly to milli-, micro-, or nanosecond decimals or to DateTimeOffset values, depending on what you are after. The clear definition of a size type, which declares everything you could need in a strong type, also makes it crystal clear which values you get in which units.
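To illustrate the idea of such strongly typed values, here is a small sketch with made-up types. SketchDuration and SketchSize are hypothetical and are not the library's time stamp or size types. They only show the pattern: store one exact base unit and expose every conversion as a named property.

```csharp
using System;

// Hypothetical types for illustration only; not part of TraceProcessor.
public readonly record struct SketchDuration(long Nanoseconds)
{
    // All conversions derive from the single stored nanosecond value.
    public decimal TotalMicroseconds => Nanoseconds / 1_000m;
    public decimal TotalMilliseconds => Nanoseconds / 1_000_000m;
    public decimal TotalSeconds => Nanoseconds / 1_000_000_000m;

    // One TimeSpan tick is 100 ns, so finer precision is truncated here.
    public TimeSpan ToTimeSpan() => TimeSpan.FromTicks(Nanoseconds / 100);
}

public readonly record struct SketchSize(long Bytes)
{
    // Binary (1024-based) and decimal (1000-based) units get distinct names.
    public decimal TotalKibibytes => Bytes / 1024m;
    public decimal TotalMebibytes => Bytes / (1024m * 1024m);
    public decimal TotalKilobytes => Bytes / 1_000m;
    public decimal TotalMegabytes => Bytes / 1_000_000m;
}
```

With types like these, the unit is part of the property name, so a caller cannot silently mix up milliseconds and microseconds, or KiB and KB.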
So far this is one of the ETW libraries with a textbook design: it shows how to implement a clear and easy-to-use API while keeping great performance.