I am currently developing a Windows kernel driver that implements its own networking stack. While testing some basic functionality of the stack, I noticed that replies to pings would sometimes take noticeably longer than usual. Investigating further, I found that KeAcquireSpinLock sporadically takes up to 20 ms to execute (instead of a few µs), even when the lock is not held by another core (I confirmed this by printing the lock value before calling KeAcquireSpinLock).
Since I had no clue why KeAcquireSpinLock takes so long, I implemented a different approach with KeAcquireSpinLockAtDpcLevel, manually raising the IRQL if required:
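KIRQL oldIrql = KeGetCurrentIrql();
if (oldIrql < DISPATCH_LEVEL)
{
    // Raise to DISPATCH_LEVEL; oldIrql receives the previous IRQL.
    KeRaiseIrql(DISPATCH_LEVEL, &oldIrql);
}
KeAcquireSpinLockAtDpcLevel(m_lock);
// Do something with the shared resources.
KeReleaseSpinLockFromDpcLevel(m_lock);
if (oldIrql < DISPATCH_LEVEL)
{
    // Restore the IRQL only if it was raised above.
    KeLowerIrql(oldIrql);
}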
I expected the above code to be functionally equivalent to a KeAcquireSpinLock / KeReleaseSpinLock pair. With this approach, however, the sporadic delays I saw with KeAcquireSpinLock are gone and performance is fine.
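To make the equivalence I expected explicit, here is a small user-mode model of the IRQL bookkeeping in both variants. It is only a sketch and not kernel code: the `cpu_model` type and every `model_*` / `path_*` function are hypothetical stand-ins invented for illustration, and the first path follows the documented behavior of KeAcquireSpinLock (raise to DISPATCH_LEVEL, acquire, and later restore the previous IRQL on release). It models IRQL transitions only and says nothing about timing.

```c
#include <assert.h>

/* Hypothetical user-mode model; none of these names are real kernel APIs. */
typedef unsigned char irql_t;
enum { PASSIVE = 0, APC = 1, DISPATCH = 2 };

typedef struct {
    irql_t irql;      /* current IRQL of the modeled processor */
    int    lock_held; /* 1 while the modeled spin lock is held */
} cpu_model;

static void model_raise(cpu_model *cpu, irql_t new_irql, irql_t *old)
{
    assert(new_irql >= cpu->irql); /* raising must never lower the IRQL */
    *old = cpu->irql;
    cpu->irql = new_irql;
}

static void model_lower(cpu_model *cpu, irql_t irql)
{
    assert(irql <= cpu->irql);     /* lowering must never raise the IRQL */
    cpu->irql = irql;
}

static void model_lock_at_dpc(cpu_model *cpu)
{
    assert(cpu->irql >= DISPATCH); /* precondition of the AtDpcLevel variant */
    assert(!cpu->lock_held);
    cpu->lock_held = 1;
}

static void model_unlock_from_dpc(cpu_model *cpu)
{
    assert(cpu->lock_held);
    cpu->lock_held = 0;
}

/* Path A: what a KeAcquireSpinLock / KeReleaseSpinLock pair is documented to do.
   Returns the IRQL observed while the lock is held. */
irql_t path_acquire_release(cpu_model *cpu)
{
    irql_t old;
    model_raise(cpu, DISPATCH, &old);
    model_lock_at_dpc(cpu);
    irql_t seen = cpu->irql;       /* the critical section would run here */
    model_unlock_from_dpc(cpu);
    model_lower(cpu, old);
    return seen;
}

/* Path B: the manual sequence from the question. */
irql_t path_manual(cpu_model *cpu)
{
    irql_t old = cpu->irql;
    if (old < DISPATCH)
        model_raise(cpu, DISPATCH, &old);
    model_lock_at_dpc(cpu);
    irql_t seen = cpu->irql;       /* the critical section would run here */
    model_unlock_from_dpc(cpu);
    if (old < DISPATCH)
        model_lower(cpu, old);
    return seen;
}
```

In this model both paths hold the lock at DISPATCH and return to the starting IRQL, whether they start at PASSIVE, APC, or DISPATCH. That is why I do not understand the difference in run time.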
I have searched the internet for similar problems with KeAcquireSpinLock, but it seems like I am alone with this issue. Maybe I have a bug in other sections of the driver? Can someone explain this behavior?
Note that I am not talking about deadlocks: KeAcquireSpinLock always returns eventually, and the implementation with KeAcquireSpinLockAtDpcLevel uses the same architecture and the same lock object.