Application crashes occasionally when using AKKA.Net as the communication channel

We are using AKKA.Net for communication between several processes.

The version of AKKA.Net we are using is the newest, 1.0.7. About 20 processes use AKKA.Net, and all of them are Windows services. The communication workload between processes is light, about 10 requests per minute, and drops to zero at night. The network is not very stable. AKKA.Cluster is not used in the system.

We added an event handler for AppDomain.CurrentDomain.UnhandledException so that we have a chance to log critical exceptions. The code looks like this:

  AppDomain.CurrentDomain.UnhandledException += (sender, eventArgs) =>
  {
    logger.LogFatal("Unhandled exception captured, Terminating:" + eventArgs.IsTerminating);
  };
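
A slightly fuller sketch of such a handler, in case it helps with diagnosis: it also writes out the exception object itself (type, message, and stack trace) rather than only the IsTerminating flag. The class name CrashLogging, the methods Describe and Install, and the logFatal delegate are hypothetical names used for illustration; logFatal stands in for whatever logger is actually in use.

```csharp
using System;

static class CrashLogging
{
    // Builds the log line. Kept separate from the event wiring so it can be called directly.
    public static string Describe(UnhandledExceptionEventArgs eventArgs)
    {
        // ExceptionObject is typed as object, so it is not guaranteed to be an Exception.
        var ex = eventArgs.ExceptionObject as Exception;
        var detail = ex != null
            ? ex.ToString()                                // type, message, stack trace
            : Convert.ToString(eventArgs.ExceptionObject); // fallback for anything else

        return "Unhandled exception captured, Terminating:" + eventArgs.IsTerminating
            + Environment.NewLine + detail;
    }

    // Hypothetical helper: logFatal stands in for the application's own logger.
    public static void Install(Action<string> logFatal)
    {
        AppDomain.CurrentDomain.UnhandledException +=
            (sender, eventArgs) => logFatal(Describe(eventArgs));
    }
}
```

With the logger from the snippet above, this could be wired up as CrashLogging.Install(message => logger.LogFatal(message)). The handler only records the failure; it does not stop the process from terminating.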

We kept those processes running for several days and found that some of them (maybe 2 or 3) crashed. We checked the log, and the crashes seem to result from an unexpected exception in AKKA.Net. The details of the exception are listed below:

Exception message:Object reference not set to an instance of an object.
Exception stacktrace:
   at Helios.Reactor.Tcp.TcpProxyReactor.CloseConnection(Exception ex, IConnection remoteHost)
   at Helios.Reactor.Tcp.TcpProxyReactor.ReceiveCallback(IAsyncResult ar)
   at System.Net.LazyAsyncResult.Complete(IntPtr userToken)
   at System.Net.ContextAwareResult.CompleteCallback(Object state)
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Net.ContextAwareResult.Complete(IntPtr userToken)
   at System.Net.LazyAsyncResult.ProtectedInvokeCallback(Object result, IntPtr userToken)
   at System.Net.Sockets.BaseOverlappedAsyncResult.CompletionPortCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* nativeOverlapped)
   at System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* pOVERLAP)

The workload at the time the exception was thrown was almost zero.

Are there any tips on how to fix this issue? Thanks a lot.

There is 1 best solution below

Aaronontheweb On

This is a known bug in Helios that I logged recently. I'm working on getting a fix out for it ASAP.

I'll reply here with a comment once the fix is out; it's what I've been working on this week.