I have a pipeline for integration testing which hangs without much info. It worked 2 weeks ago, but when I ran the pipeline again last week it didn't.
Locally all tests work like a charm.
Things I've tried:
- rolling back my code and pipeline.yml to the commit where it last worked
- setting up LocalDB before starting tests
- installing SQL Express version from MSI in separate yml task
- running tests from a PowerShell script where I use Enter-VsDevShell
- sqllocaldb info/stop/delete/start all give the same versions as my local machine
- changing connection string from mssqllocaldb -> gives another error
- running on windows-2019 instead of windows-latest
- building and testing in debug build config
- Test task with publishRunAttachments: false
- added --blame-hang, otherwise the run would go on forever
- added -p:ParallelizeTestCollections=false to the dotnet test command
- added --logger "console;verbosity=detailed", but I cannot see anything extra of value
- Server=(localdb)\mssqllocaldb in the connection string
- sqlcmd -l 60 -S "(localdb)\mssqllocaldb" -Q "SELECT @@VERSION;" is working
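For reference, the "setting up LocalDB before starting tests" item could look roughly like the step below. This is only a sketch: the displayName is made up, and it assumes sqllocaldb and sqlcmd are already on the PATH of the Windows agent and that the instance is the default mssqllocaldb one from the connection string.

```yaml
# Sketch of a pre-test step (assumed, not the original pipeline).
# Starts the default LocalDB instance and runs the same sqlcmd check as above.
- script: |
    sqllocaldb start mssqllocaldb
    sqllocaldb info mssqllocaldb
    sqlcmd -l 60 -S "(localdb)\mssqllocaldb" -Q "SELECT @@VERSION;"
  displayName: Start LocalDB and check connectivity
```

A step like this only shows that the instance accepts connections from the agent. It says nothing about whether the test host itself gets stuck later.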
{
  "ConnectionStrings": {
    "StorageDbConnection": "Server=(localdb)\\mssqllocaldb;Database=StorageTestDb;Trusted_Connection=True;MultipleActiveResultSets=true"
  }
}
<ItemGroup>
  <PackageReference Include="FluentAssertions" Version="6.10.0" />
  <PackageReference Include="Microsoft.NET.Test.Sdk" Version="17.5.0" />
  <PackageReference Include="Moq" Version="4.18.4" />
  <PackageReference Include="Respawn" Version="6.0.0" />
  <PackageReference Include="xunit" Version="2.4.2" />
  <PackageReference Include="xunit.runner.visualstudio" Version="2.4.5">
    <IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets>
    <PrivateAssets>all</PrivateAssets>
  </PackageReference>
  <PackageReference Include="coverlet.collector" Version="3.2.0">
    <IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets>
    <PrivateAssets>all</PrivateAssets>
  </PackageReference>
</ItemGroup>
Original task from pipeline.yml:
- task: DotNetCoreCLI@2
  name: dotnetTest
  displayName: Running tests
  inputs:
    command: 'test'
    projects: '**/*Tests/*.csproj'
    arguments: '--no-build --configuration $(buildConfiguration) --logger "console;verbosity=detailed" --blame-hang --blame-hang-timeout 1min'
    publishTestResults: false
Stack trace on blame timeout:
The active test run was aborted. Reason: Test host process crashed
Data collector 'Blame' message: The specified inactivity time of 2 minutes has elapsed. Collecting hang dumps from testhost and its child processes.
Data collector 'Blame' message: Dumping 644 - testhost.
Attachments:
D:\a\1\s\src\Storage\tests\Storage.Application.IntegrationTests\TestResults\7360506b-d062-4e10-8d30-8302c4b56dbd\testhost_644_20230707T081918_hangdump.dmp
D:\a\1\s\src\Storage\tests\Storage.Application.IntegrationTests\TestResults\7360506b-d062-4e10-8d30-8302c4b56dbd\Sequence_fd411c6d094d49478d7ee506dc44ebfd.xml
Test Run Aborted with error System.Exception: One or more errors occurred.
---> System.Exception: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host..
---> System.Exception: An existing connection was forcibly closed by the remote host.
at System.Net.Sockets.NetworkStream.Read(Span`1 buffer)
--- End of inner exception stack trace ---
at System.Net.Sockets.NetworkStream.Read(Span`1 buffer)
at System.Net.Sockets.NetworkStream.ReadByte()
at System.IO.BinaryReader.Read7BitEncodedInt()
at System.IO.BinaryReader.ReadString()
at Microsoft.VisualStudio.TestPlatform.CommunicationUtilities.LengthPrefixCommunicationChannel.NotifyDataAvailable()
at Microsoft.VisualStudio.TestPlatform.CommunicationUtilities.TcpClientExtensions.MessageLoopAsync(TcpClient client, ICommunicationChannel channel, Action`1 errorHandler, CancellationToken cancellationToken)
--- End of inner exception stack trace ---.
The active Test Run was aborted because the host process exited unexpectedly. Please inspect the call stack above, if available, to get more information about where the exception originated from.
Are there firewalls or permissions I can set on the pipeline?
Are there any better ways of debugging or finding what's actually timing out?
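On the debugging side, one thing that may help is keeping the hang dump from the Attachments list, since the TestResults folder disappears with the agent. A minimal sketch, assuming the TestResults path from the log above maps to $(Build.SourcesDirectory) and using a made-up artifact name:

```yaml
# Sketch (assumed step, placed after the test task).
# Publishes the TestResults folder, including the *_hangdump.dmp file,
# only when an earlier step in the job has failed.
- task: PublishPipelineArtifact@1
  displayName: Publish test hang dumps
  condition: failed()
  inputs:
    targetPath: '$(Build.SourcesDirectory)/src/Storage/tests/Storage.Application.IntegrationTests/TestResults'
    artifact: 'test-hang-dumps'
```

The downloaded .dmp file can then be opened in Visual Studio to see which call the test host was blocked on when the blame collector fired.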
Finally got it working. I found someone with similar problems, but not the same solution: https://github.com/actions/runner-images/issues/7683. I might have been affected by the same changes to the Windows image, since our troubles seem to have started around the same time.
Once I changed the pipeline task to VSTest, I got some more output (Console.WriteLine debug lines added in tests and test setup), and I could see the test setup failing before even getting to the test cases.
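The VSTest task ends up looking roughly like this. It is a sketch rather than the exact task from my pipeline: the assembly pattern, search folder and displayName are assumptions that need adjusting to the actual build output.

```yaml
# Sketch of a VSTest-based replacement for the DotNetCoreCLI test task.
# The assembly pattern and search folder below are assumed values.
- task: VSTest@2
  displayName: Running tests (VSTest)
  inputs:
    testSelector: 'testAssemblies'
    testAssemblyVer2: |
      **\*IntegrationTests.dll
      !**\obj\**
    searchFolder: '$(System.DefaultWorkingDirectory)'
    configuration: '$(buildConfiguration)'
    diagnosticsEnabled: true
```

Unlike the dotnet test task with --no-build, this points at the built test assemblies instead of the project files, so the build step has to run before it.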
In my database seed I had a call like await SaveChangesAsync(cancellationToken); that never finished. I changed it to SaveChanges(); and the pipeline started working.
Not sure why it works though, possibly some race condition, deadlock or the like. Running locally worked fine all the time.