Service Fabric cluster: can't get node 0 to rejoin a 3-node cluster


We have a 3-node Service Fabric cluster. Node 0, which is the node we used to set up the cluster, is running, but it is not listed under the System ClusterManagerService or the other system services. It is listed under the FailoverManagerService.


How can I add it back in? I'm stumped at the moment; I have spent most of the day on this and am no wiser.

With no answers so far, I think I will just have to remove the cluster and recreate it.

Andrew On BEST ANSWER

I was unable to recover it, so I removed the cluster and recreated it with the following commands, run in PowerShell on one of the nodes.

Connect-ServiceFabricCluster

Get the current configuration for the cluster

Get-ServiceFabricClusterConfiguration > C:\temp\train_cluster_config_old.json
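Before removing anything, it may be worth confirming that the exported file parses as JSON and lists all three nodes. This is only a minimal sketch: it assumes the export follows the usual standalone layout with a top-level nodes array, and the property names nodeName and iPAddress come from the standard ClusterConfig.json samples, not from the original post.

```powershell
# Load the exported configuration; ConvertFrom-Json throws if the file is not valid JSON.
$config = Get-Content -Path C:\temp\train_cluster_config_old.json -Raw | ConvertFrom-Json

# List the nodes the configuration describes (assumes a top-level "nodes" array).
$config.nodes | Select-Object nodeName, iPAddress
```

If this prints fewer nodes than expected, fix the file before it is reused to recreate the cluster.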

With the config we can now remove the cluster

Remove-ServiceFabricCluster -ClusterConfigFilePath C:\temp\train_cluster_config_old.json

You may have to remove any leftover node folders under C:\ProgramData\SF; if you don't, the next steps will tell you that they need to be removed.
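A quick way to see whether anything was left behind is to list that folder on each node. This sketch only inspects the folder and deletes nothing; the path is the one mentioned above and may differ if your data root was configured elsewhere.

```powershell
# Path taken from the answer above; adjust if your data root differs.
$dataRoot = 'C:\ProgramData\SF'

if (Test-Path -Path $dataRoot) {
    # Show what is left so it can be reviewed before deleting anything by hand.
    Get-ChildItem -Path $dataRoot -Force | Select-Object Name, LastWriteTime
} else {
    Write-Output "$dataRoot does not exist; nothing to clean up."
}
```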

Make sure you are happy with the config, then test it with the tools from this download, which you will need on one node: https://go.microsoft.com/fwlink/?LinkId=730690

.\TestConfiguration.ps1 -ClusterConfigFilePath C:\temp\train_cluster_config_old.json -FabricRuntimePackagePath C:\temp\Microsoft.Azure.ServiceFabric.WindowsServer.8.1.321.9590\DeploymentRuntimePackages\MicrosoftAzureServiceFabric.8.1.321.9590.cab

If that all succeeds, run the command that creates the cluster

.\CreateServiceFabricCluster.ps1 -ClusterConfigFilePath C:\temp\train_cluster_config_old.json -FabricRuntimePackagePath C:\temp\Microsoft.Azure.ServiceFabric.WindowsServer.8.1.321.9590\DeploymentRuntimePackages\MicrosoftAzureServiceFabric.8.1.321.9590.cab

Give it a few minutes to run and start up, then navigate to https://localhost:19080/Explorer/index.html on each node to make sure it's running.
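The same check can be done from PowerShell, which also shows whether every node is hosting the system services again (the symptom in the question). This is a sketch rather than a definitive procedure: it assumes the bare Connect-ServiceFabricCluster call used earlier works for your cluster, and a secured cluster would need the usual connection and certificate parameters.

```powershell
# Connect to the local cluster (same call as at the start of this answer).
Connect-ServiceFabricCluster

# Every node should be listed as Up with a healthy state.
Get-ServiceFabricNode | Select-Object NodeName, NodeStatus, HealthState

# Show which nodes host replicas of the Cluster Manager system service.
Get-ServiceFabricPartition -ServiceName fabric:/System/ClusterManagerService |
    Get-ServiceFabricReplica |
    Select-Object NodeName, ReplicaRole, ReplicaStatus
```

In a 3-node cluster all three nodes would normally appear in both lists; a node missing from the replica list is the condition described in the question.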

You will now need to deploy all your applications again, as the new cluster will be empty.