Rebuild a compromised cluster on Elasticsearch

I'm running an Elasticsearch cluster with 3 nodes (one of them is the master and the other two are master-eligible). Unfortunately, all of them were stopped at the same time, and after restarting them I'm encountering 2 different problems:

  1. The cluster is not able to elect a master node anymore (logs from my linux machine below):
[2024-01-08T12:07:34,186][WARN ][o.e.c.c.ClusterFormationFailureHelper] [masterNodeES1] master not discovered or elected yet, an election requires at least 2 nodes with ids from [Wp2ThiNCT-xpIskc0FTITg, 7PZ-SP5usdoKrL4tjfSMgA, nLFU0ydhTgsVItQNFL3T2n], have discovered [{masterNodeES1}{7PZ-SP5usdoKrL4tjfSMgA}{Z2B4Rja5TneDDm7N6fGYjQ}{<ip_master_node>}{<ip_master_node>:9300}{dilm}{ml.machine_memory=4046721024, xpack.installed=true, ml.max_open_jobs=20}] which is not a quorum; discovery will continue using [<ip_node_2>:9300, <ip_node_3>:9300] from hosts providers and [{masterNodeES1}{7PZ-SP5usdoKrL4tjfSMgA}{Z4B4Rji5TzeDDe7N5fBYjQ}{<ip_master_node>}{<ip_master_node>:9300}{dilm}{ml.machine_memory=4046721024, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 102018, last-accepted version 13415 in term 11968
  2. I can no longer query my cluster because authentication fails. The output below is from running curl -u user:password -XGET '<master_node_ip>:9200/_cat/indices?pretty' on my Linux machine:
{ "error" : { "root_cause" : [ { "type" : "security_exception", "reason" : "failed to authenticate user [elastic]", "header" : { "WWW-Authenticate" : "Basic realm=\"security\" charset=\"UTF-8\"" } } ], "type" : "security_exception", "reason" : "failed to authenticate user [elastic]", "header" : { "WWW-Authenticate" : "Basic realm=\"security\" charset=\"UTF-8\"" } }, "status" : 401 }

Since I can't query the cluster, I can't tell which Elasticsearch version it is (it could be 5.x.x), but I noticed that the elasticsearch-reset-password tool is not present in my /bin folder. I just want to know whether there is a way to restore my cluster without losing the data stored on its nodes. Thank you in advance.

Musab Dogan On BEST ANSWER

You can use the bin/x-pack/users command in Elasticsearch 5, or the bin/elasticsearch-users command in Elasticsearch 6 and onwards, to add a superuser to the file realm.

Elasticsearch version 5

bin/x-pack/users useradd test -p test -r superuser

Elasticsearch version 6 and onwards

bin/elasticsearch-users useradd test -p test -r superuser

Test it (use whichever of http or https your cluster is configured for):

curl -k 'http://localhost:9200/_cluster/health?pretty' -u test:test
curl -k 'https://localhost:9200/_cluster/health?pretty' -u test:test

After creating the user, you can send curl requests only to localhost, since the user exists only on the node where you ran the command. However, if the .security index shards are not available, you will get the same security_exception error. If it does not work, check your Elasticsearch logs, make sure all nodes are up and running, and look for the specific log entries that point to the cause.
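
One log line worth reading closely is the "master not discovered or elected yet" warning from the question. As a rough sketch in Python, assuming the message format shown in that log (missing_voting_nodes is a hypothetical helper name, not part of Elasticsearch), you can list which of the required voting node IDs the node has not discovered:

```python
import re
from typing import List


def missing_voting_nodes(log_line: str) -> List[str]:
    """Return the required voting node IDs that do not appear among the discovered nodes.

    Assumes the ClusterFormationFailureHelper wording seen in the question:
    "... with ids from [A, B, C], have discovered [{name}{id}...] which is not a quorum ..."
    """
    required_match = re.search(r"with ids from \[([^\]]*)\]", log_line)
    discovered_match = re.search(
        r"have discovered \[(.*?)\] which is not a quorum", log_line
    )
    if not required_match or not discovered_match:
        raise ValueError("log line is not in the expected cluster-formation format")

    # IDs of the nodes whose votes are needed for an election.
    required = [
        node_id.strip()
        for node_id in required_match.group(1).split(",")
        if node_id.strip()
    ]
    # Each discovered node is printed as {name}{node id}{ephemeral id}...,
    # so a node counts as discovered if "{<id>}" occurs in that section.
    discovered = discovered_match.group(1)
    return [node_id for node_id in required if "{" + node_id + "}" not in discovered]
```

For the warning in the question, this returns the two IDs other than the one belonging to masterNodeES1, which suggests that node cannot see either of the other two master-eligible nodes.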

To diagnose the issue:

  1. Check the network: try to reach master2 from master1 with telnet, for example: telnet ip_node_2 9300
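
If telnet is not installed, the same check can be sketched in Python using only the standard library. can_connect is a hypothetical helper name, and 9300 is the transport port that appears in the log above; pass your own node addresses as the host:

```python
import socket


def can_connect(host: str, port: int = 9300, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling can_connect("ip_node_2") from the master node returns False when the transport port cannot be reached, which means either that node is down or something on the network is blocking the port.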

Question: Did you lose any master node disks?