Our team is trying to scale out our existing Elasticsearch cluster. To do this, we took an AMI of a running Elasticsearch node and used it to create what will be the fourth node. In the past, Chef was used to configure new Elasticsearch nodes, but the author of those recipes has left our team and we are at a loss. When I try to bootstrap the new host, I get the error below:
Recipe Compile Error in /var/chef/cache/cookbooks/av_elastic/recipes/elastic_cluster.rb
================================================================================
Net::HTTPServerException
------------------------
400 "Bad Request"
Cookbook Trace:
---------------
/var/chef/cache/cookbooks/av_elastic/recipes/elastic_cluster.rb:134:in `from_file'
Relevant File Content:
----------------------
/var/chef/cache/cookbooks/av_elastic/recipes/elastic_cluster.rb:
127: message "Block devices available to Elasticsearch: #{devices}"
128: level :warn
129: end
130:
131: ## Gather Available Nodes within same es-cluster-name, Chef Environment, and elastic_cluster role. Exclude marvel nodes
132: elasticsearch_cluster_nodes = Array.new
133: elasticsearch_cluster_node_names = Array.new
134>> search(:node, "chef_environment:#{node.chef_environment} AND roles:*elastic_cluster AND es-cluster-name:#{node['es-cluster-name']} NOT roles:*elastic_marvel").each do |node|
135: elasticsearch_cluster_nodes << node
136: elasticsearch_cluster_node_names << node['hostname']
137: end
138:
Using the debug option, I can see this in the chef-client output:
[2020-11-24T15:58:00+00:00] DEBUG: ---- HTTP Response Body ----
[2020-11-24T15:58:00+00:00] DEBUG: {"error":["invalid search query: 'chef_environment:production AND roles:*elastic_cluster AND es-cluster-name: NOT roles:*elastic_marvel'"]}
[2020-11-24T15:58:00+00:00] DEBUG: ---- End HTTP Response Body -----
[2020-11-24T15:58:00+00:00] DEBUG: Chef::HTTP calling Chef::HTTP::ValidateContentLength#handle_response
[2020-11-24T15:58:00+00:00] DEBUG: Expected JSON response, but got content-type ''
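One thing the debug output does show: the rejected query contains `es-cluster-name:` with nothing after it, which suggests `node['es-cluster-name']` is nil or empty on the new host when line 134 runs. The following is a minimal Ruby sketch of that behavior, not the cookbook's actual code. The helper name `build_cluster_query` is hypothetical, and whether the missing attribute is the root cause is an assumption.

```ruby
# Hypothetical helper (not part of the av_elastic cookbook): builds the same
# query string as line 134 of elastic_cluster.rb, but fails early with a
# readable message when the cluster name attribute is missing.
def build_cluster_query(environment, cluster_name)
  if cluster_name.nil? || cluster_name.to_s.strip.empty?
    raise ArgumentError,
          "node['es-cluster-name'] is not set; the search query would be invalid"
  end

  "chef_environment:#{environment} AND roles:*elastic_cluster " \
    "AND es-cluster-name:#{cluster_name} NOT roles:*elastic_marvel"
end

# Interpolating nil yields an empty string, which reproduces the query
# shown in the HTTP response body above.
broken_query = "chef_environment:production AND roles:*elastic_cluster " \
               "AND es-cluster-name:#{nil} NOT roles:*elastic_marvel"

# Inside the recipe, the helper could be used like this (assumption):
#   search(:node, build_cluster_query(node.chef_environment, node['es-cluster-name']))
```

If this is what is happening, the thing to check would be where the working nodes get their `es-cluster-name` attribute (role, environment, or node object) and whether the new node has it.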
Because the new host was built from an image of an existing one, I have already updated /etc/hosts and /etc/hostname and removed /etc/chef/client.pem. I think the issue is with authentication, but I can't prove it. I also suspect that something left behind on this host still identifies it as the original host (the one the AMI was created from).
The existing Elasticsearch nodes, which use the same recipes as the new host, are all running as designed. Any ideas on how to fix this? Thank you in advance.