I am trying to bring up an on-prem Kubernetes cluster using Kubespray, with 3 master and 5 worker nodes. The node IPs come from two different subnets.
Ansible inventory:
hosts:
  saba-k8-vm-m1:
    ansible_host: 192.168.100.1
    ip: 192.168.100.1
    access_ip: 192.168.100.1
  saba-k8-vm-m2:
    ansible_host: 192.168.100.2
    ip: 192.168.100.2
    access_ip: 192.168.100.2
  saba-k8-vm-m3:
    ansible_host: 192.168.200.1
    ip: 192.168.200.1
    access_ip: 192.168.200.1
  saba-k8-vm-w1:
    ansible_host: 192.168.100.3
    ip: 192.168.100.3
    access_ip: 192.168.100.3
  saba-k8-vm-w2:
    ansible_host: 192.168.100.4
    ip: 192.168.100.4
    access_ip: 192.168.100.4
  saba-k8-vm-w3:
    ansible_host: 192.168.100.5
    ip: 192.168.100.5
    access_ip: 192.168.100.5
  saba-k8-vm-w4:
    ansible_host: 192.168.200.2
    ip: 192.168.200.2
    access_ip: 192.168.200.2
  saba-k8-vm-w5:
    ansible_host: 192.168.200.3
    ip: 192.168.200.3
    access_ip: 192.168.200.3
children:
  kube-master:
    hosts:
      saba-k8-vm-m1:
      saba-k8-vm-m2:
      saba-k8-vm-m3:
  kube-node:
    hosts:
      saba-k8-vm-w1:
      saba-k8-vm-w2:
      saba-k8-vm-w3:
      saba-k8-vm-w4:
      saba-k8-vm-w5:
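For context, the cluster itself was deployed with the standard Kubespray playbook; the inventory path below is illustrative, not taken from the question:

# Assumed paths; adjust to wherever the inventory above is stored.
ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml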
Next I spawned dnsutils: kubectl apply -f https://k8s.io/examples/admin/dns/dnsutils.yaml
This pod is on worker w1. It is able to look up a service name (I have created elasticsearch pods on w2):
root@saba-k8-vm-m1:/opt/bitnami# kubectl get svc -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
coredns ClusterIP 10.233.0.3 <none> 53/UDP,53/TCP,9153/TCP 6d3h
root@saba-k8-vm-m1:/opt/bitnami# kubectl exec -it dnsutils sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl kubectl exec [POD] -- [COMMAND] instead.
/ #
/ # nslookup elasticsearch-elasticsearch-data.lilac-efk.svc.cluster.local. 10.233.0.3
Server: 10.233.0.3
Address: 10.233.0.3#53
Name: elasticsearch-elasticsearch-data.lilac-efk.svc.cluster.local
Address: 10.233.49.187
Next I spawned the same dnsutils pod on w5 (the .200 subnet). nslookup fails there.
root@saba-k8-vm-m1:/opt/bitnami# kubectl exec -it dnsutils sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl kubectl exec [POD] -- [COMMAND] instead.
/ #
/ # ^C
/ # nslookup elasticsearch-elasticsearch-data.lilac-efk.svc.cluster.local 10.233.0.3
;; connection timed out; no servers could be reached
/ # exit
command terminated with exit code 1
Logs from nodelocaldns running on w5:
[ERROR] plugin/errors: 2 elasticsearch-elasticsearch-data.lilac-efk.lilac-efk.svc.cluster.local. AAAA: dial tcp 10.233.0.3:53: i/o timeout
[ERROR] plugin/errors: 2 elasticsearch-elasticsearch-data.lilac-efk.lilac-efk.svc.cluster.local. A: dial tcp 10.233.0.3:53: i/o timeout
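Since nodelocaldns runs with host networking, the same timeout can be reproduced from the w5 host itself; a minimal check (not in the original post, just illustrative) is to query the CoreDNS ClusterIP directly from the node:

# Run on the w5 host, outside any pod; a timeout here means the node
# itself cannot reach the service VIP 10.233.0.3 across the subnets.
nslookup kubernetes.default.svc.cluster.local. 10.233.0.3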
From the dnsutils container, I'm not able to reach the coredns pod IPs on the other subnet through the overlay network. The cluster was deployed with Calico.
root@saba-k8-vm-m1:/opt/bitnami# kubectl get pods -n kube-system -o wide | grep coredns
pod/coredns-dff8fc7d-98mbw 1/1 Running 3 6d2h 10.233.127.4 saba-k8-vm-m2 <none> <none>
pod/coredns-dff8fc7d-cwbhd 1/1 Running 7 6d2h 10.233.74.7 saba-k8-vm-m1 <none> <none>
pod/coredns-dff8fc7d-h4xdd 1/1 Running 0 2m19s 10.233.82.6 saba-k8-vm-m3 <none> <none>
root@saba-k8-vm-m1:/opt/bitnami# kubectl exec -it dnsutils sh
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl kubectl exec [POD] -- [COMMAND] instead.
/ # ping 10.233.82.6
PING 10.233.82.6 (10.233.82.6): 56 data bytes
64 bytes from 10.233.82.6: seq=0 ttl=62 time=0.939 ms
64 bytes from 10.233.82.6: seq=1 ttl=62 time=0.693 ms
^C
--- 10.233.82.6 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.693/0.816/0.939 ms
/ # ping 10.233.74.7
PING 10.233.74.7 (10.233.74.7): 56 data bytes
^C
--- 10.233.74.7 ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss
/ # ping 10.233.127.4
PING 10.233.127.4 (10.233.127.4): 56 data bytes
^C
--- 10.233.127.4 ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss
kube_service_addresses: 10.233.0.0/18
kube_pods_subnet: 10.233.64.0/18
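For reference, these values come from the Kubespray group vars; assuming the usual inventory layout (the exact path varies between Kubespray versions), they can be checked with:

# Path is an assumption based on the default Kubespray inventory layout.
grep -E '^(kube_service_addresses|kube_pods_subnet)' \
  inventory/mycluster/group_vars/k8s-cluster/k8s-cluster.yml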
Because of this behaviour, fluentd, which runs as a DaemonSet on all 5 workers, is in CrashLoopBackOff since it is unable to resolve the elasticsearch service name.
What am I missing? Any help is appreciated.
Thanks to @laimison for giving me those pointers.
Posting all my observations, so it can be useful to somebody.
On M1 and M3, I checked the BGP peer status. On M1, the peers 192.168.200.2 and 192.168.200.3 were stuck in the Passive state. On M3, I noticed "Active Socket: Connection" for all the .100 IPs. This suggested that M3 was trying to establish a BGP connection, but it was not able to get through.
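The per-node output is not included above; the peer states come from a check like the following, run on each master (assuming a calicoctl binary or the Kubespray wrapper script is available on the node):

# Shows the BGP peer table for this node; healthy peers report "Established",
# while "Passive" or "Active Socket: Connection" means the session is not up.
sudo calicoctl node status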
I was able to telnet 192.168.100.x 179 from M3. Checking the calico pod logs and the node dump from running /usr/local/bin/calicoctl.sh node diags on M1, I could see that 10.0.x.x was the management IP of the server on which the .200 VMs were hosted. It was doing a source NAT.
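One way to confirm the source NAT (an illustrative check, not from the original post) is to capture BGP traffic on one of the .100 masters and inspect the source address of the incoming connection attempts:

# Run on M1; ens3 is assumed as the node interface. If the SYNs from M3 arrive
# with a 10.0.x.x source instead of 192.168.200.1, the hosting server is rewriting them.
sudo tcpdump -ni ens3 'tcp port 179'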
I added a rule on that server to exempt this node-to-node traffic from the source NAT, and that solved the issue.
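The exact rule isn't reproduced here; as a minimal sketch, assuming the hosting server was applying a blanket MASQUERADE/SNAT on outbound traffic, an exemption for the inter-node subnets would look something like this (subnets and rule placement are illustrative):

# Illustrative only: insert ahead of the MASQUERADE/SNAT rule so traffic from the
# .200 VMs towards the .100 nodes keeps its original source address.
iptables -t nat -I POSTROUTING -s 192.168.200.0/24 -d 192.168.100.0/24 -j ACCEPT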
Other things that I tried:
I updated ipipMode across all the nodes. This doesn't solve the issue, but it helps improve performance.
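The post doesn't state the value, but for nodes spread across subnets the usual choice is CrossSubnet, which only IPIP-encapsulates traffic that actually crosses a subnet boundary. On the default pool that would be something like this (the pool name is an assumption; check with calicoctl get ippool):

# Encapsulate only cross-subnet traffic; pods on the same subnet talk natively.
calicoctl patch ippool default-pool --patch '{"spec": {"ipipMode": "CrossSubnet"}}'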
I referred to "calico/node is not ready: BIRD is not ready: BGP not established" and set interface=ens3, although this is the only interface on my VMs. Again, it doesn't solve the issue, but it will help when there are multiple interfaces on the Calico node.
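That interface setting corresponds to calico-node's IP_AUTODETECTION_METHOD; assuming the calico-node DaemonSet lives in kube-system, one way to apply it is:

# Pin Calico's node IP autodetection to a specific interface (ens3 here).
kubectl -n kube-system set env daemonset/calico-node IP_AUTODETECTION_METHOD=interface=ens3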