How does socket.gethostbyname () choose from multiple IP addresses?


I'm trying to figure out how gethostbyname() determines an IP address for a private host when multiple addresses are available. The results recently changed on my machine. I want to know why it suddenly changed the IP it returns, and what could cause such a change?

See update sections below for new info. Will update Summary if relevant.

Summary

Here's the core problem. /etc/hosts has this:

10.1.2.3        zoidberg.local zoidberg  # loopback

Python does this:

>>> socket.gethostbyname ('zoidberg')
'127.0.0.1'
>>> socket.gethostbyname_ex ('zoidberg')
('zoidberg.local', ['zoidberg'], ['10.1.2.3', '127.0.0.1', '192.168.1.62', '169.254.1.28', '10.1.2.3'])
>>> httpd = HTTPServer (('zoidberg', 8888), ReqHandler)  # binds to 127.0.0.1, not 10.1.2.3

Why and how does this happen? TCPServer.__init__ calls server_bind, which calls socket.bind; that is stdlib code beyond my control. Just like socket.gethostbyname, it chooses 127.0.0.1, which is not associated with zoidberg in /etc/hosts. Neither are the other IPs returned by gethostbyname_ex (same machine, different network interfaces). Running ping zoidberg in a terminal correctly resolves to 10.1.2.3.

The server used to bind to 10.1.2.3, but the behavior changed in the last few days. What's going on?

Description

For several years, the name zoidberg has been a host name for private address 10.1.2.3. My /etc/hosts file has this entry, unchanged all that time (complete /etc/hosts at bottom):

10.1.2.3                zoidberg.local zoidberg   # loopback

I run a few local Python servers that listen on certain ports. The address passed to TCPServer is always ('zoidberg', portnum). For years they've run fine, binding to 10.1.2.3 with those parameters. The machine runs Python 3.8 on macOS 10.15.

A few days ago the servers stopped responding. Even though they were clearly running, I couldn't get any response. Eventually I found out with netstat -an that TCPServer is now binding to 127.0.0.1 instead of 10.1.2.3 when I call TCPServer (('zoidberg', portnum)). My question is why? What can cause this to happen?

I traced through python stdlib code until I reached socket.gethostbyname (), which is how TCPServer (BaseServer) resolves the host. It seems gethostbyname is the culprit. Yes gethostbyname_ex is better, but it's stdlib code inside TCPServer that I can't modify (or shouldn't modify / won't modify). Why would gethostbyname suddenly return 127.0.0.1 for zoidberg, when for years zoidberg has resolved to 10.1.2.3? What could cause such a change?
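To narrow down which resolver call disagrees, the three stdlib lookups can be compared side by side. This is a minimal diagnostic sketch, not part of the original servers; the helper name compare_resolvers is made up for illustration, and the getaddrinfo call is restricted to IPv4/TCP on the assumption that this matches what an IPv4 TCPServer needs.

```python
import socket


def compare_resolvers(name):
    """Resolve `name` three ways and return the results side by side.

    Hypothetical helper for diagnosis only; errors are returned as strings
    so one failing lookup does not hide the others.
    """
    results = {}

    # Legacy call: returns a single IPv4 address string.
    try:
        results["gethostbyname"] = socket.gethostbyname(name)
    except OSError as exc:
        results["gethostbyname"] = f"error: {exc}"

    # Legacy call: returns (canonical name, aliases, IPv4 address list);
    # only the address list is kept here.
    try:
        results["gethostbyname_ex"] = socket.gethostbyname_ex(name)[2]
    except OSError as exc:
        results["gethostbyname_ex"] = f"error: {exc}"

    # Newer interface, restricted to IPv4 stream sockets (an assumption).
    try:
        infos = socket.getaddrinfo(name, None, socket.AF_INET, socket.SOCK_STREAM)
        results["getaddrinfo"] = [info[4][0] for info in infos]
    except OSError as exc:
        results["getaddrinfo"] = f"error: {exc}"

    return results


if __name__ == "__main__":
    for call, answer in compare_resolvers("localhost").items():
        print(call, "->", answer)
```

Running it with 'zoidberg' in place of 'localhost' would show whether the unexpected 127.0.0.1 comes from one call only or from all of them.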

I went through the obvious candidates:

  • /etc/hosts hasn't changed in years. only line with zoidberg is the one above. Entire hosts file posted below.
  • no OS updates have been made in months. auto-updates is off, only manual.
  • ping zoidberg still correctly resolves to 10.1.2.3. whatever change gethostbyname is picking up, it's not picked up in ping.
  • chrome (I know, doesn't control name resolution) was updated semi-recently via auto update. don't see how this would affect the system and gethostbyname.
  • no other changes to name resolution as far as I'm aware. installed a few python modules with port system but that's usually contained in /opt/local. python3.8 is system python, running from /usr/local/bin. I do have /opt/local/ dirs in my PYTHONPATH for picking up packages. But surely that wouldn't change gethostbyname resolution, would it?
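On the PYTHONPATH question, one quick check is to see which interpreter and which socket modules are actually loaded. The sketch below assumes CPython, where the C-level implementation lives in the _socket extension module; the function name socket_module_origins is invented for this example.

```python
import socket
import sys
import _socket  # CPython's C extension behind the socket module


def socket_module_origins():
    """Report which interpreter and socket modules are in use.

    Hypothetical helper: if a path under /opt/local shows up here, the
    PYTHONPATH entries are shadowing the expected modules.
    """
    return {
        "executable": sys.executable,
        # Pure-Python wrapper module.
        "socket": getattr(socket, "__file__", None) or "(built-in)",
        # C extension that implements gethostbyname and bind.
        "_socket": getattr(_socket, "__file__", None) or "(built-in)",
    }


if __name__ == "__main__":
    for key, path in socket_module_origins().items():
        print(f"{key}: {path}")
```

If all three paths point at the expected Python 3.8 installation, PYTHONPATH is unlikely to be what changed the resolution.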

My main concern is making sure the machine is free of malware. If it's just ghosts in the machine, ok, I'd like to know why but I can live with it. Should I be concerned? Or is there a more reasonable explanation? I generally keep things locked down: no optional services, no browsing dodgy websites, no running software of unknown origin, only downloading modules from primary repos, etc.

Any ideas?

python server code

Should be covered above but for completeness...

from http.server import BaseHTTPRequestHandler, HTTPServer

# ReqHandler is my BaseHTTPRequestHandler subclass (definition omitted here)
listen_address = ('zoidberg', 8888)
httpd = HTTPServer (listen_address, ReqHandler)

Notes:

  • HTTPServer subclasses socketserver.TCPServer, which subclasses socketserver.BaseServer.
  • TCPServer.__init__ calls server_bind.
  • TCPServer.server_bind (stdlib) calls socket.socket.bind on the server address.
  • socket.socket.bind is not defined in stdlib .py files, so I can't trace further; it must be in a C extension module somewhere.
  • socket.gethostbyname (listen_address[0]) returns 127.0.0.1. Presumably this resolves 'zoidberg' the same way as socket.socket.bind.
  • socket.gethostbyname is not defined in python files (C interface), so I can't trace.
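The bind chain in the notes above can be checked from Python without netstat by asking the listening socket which address it ended up on. This is a minimal sketch: bound_address is a hypothetical helper, it uses a throwaway TCPServer with the stock BaseRequestHandler, and port 0 is passed so the OS picks any free port.

```python
import socketserver


def bound_address(host, port=0):
    """Bind a throwaway TCPServer and return the (ip, port) it bound to.

    Hypothetical helper for diagnosis; no requests are ever served.
    """
    # TCPServer.__init__ binds and listens by default; the context manager
    # closes the socket again on exit.
    with socketserver.TCPServer((host, port), socketserver.BaseRequestHandler) as server:
        return server.socket.getsockname()


if __name__ == "__main__":
    print(bound_address("127.0.0.1"))
```

Calling it once with 'zoidberg' and once with the literal '10.1.2.3' would show whether the name lookup alone accounts for the difference, since a numeric address should not need to go through host-name resolution.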

This may or may not be a python problem. Every other program tested on the system so far (ping, curl, chrome) all resolve 'zoidberg' correctly to '10.1.2.3'. Python is the only oddball at this point.

Contents of /etc/hosts

##
# Host Database
#
# localhost is used to configure the loopback interface
# when the system is booting.  Do not change this entry.
##

10.1.2.3        zoidberg.local zoidberg  # loopback

127.0.0.1   localhost
255.255.255.255 broadcasthost
::1             localhost
# Added by Docker Desktop
# To allow the same kube context to work on the host and the container:
127.0.0.1 kubernetes.docker.internal

Contents of /etc/resolv.conf

Looks ok to me. Not really used on mac. Last updated Jan 2020.

#
# macOS Notice
#
# This file is not consulted for DNS hostname resolution, address
# resolution, or the DNS query routing mechanism used by most
# processes on this system.
#
# To view the DNS configuration used by this system, use:
#   scutil --dns
#
# SEE ALSO
#   dns-sd(1), scutil(8)
#
# This file is automatically generated.
#
nameserver 8.8.4.4
nameserver 194.168.4.100
nameserver 8.8.8.8

_______________________ Update 1 _______________________

Thanks to commenters, I realized that python only recognizes comments in /etc/hosts if the line starts with #. So my /etc/hosts entry for zoidberg wasn't treated as two names and a comment, but as four names, including # and loopback. This is what python console shows:

>>> socket.gethostbyname_ex ('zoidberg')
('zoidberg.local', ['zoidberg','#','loopback'], ['10.1.2.3', '127.0.0.1', '192.168.1.62', '169.254.1.28', '10.1.2.3'])

Here's what's interesting / odd about that:

  • despite '#' and 'loopback' being treated as aliases for zoidberg.local (at least in python), it doesn't seem to have any ill effect. Trying to ping loopback or # (to force name resolution) just results in ping: cannot resolve loopback: Unknown host. Whatever method ping uses to resolve hostnames appears to be different than what python socket uses.
  • 10.1.2.3 is both the first and last IP address in the list. No idea why it appears twice.
  • 127.0.0.1 is there, even though I never explicitly defined zoidberg as an alias for it. Perhaps because both 127.0.0.1 and 10.1.2.3 are registered as addresses for lo0, the loopback device on mac? perhaps python picks up 127.0.0.1 by examining the underlying network device lo0?
  • 127.0.0.1 is second in the list from gethostbyname_ex, but it's the only address returned by gethostbyname.
  • what's really bizarre is that both the machine's wifi address (192.) and physical ethernet address (169.) are returned for 'zoidberg'. These addresses are never associated with zoidberg in /etc/hosts or anywhere else I know. They're not even mentioned in /etc/hosts; those addresses are assigned dynamically by other devices (routers) using DHCP on separate network interfaces (en0 and en1).

Why would python gethostbyname_ex pick up 192 and 169 addresses from other network interfaces? Is gethostbyname_ex polling all network interfaces for addresses? That seems really strange.
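Going back to the '#' and 'loopback' aliases at the top of this update: the effect is what one would expect from a parser that only treats '#' as a comment marker at the start of a line. The toy parser below illustrates that difference. It is only a sketch under that assumption, not the code the system resolver actually runs, and parse_hosts_line is an invented name.

```python
def parse_hosts_line(line, inline_comments=True):
    """Split one /etc/hosts line into (address, names), or None if empty.

    Toy illustration only. With inline_comments=False, '#' is honored only
    at the start of the line, so a trailing '# loopback' becomes two names.
    """
    if line.lstrip().startswith("#"):
        return None  # whole-line comment
    if inline_comments:
        # Drop everything from the first '#' onward.
        line = line.split("#", 1)[0]
    fields = line.split()
    if not fields:
        return None  # blank line
    return fields[0], fields[1:]


if __name__ == "__main__":
    entry = "10.1.2.3        zoidberg.local zoidberg  # loopback"
    print(parse_hosts_line(entry))
    print(parse_hosts_line(entry, inline_comments=False))
```

With inline comments honored, the entry yields the two intended names; without, '#' and 'loopback' show up as extra names, matching the alias list seen in the console.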

I removed the '# loopback' comment from /etc/hosts to see if it fixed things. Only result is that python no longer picks those up as aliases for zoidberg. However gethostbyname still returns 127.0.0.1 for zoidberg, and gethostbyname_ex still returns all the other IPs. Results:

>>> socket.gethostbyname ('zoidberg')
'127.0.0.1'
>>> socket.gethostbyname_ex ('zoidberg')
('zoidberg.local', ['zoidberg'], ['10.1.2.3', '127.0.0.1', '192.168.1.62', '169.254.1.28', '10.1.2.3'])

After removing '# loopback' from /etc/hosts I also tested the C call gethostbyname on my system. It returns the same IP list as python: 5 entries with 10.1.2.3 as first and last entry. Yet every other program on my system resolves 'zoidberg' to 10.1.2.3; python gethostbyname is the only one that resolves to 127.0.0.1. And only since a few days ago. Before that, python name resolution worked fine.

Perhaps it needs a few hours for /etc/hosts updates to propagate through the name resolution system? But if that were the case, why did python immediately stop picking up '#' and 'loopback' as aliases?
