I've set up a two-node Hadoop instance on Google Compute Engine. I got a dynamic external IP address, 108.59.84.14. I also set up a proper Hadoop configuration, so the standard Hadoop YARN ports are used, for instance 8088 (the ResourceManager web UI).
To grant external access I ran the following command (it contains the standard Hadoop ports):
gcutil addfirewall web --network=hadoop-access --allowed=tcp:8088,tcp:50060,tcp:50070,tcp:50075,tcp:50090,tcp:20000,tcp:20001 --project=<project>
Now, after having started hadoop and after having checked that everything works fine:
hadoop@namenode:~$ jps
8057 NameNode
8451 ResourceManager
8164 DataNode
8544 NodeManager
8306 SecondaryNameNode
8835 Jps
I'd like to access the web interface. To do so, I open the Chrome browser and enter 108.59.84.14:port, where port is for instance 8088. However, none of the ports work. I always get an error message stating that the page does not exist.
What am I doing wrong?
There are a few possibilities; you may want to use the tcpdump tool to determine which of these cases you've hit. When using tcpdump, remember to restrict the capture to port 8088, e.g.
tcpdump -vv port 8088
The GCE firewall may be blocking the traffic. This could be due to the firewall rule being applied to the wrong network (as Benson suggests). You can check this with tcpdump: if the GCE firewall is the cause, no traffic shows up on the instance when you try to connect.
Your image may have a Linux firewall that's blocking traffic. For this and the following case, you should see request packets arrive on 8088, followed by either a TCP RST (reset) or no response traffic. In the case of no response, this is almost certainly a Linux firewall.
The Hadoop software may only be listening on localhost. In this case you will see a TCP RST. You can tell the difference between this and the above case by running
netstat -lpn -A inet
to list all the listening programs and the addresses they are listening on.
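If you'd rather check from the client side first, the cases above look roughly like this from outside: a dropped packet shows up as a timeout, and a TCP RST shows up as "connection refused". Below is a minimal sketch in Python, not a definitive tool. The function names `probe` and `loopback_only` are my own (hypothetical), and the 3 second timeout is an arbitrary assumption.

```python
import ipaddress
import socket


def probe(host, port, timeout=3.0):
    """Attempt one TCP connection and report how it ended.

    "open"    -> the handshake completed, something is listening
    "refused" -> a TCP RST came back
    "timeout" -> no answer at all, the packets were probably dropped
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        # Anything else (no route, DNS failure, ...) is reported as-is.
        return "error: %s" % exc


def loopback_only(local_address):
    """Return True if a netstat "Local Address" value is a loopback address.

    Expects the "address:port" form printed by netstat,
    e.g. "127.0.0.1:8088" or "0.0.0.0:8088".
    """
    host = local_address.rsplit(":", 1)[0]
    return ipaddress.ip_address(host).is_loopback
```

For example, `probe("108.59.84.14", 8088)` returning "timeout" would point at one of the firewalls, while "refused" would match the RST cases. If netstat then shows 127.0.0.1:8088 as the local address (so `loopback_only("127.0.0.1:8088")` is true), Hadoop is bound to localhost only; 0.0.0.0:8088 means it listens on all interfaces.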