I am trying to write a Python application that saves data into Hive running on a Hadoop cluster. I initially tried PyHive, but I couldn't insert more than 2 rows before getting a cryptic error. I later learned that PyHive is no longer supported, so I am now switching to Impyla. I have written the following code:
from impala.dbapi import connect
connection = connect(host='node1', port=10000, database='default', user="hive")
cur = connection.cursor()
Unfortunately, this code generates the following error:
impala.error.HiveServer2Error: Failed after retrying 3 times
In comparison, my PyHive connect statement looked like this; it connected successfully and let me run SELECT statements:
from pyhive import hive
connection = hive.connect(host='node1', port=10000, database='default', username='hive')
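Since the PyHive call works against the same host and port, one difference that may be worth ruling out is the authentication mechanism. As far as I can tell, impyla's connect() defaults to auth_mechanism='NOSASL', while PyHive negotiates SASL by default. The following is only a sketch under the assumption that the Thrift server on node1 expects SASL PLAIN; impyla_connect_kwargs and open_cursor are hypothetical helper names, not part of impyla, and PLAIN may require additional SASL packages to be installed alongside impyla.

```python
def impyla_connect_kwargs(host, port=10000, database="default", user="hive",
                          auth_mechanism="PLAIN"):
    """Build keyword arguments for impala.dbapi.connect (hypothetical helper)."""
    return {
        "host": host,
        "port": port,
        "database": database,
        "user": user,
        # impyla's default is 'NOSASL'; 'PLAIN' is an assumption about what
        # the Thrift server on node1 expects, not a confirmed fix.
        "auth_mechanism": auth_mechanism,
    }


def open_cursor(**kwargs):
    """Open an impyla connection and return (connection, cursor)."""
    # Imported lazily so the kwargs helper can be used without impyla installed.
    from impala.dbapi import connect

    connection = connect(**kwargs)
    return connection, connection.cursor()
```

Calling open_cursor(**impyla_connect_kwargs('node1')) and then running cur.execute('SELECT 1') on the returned cursor would show whether the retry error goes away with this setting.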
node1 is the namenode of a Hadoop cluster (a virtual cluster consisting of a couple of virtual machines).
I started the ThriftServer on my namenode with the following command:
start-thriftserver.sh --hiveconf hive.server2.thrift.port=10000
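To rule out the server side independently of either client library, a plain TCP check against the Thrift port can be run from the machine where the Python application lives. This is a minimal sketch using only the standard library; port_is_open is a hypothetical helper name, and a successful check only shows that something is listening on the port, not that it speaks the expected protocol.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For the setup above, the call would be port_is_open('node1', 10000).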