I'm trying to connect Spark to an H2 database for managing metadata.
The connection fails because DataNucleus cannot map the LONGVARCHAR JDBC type declared for one field (viewExpandedText) in the org.apache.hadoop.hive.metastore.model.MTable class.
Error Message:
24/03/29 00:03:20 WARN Schema: Exception when trying to get default schema name for datastore
org.h2.jdbc.JdbcSQLSyntaxErrorException: Column "IS_DEFAULT" not found [42122-224]
at org.h2.message.DbException.getJdbcSQLException(DbException.java:514)
at org.h2.message.DbException.getJdbcSQLException(DbException.java:489)
at org.h2.message.DbException.get(DbException.java:223)
24/03/29 00:03:20 WARN Query: Query for candidates of org.apache.hadoop.hive.metastore.model.MTableColumnStatistics and subclasses resulted in no possible candidates
Failed to generate new Mapping of type org.datanucleus.store.rdbms.mapping.java.StringMapping, exception : JDBC type LONGVARCHAR declared for field "org.apache.hadoop.hive.metastore.model.MTable.viewExpandedText" of java type java.lang.String cant be mapped for this datastore.
JDBC type LONGVARCHAR declared for field "org.apache.hadoop.hive.metastore.model.MTable.viewExpandedText" of java type java.lang.String cant be mapped for this datastore.
org.datanucleus.exceptions.NucleusException: JDBC type LONGVARCHAR declared for field "org.apache.hadoop.hive.metastore.model.MTable.viewExpandedText" of java type java.lang.String cant be mapped for this datastore.
at org.datanucleus.store.rdbms.mapping.RDBMSMappingManager.getDatastoreMappingClass(RDBMSMappingManager.java:1386)
I'm executing the following Spark command to initiate the session:
pyspark \
--packages io.delta:delta-spark_2.12:3.1.0 \
--jars /opt/spark/jars/h2-2.2.224.jar \
--conf spark.hadoop.datanucleus.fixedDatastore=true \
--conf spark.hadoop.datanucleus.autoCreateSchema=true \
--conf spark.hadoop.datanucleus.schema.autoCreateTables=true \
--conf "spark.hadoop.javax.jdo.option.ConnectionURL=jdbc:h2:mem:test;MODE=MSSQLServer;USER=sa;DB_CLOSE_DELAY=-1" \
--conf spark.hadoop.javax.jdo.option.ConnectionDriverName=org.h2.Driver \
--conf spark.hadoop.javax.jdo.option.ConnectionUserName=sa \
--conf spark.hadoop.javax.jdo.option.ConnectionPassword="" \
--conf spark.hadoop.hive.metastore.warehouse.dir=/tmp
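One detail of the command above is worth checking: the JDBC URL contains semicolons, which a shell treats as command separators unless the argument is quoted. The following is only a minimal sketch of how the same arguments could be assembled and quoted with Python's standard shlex module. The names CONF and build_pyspark_command are hypothetical helpers introduced for illustration; the values are copied from the command above.

```python
import shlex

# Values copied from the command above; the dict layout is an illustrative assumption.
CONF = {
    "spark.hadoop.datanucleus.fixedDatastore": "true",
    "spark.hadoop.datanucleus.autoCreateSchema": "true",
    "spark.hadoop.datanucleus.schema.autoCreateTables": "true",
    "spark.hadoop.javax.jdo.option.ConnectionURL":
        "jdbc:h2:mem:test;MODE=MSSQLServer;USER=sa;DB_CLOSE_DELAY=-1",
    "spark.hadoop.javax.jdo.option.ConnectionDriverName": "org.h2.Driver",
    "spark.hadoop.javax.jdo.option.ConnectionUserName": "sa",
    "spark.hadoop.javax.jdo.option.ConnectionPassword": "",
    "spark.hadoop.hive.metastore.warehouse.dir": "/tmp",
}


def build_pyspark_command(conf):
    """Return the pyspark invocation as a list of separate arguments."""
    args = [
        "pyspark",
        "--packages", "io.delta:delta-spark_2.12:3.1.0",
        "--jars", "/opt/spark/jars/h2-2.2.224.jar",
    ]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


command = build_pyspark_command(CONF)
# shlex.join quotes every argument that contains shell-special characters,
# so the semicolons in the JDBC URL stay inside a single argument.
command_line = shlex.join(command)
print(command_line)
```

Pasting the printed line into a shell passes the whole URL to pyspark as one --conf value. This addresses only the quoting; it is not expected to change the LONGVARCHAR mapping error itself.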
Request for Assistance: How can I resolve this mapping issue and connect Spark to the H2 database for managing metadata? Any insights, suggestions, or alternative approaches would be greatly appreciated.
Thank you in advance for your help!
Approaches Tried:
Checked DataNucleus configuration and verified relevant settings (spark.hadoop.datanucleus.fixedDatastore, spark.hadoop.datanucleus.autoCreateSchema, etc.).
Updated to the latest version of H2 database.
When I try to create the database directly with schematool, I get this error:
org.apache.hadoop.hive.metastore.HiveMetaException: Failed to load driver
Underlying cause: java.lang.ClassNotFoundException : org.h2.Driver
org.apache.hadoop.hive.metastore.HiveMetaException: Failed to load driver
at org.apache.hadoop.hive.metastore.tools.HiveSchemaHelper.getConnectionToMetastore(HiveSchemaHelper.java:97)
at org.apache.hive.beeline.HiveSchemaTool.getConnectionToMetastore(HiveSchemaTool.java:169)
at org.apache.hive.beeline.HiveSchemaTool.testConnectionToMetastore(HiveSchemaTool.java:475)
at org.apache.hive.beeline.HiveSchemaTool.doInit(HiveSchemaTool.java:581)
at org.apache.hive.beeline.HiveSchemaTool.doInit(HiveSchemaTool.java:567)
at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:1517)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.base/java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hadoop.util.RunJar.run(RunJar.java:330)
at org.apache.hadoop.util.RunJar.main(RunJar.java:245)
Caused by: java.lang.ClassNotFoundException: org.h2.Driver
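The ClassNotFoundException above means the JVM started by schematool did not find org.h2.Driver on its classpath. The --jars option in the pyspark command applies only to the Spark session, and schematool runs as a separate process. As a diagnostic sketch only, the following checks which jars in a given directory actually contain the driver class. jar_contains_class and find_jars_with_class are hypothetical helper names, and which directory schematool reads its jars from (often the Hive lib directory) is an assumption to verify for the installation at hand.

```python
import os
import zipfile


def jar_contains_class(jar_path, class_name):
    """Return True if the jar holds the .class entry for class_name."""
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


def find_jars_with_class(directory, class_name):
    """List the jars directly inside directory that contain class_name."""
    hits = []
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".jar"):
            continue
        path = os.path.join(directory, name)
        try:
            if jar_contains_class(path, class_name):
                hits.append(path)
        except zipfile.BadZipFile:
            # Skip files that are named .jar but are not valid archives.
            continue
    return hits


if __name__ == "__main__":
    # Directory taken from the --jars path in the pyspark command; replace it
    # with the directory schematool actually loads jars from.
    jar_dir = "/opt/spark/jars"
    if os.path.isdir(jar_dir):
        print(find_jars_with_class(jar_dir, "org.h2.Driver"))
```

An empty list for the directory that schematool uses would be consistent with the "Failed to load driver" error shown above.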