Hive query on HUE shows different timestamp than programmatically/on data


I have an external Hive (2.1.1) table with a timestamp column. It is stored in a columnar format, as GZip-compressed Parquet files in an S3 bucket.

Visualizing the parquet file with ParquetViewer, I get the following record (censored a column due to sensitive data):

[Screenshot: the record in ParquetViewer, with one column censored]

If I SELECT this specific record in HUE, I get the following result instead:

SELECT end_dt_ym, pod, end_dt_d, bill_ordr, period_start FROM <table> WHERE pod='<sensitive data>' AND end_dt_ym='202202'

[Screenshot: the HUE query result]

The timestamp shown in HUE is one hour later than the value stored in the Parquet file.

What could be causing this issue? How would I fix this? I appreciate any help.

I've tried casting to string using date_format(), but still no dice.

I also tried to mess around with timezones but apparently Hive already defaults to UTC anyway.
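
To make the symptom concrete: a gap of exactly one hour is what you would see if the stored value were treated as UTC and then rendered in a UTC+1 zone on read. I have not confirmed that this is what Hive or HUE does here. The snippet below is plain Python, not Hive code, and both the sample timestamp and the UTC+1 offset are made-up assumptions chosen only to reproduce the gap.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical stored value: the instant as it appears in the Parquet file,
# interpreted as UTC (an assumption, not the real record).
stored_utc = datetime(2022, 2, 1, 0, 0, 0, tzinfo=timezone.utc)

# Assumed reader zone: a fixed UTC+1 offset, picked only to match the one-hour gap.
reader_zone = timezone(timedelta(hours=1))

# The same instant, rendered in the reader's zone.
as_displayed = stored_utc.astimezone(reader_zone)

# Compare the wall-clock readings by dropping the zone information.
shift = as_displayed.replace(tzinfo=None) - stored_utc.replace(tzinfo=None)

print(stored_utc.strftime("%Y-%m-%d %H:%M:%S"))    # value as stored
print(as_displayed.strftime("%Y-%m-%d %H:%M:%S"))  # value as a UTC+1 reader would show it
print(shift)                                       # wall-clock difference between the two
```

If that is the mechanism, the underlying instant is unchanged and only its rendering differs, which would also fit date_format() returning the shifted value.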
