Cast to long datatype - BigQuery

1.2k Views Asked by At

BigQuery and SQL noob here. I was going through possible data types big query supports here. I have a column in bigtable which is of type bytes and its original data type is scala Long. This was converted to bytes and stored in bigtable from my application code. I am trying to do CAST(itemId AS integer) (where itemId is the column name) in the BigQuery UI but the output of CAST(itemId AS integer) is 0 instead of actual value. I have no idea how to do this. If someone could point me in the right direction then I would greatly appreciate it.

EDIT: Adding more details
Sample itemId is 190007788462

Following is the code which writes itemId to the big table. I have included the relevant method. Using hbase client to write to bigtable.

import org.apache.hadoop.hbase.client._
def toPut(key: String, itemId: Long): Put = {
   val TrxColumnFamily = Bytes.toBytes("trx")
   val ItemIdColumn = Bytes.toBytes("itemId")
   new Put(Bytes.toBytes(key))
  .addColumn(TrxColumnFamily,
             ItemIdColumn,
             Bytes.toBytes(itemId))

}

Following is the entry in big table based on above code

ROW         COLUMN+CELL

foo         column=trx:itemId, value=\x00\x00\x00\xAFP]F\xAA

Following is the relevant code which reads the entry from big table in scala. This works correctly. Result is a org.apache.hadoop.hbase.client.Result

private def getItemId(row: Result): Long = {
  val key = Bytes.toString(row.getRow)
  val TrxColumnFamily = Bytes.toBytes("trx")
  val ItemIdColumn = Bytes.toBytes("itemId")

  val itemId =
    Bytes.toLong(row.getValue(TrxColumnFamily, ItemIdColumn))
  itemId

}

The getItemId function above correctly returns itemId. That's because Bytes.toLong is part of org.apache.hadoop.hbase.util.Bytes which correctly casts the Byte string to Long.

I am using big query UI similar to this one and using CAST(itemId AS integer) because BigQuery doesn't have a Long data type. This incorrectly casts the itemId byte string to integer and resulting value is 0.

Is there any way I can have a Bytes.toLong equivalent from hbase-client in BigQuery UI? If not is there any other way I can go about this issue?

1

There are 1 best solutions below

5
Elliott Brossard On BEST ANSWER

Try this:

SELECT CAST(CONCAT('0x', TO_HEX(itemId)) AS INT64) AS itemId
FROM YourTable;

It converts the bytes into a hex string, then casts that string into an INT64. Note that the query uses standard SQL, as opposed to legacy SQL. If you want to try it with some sample data, you can run this query:

WITH `YourTable` AS (
  SELECT b'\x00\x00\x00\xAFP]F\xAA' AS itemId UNION ALL
  SELECT b'\xFA\x45\x99\x61'
)
SELECT CAST(CONCAT('0x', TO_HEX(itemId)) AS INT64) AS itemId
FROM YourTable;