GCP Dataflow Batch MongoDB to BigQuery error: schema does not match, field changed type from TIMESTAMP to STRING


I'm having an issue with the Dataflow batch template from MongoDB to BigQuery. Writing to BigQuery always fails with the following error:

Provided Schema does not match Table doji-stg:store_tradein_db.tradeIn. Field created has changed type from TIMESTAMP to STRING

The tradeIn table is already created in BigQuery with the following schema:

tradeIn table schema

I'm using a UDF to select the needed fields. Here is the UDF:

function process(inJson) {
  var obj = JSON.parse(inJson);

  var newObj = {};
  newObj._id = obj._id["$oid"];
  newObj.storeId = obj.storeId;
  newObj.gradeSurveyId = obj.gradeSurveyId;

  // Convert the MongoDB Extended JSON $date values to ISO-8601 strings
  var createdDate = new Date(obj.created["$date"]);
  newObj.created = createdDate.toISOString();
  var updatedDate = new Date(obj.updated["$date"]);
  newObj.updated = updatedDate.toISOString();

  newObj.customer = {
    name: obj.customer.name,
    email: obj.customer.email,
    phone: obj.customer.phone,
    documentType: obj.customer.document.type,
    documentValue: obj.customer.document.value
  };

  print("New TradeIn: " + JSON.stringify(newObj));

  return JSON.stringify(newObj);
}
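For context, the date handling in the UDF just unwraps the Extended JSON `$date` wrapper and re-serializes it. A minimal standalone check (the sample value is taken from the printed output below):

```javascript
// Sketch: the $date -> ISO-8601 conversion the UDF performs.
// Sample value copied from the printed record; document shape is
// assumed to follow MongoDB Extended JSON.
var sample = { created: { "$date": "2024-02-07T13:48:28.272Z" } };
var createdDate = new Date(sample.created["$date"]);
console.log(createdDate.toISOString()); // 2024-02-07T13:48:28.272Z
```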

And here is what the print statement outputs:

New TradeIn: {"_id":"65c38a2ce49e7e0ebef3ce1a","storeId":"653c23d2305be9703ab85790","gradeSurveyId":"65c38a124d995105d4bf2dce","created":"2024-02-07T13:48:28.272Z","updated":"2024-02-07T13:48:28.768Z","customer":{"name":"teste teste","email":"[email protected]","phone":"(11) 98989-8989","documentType":"CPF","documentValue":"152.207.360-43"}}

I have no idea why this error is thrown: I created a JSON file containing the record above and imported it into BigQuery manually, and it loaded with no error. The created field is in the correct ISO format, so it should be importable as a TIMESTAMP, not a STRING.
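To double-check my assumption, I verified that the printed output parses as valid JSON and that `created` is a parseable timestamp (abbreviated record copied from the output above; quick Node.js check):

```javascript
// Sanity check: the UDF output is valid JSON and `created`
// parses as a date. Record abbreviated from the printed output.
var out = JSON.parse(
  '{"_id":"65c38a2ce49e7e0ebef3ce1a",' +
  '"created":"2024-02-07T13:48:28.272Z",' +
  '"updated":"2024-02-07T13:48:28.768Z"}'
);
var createdMs = new Date(out.created).getTime();
console.log(!isNaN(createdMs)); // true -> `created` is a valid instant
```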

Any help is much appreciated.
