Error with IbisTypeError using row validation from professional-services-data-validator tool

31 Views Asked by At

So I'm using an open-source tool to compare the row-by-row data between 2 tables sitting in 2 different databases in in MSSQL and the other in Snowflake.

The connection setups for both were fine. I could run the row count comparison successfully and got the count using the cmd below:

data-validation validate column -sc snowflake -tc snowflake -tbls MY_SCH.MY_TABLE

However I ran into issues when running the row-by-row comparison. For a test, I am comparing a snowflake table to itself. The cmd is as below:

data-validation validate row -sc snowflake -tc snowflake -tbls MY_SCH.MY_TABLE --primary-keys MY_PRIMARY_KEY --hash '*'

Where:

  • row: type of data validation,
  • -sc: source connection
  • -tc: target connection
  • tbls: table name(s)
  • --primary-keys: the primary key column being used to match between the 2 tables
  • --hash: * for validating all fields or a list of field names for a selected list only

This error message followed , I've tried comparing different tables, selected columns, using SQL Server database. All came out with the same error:

  File "C:\Users\admin\AppData\Local\Programs\Python\Python310\lib\site-packages\ibis\common\validators.py", line 467, in sequence_of
    raise IbisTypeError(f'Arg must have at least {min_length} number of elements')
ibis.common.exceptions.IbisTypeError: Arg must have at least 1 number of elements

The Github repo for the data validation tool is here: https://github.com/GoogleCloudPlatform/professional-services-data-validator

Appreciate any suggestions!

1

There are 1 best solutions below

0
Ray Hector On

The answer is as below:

data-validation validate row -sc snowflake -tc snowflake -tbls MY_SCH.MY_TABLE=MY_SCH.MY_TABLE --primary-keys MY_PRIMARY_KEY --hash *

The key is to follow the syntax of for parameter -tbls MY_SCH.MY_TABLE=MY_SCH.MY_TABLE and remove all quotations.