Unable to read current version of Delta table in Azure ML Studio using a data asset


I am trying to create a data asset on ADLS Gen2 and read a Delta table stored in an ADLS Gen2 folder laid out like this:

/
└── my-data
    ├── _delta_log
    ├── part-0000-xxx.parquet
    └── part-0001-xxx.parquet

Currently, I create the data asset as a file dataset using the Azure ML v1 APIs, but reading the table returns all rows (even the deleted ones) rather than the most recent version.

I have also tried all the other data asset types for Azure ML v1/v2. Ideally, I want to read the most recent version of the Delta table and also have the option to select a different version.

No success. How can I resolve this?

2OG On BEST ANSWER

For the code below to work, you need to create an mltable data asset with the correct folder path.

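import time
import mltable
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# Current UTC time in ISO 8601 format; reading "as of now" gives the latest version
current_timestamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())

# Connect to the workspace and look up the registered data asset
ml_client = MLClient.from_config(credential=DefaultAzureCredential())
data_asset = ml_client.data.get("<enter your ml table name>", version="1")

# Read through the Delta log rather than listing every parquet file in the folder
tbl = mltable.from_delta_lake(delta_table_uri=data_asset.path, timestamp_as_of=current_timestamp)
df = tbl.to_pandas_dataframe()
df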
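
The question also asks for a way to switch versions. Besides `timestamp_as_of`, `mltable.from_delta_lake` accepts a `version_as_of` integer, so one possible approach is a small wrapper that picks one or the other. The sketch below is only an illustration: `delta_read_options` and `read_delta` are hypothetical helper names, not part of mltable, and it assumes exactly one of the two options should be passed per call.

```python
from datetime import datetime, timezone
from typing import Optional


def delta_read_options(version: Optional[int] = None,
                       as_of: Optional[datetime] = None) -> dict:
    """Build keyword arguments for mltable.from_delta_lake (hypothetical helper).

    Pass `version` to pin a Delta table version, `as_of` to read the table
    as of a point in time, or neither to use the current UTC time, which
    resolves to the latest committed version.
    """
    if version is not None and as_of is not None:
        raise ValueError("pass either version or as_of, not both")
    if version is not None:
        return {"version_as_of": version}
    moment = as_of if as_of is not None else datetime.now(timezone.utc)
    # Same timestamp format as in the answer above, e.g. 2022-10-01T00:00:00Z.
    # Naive datetimes are interpreted as local time by astimezone().
    stamp = moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {"timestamp_as_of": stamp}


def read_delta(delta_table_uri: str, **options):
    """Load the Delta table into a pandas DataFrame; needs the mltable package."""
    import mltable  # imported lazily so delta_read_options works without it

    tbl = mltable.from_delta_lake(delta_table_uri=delta_table_uri,
                                  **delta_read_options(**options))
    return tbl.to_pandas_dataframe()
```

With this in place, `read_delta(data_asset.path)` would read the latest version and `read_delta(data_asset.path, version=0)` the first one, assuming that version is still available in the Delta log.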
Bhavani On

You can follow the procedure below to read the current version of the Delta table:

Assign the Storage Blob Data Contributor role on the ADLS account to the Entra ID identity you use with the ML workspace. Then run the code below to read the current version of the Delta Lake table:

import time
import mltable
current_timestamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
path = "abfss://<containerName>@<storageAccountName>.dfs.core.windows.net/<deltaTablePath>"
tbl = mltable.from_delta_lake(delta_table_uri=path, timestamp_as_of=current_timestamp)
df = tbl.to_pandas_dataframe()
df

The resulting DataFrame contains the current version of the Delta table.

For more information, you can refer to this.
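
As background on why the file dataset in the question returns deleted rows: a Delta table keeps old parquet files on disk after a delete or update and records in `_delta_log` which files are currently part of the table. Reading every parquet file in the folder ignores that log. The sketch below illustrates the idea by replaying the `add` and `remove` actions from the JSON commit files. It is a simplified illustration, not how mltable is implemented: `active_files` is a hypothetical helper, and it ignores checkpoint files, so it only suits small tables whose log still starts at version 0.

```python
import json
from pathlib import Path
from typing import Optional, Set


def active_files(table_dir: str, version: Optional[int] = None) -> Set[str]:
    """Return the data files that make up the table at `version` (latest if None).

    Replays only the JSON commits in _delta_log; checkpoints are ignored.
    """
    live: Set[str] = set()
    log_dir = Path(table_dir, "_delta_log")
    # Commit files are named by zero-padded version, e.g. 00000000000000000001.json
    commits = sorted(p for p in log_dir.glob("*.json") if p.stem.isdigit())
    for commit in commits:
        if version is not None and int(commit.stem) > version:
            break
        for line in commit.read_text().splitlines():
            if not line.strip():
                continue
            action = json.loads(line)  # one JSON action per line
            if "add" in action:
                live.add(action["add"]["path"])
            elif "remove" in action:
                live.discard(action["remove"]["path"])
    return live
```

Any parquet file in the folder that is not in the returned set belongs to an older version, which is why a plain file listing shows rows that were deleted later.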