AttributeError: 'pyarrow.lib.Table' object has no attribute 'to_reader'


I'm trying to run an LLM on tabular data in Python on Databricks (Microsoft Azure). When I run the code in Jupyter I don't get any error, but when I run it in Databricks I get this error:

AttributeError: 'pyarrow.lib.Table' object has no attribute 'to_reader'

The code is:

from arize.pandas.embeddings.tabular_generators import EmbeddingGeneratorForTabularFeatures
import arize.pandas.embeddings.base_generators

# EmbeddingGeneratorForTabularFeatures.list_pretrained_models()

generator = EmbeddingGeneratorForTabularFeatures(
    model_name="distilbert-base-uncased",
    tokenizer_max_length=512,
    # dropout=0,  # remove dropout
)

tabular_vector_columns = []  # list of tabular vector column names
prompt_columns = []          # list of prompt column names

# Iterate over each column set
# (split_prompt_n, cols_per, train_X and test_X are defined elsewhere, not shown)
for i in range(split_prompt_n):
    tab_vec_col_name_i = 'tabular_vector_' + str(i)
    prompt_col_name_i = 'prompts_' + str(i)
    tabular_vector_columns.append(tab_vec_col_name_i)
    prompt_columns.append(prompt_col_name_i)

    # train_X
    train_X[tab_vec_col_name_i], train_X[prompt_col_name_i] = generator.generate_embeddings(
        train_X,
        selected_columns=cols_per[str(i)],
        return_prompt_col=True,
    )

    # test_X
    test_X[tab_vec_col_name_i], test_X[prompt_col_name_i] = generator.generate_embeddings(
        test_X,
        selected_columns=cols_per[str(i)],
        return_prompt_col=True,
    )

The error is raised at the train_X call inside the loop. I haven't found a solution to it.
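
Since the same code behaves differently in the two environments, one thing worth comparing is the installed pyarrow version. As far as I know, Table.to_reader only exists in newer pyarrow releases (8.0.0 and later, to my understanding), so an older pyarrow on the cluster would produce exactly this AttributeError. The sketch below is only a diagnostic to run in both Jupyter and Databricks; the helper name describe_pyarrow is made up for illustration and is not part of arize or pyarrow.

```python
import importlib


def describe_pyarrow():
    """Report whether pyarrow is installed, its version, and whether
    pyarrow.Table exposes a to_reader attribute in this environment."""
    try:
        pa = importlib.import_module("pyarrow")
    except ImportError:
        # pyarrow is not importable in this environment at all
        return {"installed": False, "version": None, "has_to_reader": False}
    return {
        "installed": True,
        "version": pa.__version__,
        "has_to_reader": hasattr(pa.Table, "to_reader"),
    }


# Run this in both environments and compare the printed dictionaries
print(describe_pyarrow())
```

If has_to_reader comes back False on Databricks but True in Jupyter, upgrading pyarrow in the notebook (for example with %pip install --upgrade pyarrow and then restarting the Python process) may be worth trying. That this resolves the error is an assumption, not something confirmed here.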
