OpenSearch - Bulk inserting a million rows from a Pandas dataframe


My dataframe has 1 million rows and around 1,500 columns. I need to insert these into an OpenSearch index.

I used the code below:

from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth
import opensearch_py_ml as oml

client = OpenSearch(
    hosts=[{"host": "****amazonaws.com", "port": 443}],
    http_auth=("user", "password"),
    use_ssl=True,
    verify_certs=True,
    timeout=600,
    max_retries=10,
    retry_on_timeout=True,
    connection_class=RequestsHttpConnection,
)

ml_df = oml.pandas_to_opensearch(
    df,
    os_client=client,
    os_dest_index="opensearch_index",
    os_if_exists="replace",
    os_refresh=True,
)
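For comparison, a lower-level alternative is opensearch-py's `helpers.bulk`, which gives direct control over chunk size and the per-request timeout. Below is a minimal sketch, not a drop-in fix: the index name, `_id` choice, chunk size, and timeout are all assumptions to tune.

```python
# Minimal sketch of a chunked bulk load with opensearch-py's bulk helper.
# Index name, chunk size, and timeout are assumptions, not recommendations.
import pandas as pd

def doc_actions(df, index_name):
    # One bulk action per dataframe row. The row's position is used as the
    # document _id here (an assumption -- use a real key column if one exists).
    cols = list(df.columns)
    for i, row in enumerate(df.itertuples(index=False, name=None)):
        yield {"_index": index_name, "_id": i, "_source": dict(zip(cols, row))}

def load(client, df, index_name="opensearch_index", chunk_size=500):
    # Imported here so the sketch above stays importable without a server.
    from opensearchpy import helpers
    # Smaller chunks plus a long per-request timeout reduce the chance of
    # bulk rejections and read timeouts on very wide (1,500-column) documents.
    return helpers.bulk(
        client,
        doc_actions(df, index_name),
        chunk_size=chunk_size,
        request_timeout=600,
    )
```

Because `doc_actions` is a generator, the 1-million-row frame is streamed in `chunk_size` batches rather than serialized into one giant request.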

With this I get either a bulk index error or a timeout error. I need a way to insert the data into OpenSearch without defining mappings for the index up front, since it has 1,500 columns, and I need to change the index settings to raise the field limit from the default 1,000 to 1,500. Any help is highly appreciated.
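The 1,000-field default is the `index.mapping.total_fields.limit` index setting. Raising it does not require declaring explicit mappings, since dynamic mapping still infers each field's type. A sketch of an index body with the limit raised, where the limit value and index name are assumptions:

```python
# Sketch: raise the index's total-fields limit at creation time instead of
# declaring 1,500 explicit mappings. Dynamic mapping infers each field type.
def index_body(field_limit=2000):
    return {
        "settings": {
            # Default is 1000; ~1,500-field documents need more headroom.
            "index.mapping.total_fields.limit": field_limit
        }
    }

# Usage (assumed index name):
# client.indices.create(index="opensearch_index", body=index_body())
```

One caveat worth checking in your version: `os_if_exists="replace"` likely deletes and recreates the index (as eland's `es_if_exists="replace"` does), which would discard this setting, so creating the index first and appending may be the safer combination.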

