Read SQL from AWS Athena with Polars


I want to read query results from AWS Athena with Polars. Is this possible? Previously I used pandas:

import pandas as pd

pd.read_sql(SQL_STATEMENT, conn)

I found this page in the User Guide: https://pola-rs.github.io/polars-book/user-guide/howcani/io/read_db.html but it does not list Athena as supported yet.

There are 2 best solutions below

Dean MacGregor

The good news is that the doc that you linked isn't the full list of databases that are supported.

Polars uses two database connection libraries (or engines):

  1. ConnectorX

  2. Apache Arrow ADBC

The bad news is that neither of those seems to support Athena. For the time being, your best bet is probably to keep using pandas for Athena queries and then convert the result with pl.from_pandas(...).

Filippo Vitale

To get a Polars DataFrame from Athena without going through pandas, I would:

  1. use pyathena with its ArrowCursor to fetch the result as an Arrow Table object
  2. then convert it to Polars with an (almost) zero-copy operation such as polars.from_arrow

However, for more control over how the Athena query is executed, I would suggest having a look at aws-sdk-pandas. It is far more advanced than the pd.read_sql example in this question.