How to parametrize the table name using Cassandra's Python driver?

361 Views Asked by At

I have an instance method that is used to construct a Cassandra query, and I'm looking for a way to more safely parametrize the table name in a query without using string formatting.

For example,

def some_method(self, table_name):
    statement = f"SELECT COUNT(*) FROM {self.table_name}"
    results = self.cassandra.session.execute(query=statement)
    ...

I don't believe SimpleStatement or a PreparedStatement can be used to bind a table name.

1

There are 1 best solutions below

0
Gino Mempin On

If you are using DataStax's Python Driver for Cassandra (as of v3.28), there is no way to parametrize or bind the table name using Session.execute / Session.execute_async, because it only expects the column values to be in the parameters argument, and the table name to be hardcoded in the statement string. You'll have to resort to string formatting, which then makes your linters complain about possible SQL injections (which you then have to silence with # noqa and such).

I managed to work around this by using cassandra.cqlengine.models instead which is still part of the same Python driver package: https://docs.datastax.com/en/developer/python-driver/3.28/api/cassandra/cqlengine/models/.

You can dynamically create the model for your table during runtime (during which you can pass the table names from somewhere), and then use that model to perform the queries.

First, define an abstract model:

from cassandra.cqlengine import columns
from cassandra.cqlengine.models import Model
from cassandra.cqlengine.management import sync_table

class MyAbstractModel(Model):
    __abstract__ = True

    __keyspace__ = "your_keyspace"

    # The table name will be defined during runtime
    # __table_name__ = ...

    id = columns.Integer(primary_key=True)
    first_name = columns.Text()
    last_name = columns.Text()

Then on runtime, create a concrete model:

def get_table_name() -> str:
    # Return some table name based on some app-specific logic.
    return "Person"

table_name = get_table_name()
table_attrs = {"__table_name__": table_name}
person_model = type(
    table_name,
    (MyAbstractModel,),
    table_attrs,
)

# Not shown here is setting-up the connection to the cluster
# and setting-up a session, as that's not part of the question.
# Assume an active session is already available.
sync_table(model=person_model)

The type call is Python's built-in type function for creating a type, in this case a concrete Person class based on cassandra.cqlengine.models.Model. The arguments to type in order are:

If your code runs successfully, you'll see that the table is created:

cqlsh> DESC TABLE your_keyspace.person;

CREATE TABLE your_keyspace.person (
    id int PRIMARY KEY,
    first_name text,
    last_name text
) WITH ...

cqlsh> SELECT COUNT(*) FROM your_keyspace.person;

 count
-------
     0

Once you've got your table, you can now use that to make your queries as a replacement to using session.execute. Depending on how many tables you'll need to create, you'll probably need a mapping of the table names to their models:

all_models = {
    "person": person_model,
    "food": food_model,
    "drink": drink_model,
    ...
}

Copying the example from the question where the table name is a parameter of some_method:

def some_method(self, table_name):
    model = all_models.get(table_name)
    if not model:
      raise InvalidTableNameException

    row = model(id=1234, first_name="Foo", last_name="Bar")
    row.save()

Since the models are dynamically created, just make sure that

  • It is still a valid Python class name
  • It is still a valid Cassandra table name
  • The __table_name__ and the type name are consistent

Read the documentation on Making Queries with models.