Why is query via SQLAlchemy session slower on first run

37 Views Asked by At

I am profiling an SQLAalchemy repository method to fetch orders as of a certain date

def test_profile_get_orders_asof():
    for i in range(1, 11):
        with DBContextManager() as session:
            data_repo = SqlAlchemyRepository(session)
            sut = data_repo.get_orders_asof
            sut = profile(
                sut, immediate=True, filename=f"get_orders_asof_{i}.prof"
            )
            sut(datetime(2022, 12, 1, tzinfo=pytz.utc))

And the total execution time is always higher for the first call of the method

[1.3190752999999988,
 0.2215163999999999,
 0.25437130000000036,
 0.19252480000000013,
 0.1886178,
 0.22220109999999996,
 0.21765510000000007,
 0.2034394999999999,
 0.19447940000000002,
 0.20421450000000013]

The method data_repo.get_orders_asof like something like this

from datetime import datetime
import pandas as pd

def get_orders_asof(self, date: datetime) -> pd.DataFrame:
    asof_date = date.strftime("%Y-%m-%d")
    column_types = {
        "id": "int",
        "quantity": "int",
        "price": "float",
        "date": "str"
    }
    return pd.read_sql(
        "SELECT * FROM Orders WHERE order_date <= ?", 
        con=self.session.get_bind(), params=(asof_date,), dtype=column_types
    )

and is executed against an SQL Server database. Why is the first call so much slower than the subsequent calls? Is the database caching anything?

0

There are 0 best solutions below