pandas df.to_sql: if a column value exists, replace or update the row


I'm using pandas df.to_sql to insert rows into a PostgreSQL database.

df.to_sql('example_table', engine, if_exists='replace', index=False)

example_table has 3 columns: 'id', 'name', 'datetime'

I want to add a check before inserting: if the datetime already exists in the table, replace or update the existing row.

Is there something like:

df.to_sql('example_table', engine, if_datetime_exists='replace', index=False)

There are 2 best solutions below

syed asad ali On

to_sql has no row-level version of if_exists. Try this instead:

import pandas as pd

# Create a DataFrame with the rows you want to insert
df_to_insert = pd.DataFrame({'name': ['Alice', 'Bob', 'Charlie'],
                             'datetime': pd.to_datetime(['2022-01-01', '2022-01-02', '2022-01-03'])})

# Read the existing rows from the database into a DataFrame,
# parsing "datetime" so both frames use the same dtype for the key
df_existing = pd.read_sql_query('SELECT * FROM example_table', engine, parse_dates=['datetime'])

# Stack the new rows after the existing ones and keep only the last row per
# "datetime", so a new row replaces an existing row with the same value.
# Rows added without an 'id' end up with a missing id in the combined frame.
df_merged = pd.concat([df_existing, df_to_insert], ignore_index=True)
df_merged = df_merged.drop_duplicates(subset='datetime', keep='last')

# Write the combined DataFrame back, dropping and recreating the table
df_merged.to_sql('example_table', engine, if_exists='replace', index=False)

This inserts rows whose datetime is not already in the database and updates existing rows whose datetime matches. Because it rewrites the whole table, it is only practical for small tables.

snhou On

if_exists refers to TABLES, not to rows.

df.to_sql('example_table', engine, if_exists='append', index=False)

replace: If the table exists, drop it, recreate it, and insert the data.

append: If the table exists, insert the data. If it does not exist, create it.
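
To make the table-level behaviour concrete, here is a minimal sketch. It uses an in-memory SQLite database as a stand-in for PostgreSQL (an assumption made only so the snippet runs anywhere), and the sample rows are made up:

```python
import sqlite3
import pandas as pd

# In-memory SQLite connection, used here only as a stand-in database
conn = sqlite3.connect(":memory:")

df = pd.DataFrame({
    "id": [1, 2],
    "name": ["Alice", "Bob"],
    "datetime": ["2022-01-01", "2022-01-02"],
})

# The first write creates the table
df.to_sql("example_table", conn, if_exists="replace", index=False)

# 'append' adds the rows again; it never compares column values
df.to_sql("example_table", conn, if_exists="append", index=False)
rows_after_append = int(
    pd.read_sql_query("SELECT COUNT(*) AS n FROM example_table", conn)["n"].iloc[0]
)

# 'replace' drops the whole table and recreates it from the DataFrame
df.head(1).to_sql("example_table", conn, if_exists="replace", index=False)
rows_after_replace = int(
    pd.read_sql_query("SELECT COUNT(*) AS n FROM example_table", conn)["n"].iloc[0]
)
```

After the append the table holds the two rows twice, and after the replace only the single row that was written last, so neither option looks at the datetime column.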


The easiest way is to delete the matching rows and then insert the new data. Example below:

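from sqlalchemy import MetaData, Table

metadata = MetaData()
datetime_list = df['datetime'].tolist()
your_table = Table('{your table name}', metadata, autoload_with=engine)
# Delete the rows whose datetime also appears in df, in a single transaction
with engine.begin() as conn:
    conn.execute(your_table.delete().where(your_table.c.datetime.in_(datetime_list)))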

After deleting the data, you can insert the new rows as usual with if_exists='append'.
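
The delete-then-insert steps can be combined so that both run in one transaction. The sketch below is one possible way to do it, not the only one: the helper name upsert_by_delete is hypothetical, and the demo uses an in-memory SQLite engine with made-up rows as a stand-in for PostgreSQL.

```python
import pandas as pd
from sqlalchemy import MetaData, Table, create_engine


def upsert_by_delete(df, engine, table_name, key="datetime"):
    """Delete rows whose key value appears in df, then append df (hypothetical helper)."""
    table = Table(table_name, MetaData(), autoload_with=engine)
    keys = df[key].tolist()
    # engine.begin() commits on success and rolls back if anything raises
    with engine.begin() as conn:
        conn.execute(table.delete().where(table.c[key].in_(keys)))
        df.to_sql(table_name, conn, if_exists="append", index=False)


# Demo on an in-memory SQLite database (stand-in for the real engine)
engine = create_engine("sqlite://")
old = pd.DataFrame({
    "id": [1, 2],
    "name": ["Alice", "Bob"],
    "datetime": ["2022-01-01", "2022-01-02"],
})
old.to_sql("example_table", engine, index=False)

new = pd.DataFrame({
    "id": [3, 4],
    "name": ["Bobby", "Charlie"],
    "datetime": ["2022-01-02", "2022-01-03"],
})
upsert_by_delete(new, engine, "example_table")

result = pd.read_sql_query(
    "SELECT name, datetime FROM example_table ORDER BY datetime", engine
)
```

Running the delete and the insert in the same transaction means a failed insert does not leave the table with the old rows already removed.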


Another way is to use on_conflict_do_update from SQLAlchemy's PostgreSQL dialect, but it depends on a primary key or unique constraint on the conflicting column. You can find some examples on this post.
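
For the on_conflict_do_update route, a rough sketch could look like the following. It assumes a unique constraint (or primary key) on the datetime column, which the question does not confirm. The function names build_upsert and upsert_on_datetime are hypothetical. The last lines only compile the statement to SQL text, so no database is needed to try it.

```python
from sqlalchemy import Column, DateTime, Integer, MetaData, String, Table
from sqlalchemy.dialects import postgresql
from sqlalchemy.dialects.postgresql import insert


def build_upsert(table, rows, key="datetime"):
    """Build INSERT ... ON CONFLICT (key) DO UPDATE for a list of row dicts."""
    stmt = insert(table).values(rows)
    # On conflict, overwrite every inserted column except the key itself
    update_cols = {col: stmt.excluded[col] for col in rows[0] if col != key}
    return stmt.on_conflict_do_update(index_elements=[key], set_=update_cols)


def upsert_on_datetime(pd_table, conn, keys, data_iter):
    """Callable for the `method` argument of DataFrame.to_sql."""
    rows = [dict(zip(keys, row)) for row in data_iter]
    conn.execute(build_upsert(pd_table.table, rows))


# Compile-only check: table definition and row are assumptions for illustration
example_table = Table(
    "example_table",
    MetaData(),
    Column("id", Integer),
    Column("name", String),
    Column("datetime", DateTime, unique=True),
)
sample_rows = [{"id": 1, "name": "Alice", "datetime": "2022-01-01"}]
sql_text = str(
    build_upsert(example_table, sample_rows).compile(dialect=postgresql.dialect())
)
```

With a real PostgreSQL engine it would be used as df.to_sql('example_table', engine, if_exists='append', index=False, method=upsert_on_datetime). Without the unique constraint on datetime, PostgreSQL rejects the ON CONFLICT clause.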