I have the following table - note there is a composite primary key consisting of 5 columns.
test_ts = Table('test_ts', meta,
Column('metric_id', ForeignKey('metric.id'), primary_key = True),
Column('entity_id', Integer, primary_key = True),
Column('date', DateTime, primary_key = True),
Column('freq', String, primary_key = True),
Column('context', String, primary_key = True, server_default=""),
Column('value', String),
Column("update_time", DateTime, server_default=func.now(), onupdate=func.current_timestamp()),
Column('update_by', String, server_default=func.current_user()))
I am trying to bulk insert some data into this POSTGRES table, but where the 'context' column may be an empty string (''):
date context value freq entity_id metric_id
1 1999-02-01T05:00:00.000Z test 101 D 1105 4
2 1999-02-02T05:00:00.000Z test 102 D 1105 4
8 1999-02-01T05:00:00.000Z 201 D 1105 4
9 1999-02-02T05:00:00.000Z 202 D 1105 4
When trying to do this however, i get the following error:
Traceback (most recent call last):
File "/workspaces/service/data_svc.py", line 121, in bulk_insert
cur.execute(sql_insert)
File "/usr/local/lib/python3.11/site-packages/psycopg/cursor.py", line 732, in execute
raise ex.with_traceback(None)
psycopg.errors.NotNullViolation: null value in column "context" of relation "test_ts" violates not-null constraint
DETAIL: Failing row contains (4, 1105, 1999-02-01 05:00:00, D, null, 201, 2024-03-27 21:23:42.778517, db_user).
You can see above, the row value contains a 'null' when it should be an empty string. I am not sure why the empty string is getting converted to null here - as it's not None or Nan, but explictly set to '' in my pandas dataframe.
The SQL im using are the following:
sql_create:
drop table if exists tmp_tbl; CREATE UNLOGGED TABLE tmp_tbl AS SELECT date,context,value,freq,entity_id,metric_id FROM test_ts LIMIT 0
sql_copy:
COPY tmp_tbl (date,context,value,freq,entity_id,metric_id) FROM STDIN (FORMAT CSV, DELIMITER "\t")
sql_insert:
insert into test_ts(date,context,value,freq,entity_id,metric_id)
select * from tmp_tbl on conflict(metric_id,entity_id,date,freq,context)
do update set value = EXCLUDED.value;drop table if exists tmp_tbl;
and the code snippet looks like this:
with psycopg.connect(self.connect_str, autocommit=True) as conn:
io_buf = io.StringIO()
df.to_csv(io_buf, sep='\t', header=False, index=False)
io_buf.seek(0)
with conn.cursor() as cur:
cur.execute(sql_create)
with cur.copy(sql_copy) as copy:
while data:=io_buf.read(self.batch_size):
copy.write(data)
cur.execute(sql_insert)
conn.commit()
I can work around this by using something like 'N/A' instead of ''...but would much prefer it to be an empty string. Any help is appreciated.