I'm trying to generate a CREATE TABLE statement for a CSV file by mapping the pandas DataFrame dtypes to SQL data types. However, a date column ends up mapped to a float data type.
import pandas as pd

def generate_table_creation_query(csv_file_path, table_name):
    df = pd.read_csv(csv_file_path)
    sql_data_types = {
        'int64': 'INT',
        'float64': 'FLOAT',
        'object': 'NVARCHAR(MAX)',
        'datetime64': 'DATETIME'
    }
    column_definitions = []
    for column_name, dtype in df.dtypes.items():
        column_type = sql_data_types.get(str(dtype), 'NVARCHAR(MAX)')
        column_definitions.append(f'{column_name} {column_type}')
    columns_str = ',\n'.join(column_definitions)
    table_creation_query = f'''
CREATE TABLE {table_name} (
{columns_str}
)
'''
    return table_creation_query

# Example usage:
csv_file_path = 'transactions.csv'
table_name = 'transactions'
sql_query = generate_table_creation_query(csv_file_path, table_name)
print(sql_query)
The code returns END_DATE as a float data type. Here is the resulting output:
CREATE TABLE transactions (
CUST_ID NVARCHAR(MAX),
START_DATE NVARCHAR(MAX),
**END_DATE FLOAT,**
TRANS_ID NVARCHAR(MAX),
DATE NVARCHAR(MAX),
YEAR INT,
MONTH INT,
DAY INT,
EXP_TYPE NVARCHAR(MAX),
AMOUNT FLOAT
)
So how can I correctly convert all the data types?
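If a column holds dates, convert it with pd.to_datetime before building the table definition. read_csv does not parse dates by default, so date columns arrive as object, or as float64 when the column is entirely empty, which is probably what happened to END_DATE. Note also that str(dtype) for a converted column is a string such as 'datetime64[ns]', which the 'datetime64' key in your mapping does not match, so the lookup needs adjusting as well.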
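
A minimal sketch of that suggestion, assuming you know which columns are dates and pass them in explicitly. The `date_columns` parameter, the `sql_type_for` helper, and the in-memory sample CSV are hypothetical and only there for illustration; the type mapping is the one from the question, applied through the dtype checks in `pandas.api.types` rather than string keys.

```python
import io

import pandas as pd
from pandas.api import types as ptypes


def sql_type_for(dtype):
    """Map a pandas dtype to a SQL type name (hypothetical helper)."""
    if ptypes.is_datetime64_any_dtype(dtype):
        return 'DATETIME'
    if ptypes.is_integer_dtype(dtype):
        return 'INT'
    if ptypes.is_float_dtype(dtype):
        return 'FLOAT'
    # Fall back to a wide text type, as in the original mapping.
    return 'NVARCHAR(MAX)'


def generate_table_creation_query(csv_source, table_name, date_columns=()):
    df = pd.read_csv(csv_source)
    for column_name in date_columns:
        # errors='coerce' turns missing or unparseable values into NaT,
        # so an all-empty column still ends up with a datetime dtype.
        df[column_name] = pd.to_datetime(df[column_name], errors='coerce')
    column_definitions = [
        f'{column_name} {sql_type_for(dtype)}'
        for column_name, dtype in df.dtypes.items()
    ]
    columns_str = ',\n'.join(column_definitions)
    return f'CREATE TABLE {table_name} (\n{columns_str}\n)'


# Made-up sample data: END_DATE is entirely empty, so read_csv
# would otherwise infer float64 for it.
sample_csv = io.StringIO(
    'CUST_ID,START_DATE,END_DATE,AMOUNT\n'
    'C1,2020-01-01,,10.5\n'
    'C2,2020-02-01,,20.0\n'
)
sql_query = generate_table_creation_query(
    sample_csv, 'transactions', date_columns=['START_DATE', 'END_DATE']
)
print(sql_query)
```

Checking the dtype with the `is_*_dtype` functions avoids depending on the exact dtype string, which includes a unit suffix for datetime columns. If the date format is known and consistent, passing `parse_dates` to `read_csv` may be an alternative, though an all-empty column might still need the explicit conversion.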