I can successfully connect to SQL server database with python script, execute the query and store the output in a dataframe, however, there are few columns with varbinary(max) datatype in the DB which do not return the column in the right format in the dataframe. Any pointers in the right direction will be appreciated.
I am using python 3.6
Example below (it's representative, not the exact values):
If I run the query in sql server for varbinary column, it returns value in following format (Hex):
0x464D520020323000000000E30000012001400
But value stored in the dataframe is in byte: b'E"\x11\x11\x99\x00UF\n\x80A\x91\x87.\x81\xa1\x80\x08\x10\xc1\xb17\x03#B\xe0\x0f\x1a\x1f\x03\x80i"\x17\xc3\x91B\x0b.\xc4Ax\x06"\xc4a\x07\x9c\x11\x84\xd1=\x88\x1e\x851\xab\x91\x04EA\x98\x88\r\x85A\x95\x8c\x0b\x05\x91\x97\x8b/\x05\xf1\x17\x83\x07\x86\xb0?\x85,G\x91\x13\x05\x1f\x08\x01Z\x0c'\xc8\xa0l\x8a\x01H\xe1A\x84\x04\t\xb1\x9b\x82\x1fJ\x81Z\x8a!\x8bQ[\x06\x1aK\xb1\xa9\x07\x06M\xb0G\x86;\xcd\xb0\x80\t?
Representative code below:
import pyodbc
import pandas as pd
cn=pyodbc.connect() #details of DB not included here
cr=cn.cursor()
query="""select * from tablename""" #repsentative query
DF=pd.read_sql(query,cn)
Following the below link it looks like varbinary(max) is converted to byte datatype while importing using python, in this case what is the best way to preserver the original value imported from sql server?
I have found the solution, in the sql query perform convert to varchar(max) in select query and it returns the expected output. Only downside is to specify all the columns instead of using "*".
Thanks to those who took the time to respond, it was helpful.