I have created a dataframe with two columns, "codes" and "filename". The file names were populated together with the directory path, e.g. ".\XML files\ABC123.xml". How do I remove ".\XML files\" so that only "ABC123.xml" is left?
I have tried
#1
df["filename"] = df["filename"].str.replace(".\XML files\","",regex=False)
errors with EOL while scanning string literal
#2
df["filename"] = df["filename"].str.replace("".\XML files\"","",regex=False)
unexpected character after line continuation character
#3
df["filename"] = df["filename"].str.replace(""".\XML files\""","", regex=False)
errors with EOL while scanning triple-quoted string literal
#4
df["filename"] = df["filename"].str.replace("XML files","", regex=False)
this works but leaves me with ".\\ABC123.xml"
After another iteration to remove the \
df["filename"] = df["filename"].str.replace("\\","", regex=False)
I am still left with ".ABC123.xml"
I am stuck here. How do I remove only the first "." and not the second one?
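You need to escape each \ in the search string. In a normal Python string literal \ starts an escape sequence, so \" is read as a literal quote character rather than the end of the string, which is why your first three attempts fail to parse. Doubling every backslash, including the trailing one, removes the whole prefix in one go:
df["filename"] = df["filename"].str.replace(".\\XML files\\", "", regex=False)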
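To check this on a small example, here is a minimal sketch. The sample rows (the `codes` values and `DEF456.xml`) are made up for illustration and are not from your data. It also shows an alternative that I am assuming fits your case: if you only ever want the file name, `ntpath.basename` from the standard library drops whatever directory part is there, using Windows path rules regardless of the operating system you run on.

```python
import ntpath

import pandas as pd

# Hypothetical sample data shaped like the dataframe in the question
df = pd.DataFrame({
    "codes": [1, 2],
    "filename": [".\\XML files\\ABC123.xml", ".\\XML files\\DEF456.xml"],
})

# Option 1: literal replacement, with every backslash doubled
stripped = df["filename"].str.replace(".\\XML files\\", "", regex=False)

# Option 2: keep only the last path component, whatever the directory is
basenames = df["filename"].map(ntpath.basename)

print(stripped.tolist())
print(basenames.tolist())
```

Note that a raw string does not help for this particular pattern: a raw string literal cannot end in a single backslash, so r".\XML files\" is still a syntax error.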