How to remove parent directories from a filename in a dataframe?

I have created a dataframe which contains 2 columns "codes" and "filename". The file names were populated along with the directory path name e.g. ".\XML files\ABC123.xml". How do I remove ".\XML files\" to only leave "ABC123.xml"?

I have tried

#1

df["filename"] = df["filename"].str.replace(".\XML files\","",regex=False)

errors with EOL while scanning string literal

#2

df["filename"] = df["filename"].str.replace("".\XML files\"","",regex=False)

unexpected character after line continuation character

#3

df["filename"] = df["filename"].str.replace(""".\XML files\""","", regex=False)

errors with EOL while scanning triple-quoted string literal

#4

df["filename"] = df["filename"].str.replace("XML files","", regex=False)

This works but leaves me with ".\\ABC123.xml".

After trying another iteration to remove the backslashes:

df["filename"] = df["filename"].str.replace("\\","", regex=False)

I am still left with ".ABC123.xml".

I am stuck here. How do I remove only the first "." and not the second one?

There is 1 best solution below

C4stor On

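You need to escape every \ in the string you are searching for (the first argument), including the trailing one, because a single \ before the closing quote escapes that quote. For example: df["filename"] = df["filename"].str.replace(".\\XML files\\", "", regex=False)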
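
To check this end to end, here is a minimal sketch. The sample rows and the second file name (DEF456.xml) are made up for illustration and are not from the question. It also shows a possible alternative that is not part of the answer above: taking only the last path component with pathlib.PureWindowsPath, which may be more robust if the parent folders vary between rows.

```python
from pathlib import PureWindowsPath

import pandas as pd

# Hypothetical sample data mirroring the two columns described in the question.
df = pd.DataFrame(
    {
        "codes": [1, 2],
        "filename": [".\\XML files\\ABC123.xml", ".\\XML files\\DEF456.xml"],
    }
)

# Option 1: literal replace. Every backslash is doubled, including the last
# one, because a lone backslash before the closing quote would escape it.
replaced = df["filename"].str.replace(".\\XML files\\", "", regex=False)

# Option 2 (an alternative, assuming Windows-style paths): keep only the final
# path component. PureWindowsPath applies Windows rules on any operating system.
basenames = df["filename"].map(lambda p: PureWindowsPath(p).name)

print(replaced.tolist())
print(basenames.tolist())
```

For these sample rows both options leave just the bare file names, so either result can be assigned back to df["filename"].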