How to merge all CSV files in a specific folder using Python and os


How can I merge all CSV files in a specific folder using os?

The code below works, but it only concatenates the files in the directory where the script lives.

How can I use it on a different folder?

My code:

import os
import pandas as pd
import numpy as np



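def get_df():
    frames=[]
    for file in os.listdir():
        if file.endswith('.csv'):
            # on_bad_lines='skip' replaces error_bad_lines=False (pandas 1.3+)
            frames.append(pd.read_csv(file, on_bad_lines='skip'))
    # DataFrame.append was removed in pandas 2.0, so combine with pd.concat
    return pd.concat(frames) if frames else pd.DataFrame()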


df=get_df()


df.to_csv(f"file_name.csv")

I have tried passing the folder name, but then it doesn't find the files:

for file in os.listdir('My_folder_name\'):


There are 2 best solutions below

1
Peter On BEST ANSWER

In the documentation for the os module I found a function that changes the current working directory: os.chdir("C:\\Users\\Desktop\\my_folder_name\\my_new_folder_name").

https://www.tutorialsteacher.com/python/os-module

So I just added this line before the loop, and now it works!

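def get_df():
    # Switch the working directory so os.listdir() and read_csv see the target folder
    os.chdir("C:\\Users\\Desktop\\my_folder_name\\my_new_folder_name")
    frames=[]
    for file in os.listdir():
        if file.endswith('.csv'):
            # on_bad_lines='skip' replaces error_bad_lines=False (pandas 1.3+)
            frames.append(pd.read_csv(file, on_bad_lines='skip'))
    # DataFrame.append was removed in pandas 2.0, so combine with pd.concat
    return pd.concat(frames) if frames else pd.DataFrame()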


df=get_df()

df.to_csv(f"file_name.csv")
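
One thing to watch with os.chdir: the relative path in to_csv now points into the same folder, so file_name.csv is written next to the inputs and would be picked up as one more input on the next run. A possible alternative, sketched here under the assumption that only the files directly inside the folder matter, is to leave the working directory alone and join the folder path onto each name returned by os.listdir. The function name merge_csv_folder is made up for illustration.

```python
import os
import pandas as pd


def merge_csv_folder(folder):
    """Concatenate every .csv file found directly inside `folder`."""
    frames = []
    for name in sorted(os.listdir(folder)):
        if name.endswith(".csv"):
            # os.listdir() returns bare file names, so rebuild the full path.
            frames.append(pd.read_csv(os.path.join(folder, name)))
    # ignore_index=True renumbers the rows instead of repeating 0..n per file.
    return pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
```

It could then be called as merge_csv_folder(r"C:\Users\Desktop\my_folder_name\my_new_folder_name").to_csv("file_name.csv", index=False), which writes the result wherever the script was started rather than into the input folder.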
0
Gonçalo Peres On

There are various ways to solve this, depending on the type of merge you want to do.

Considering your specific requirements (Python and os), and assuming you want to concatenate the files, the following will do the job on a system where awk is available. It also handles files that share the same header, keeping the header line only once:

import os

os.system("awk '(NR == 1) || (FNR > 1)' file*.csv > merged.csv")

Here NR is the number of lines read so far across all input files, while FNR is the line number within the current file.

NR == 1 keeps the first line of the first file (the header), while FNR > 1 keeps every line except the first of each file, so the headers of the subsequent files are skipped.
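
Where awk is not installed (a stock Windows machine, for example), the same NR/FNR logic can be written with the standard library alone. This is only a sketch: the function name merge_keep_first_header and its parameters are made up for illustration, and it assumes that the first line of every file is its header.

```python
import glob
import os


def merge_keep_first_header(folder, output_path, pattern="*.csv"):
    """Copy all matching files into output_path, keeping only the first header."""
    paths = sorted(glob.glob(os.path.join(folder, pattern)))
    out_abs = os.path.abspath(output_path)
    wrote_any = False  # False only until the first line overall (NR == 1)
    with open(output_path, "w") as out:
        for path in paths:
            if os.path.abspath(path) == out_abs:
                continue  # never read the file that is being written
            with open(path) as src:
                # line_no plays the role of awk's FNR
                for line_no, line in enumerate(src, start=1):
                    if line_no == 1 and wrote_any:
                        continue  # header of a later file: skip it
                    # Add a newline if the file's last line lacks one.
                    out.write(line if line.endswith("\n") else line + "\n")
                    wrote_any = True
```

Calling merge_keep_first_header("my_folder_name", "merged.csv", pattern="file*.csv") would mirror the awk command above, with the folder given explicitly instead of relying on the current directory.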