I have created a table of sorts from a list of file paths and file names, but I don't think I did it the best way. How should my code be written, and how could I build an actual pandas DataFrame with headers from it?
My code:
import os

current_dir = os.getcwd()
files_in_dir = os.listdir(current_dir)
for file_name in files_in_dir:
    file_path = os.path.join(current_dir, file_name)
    if "R1" in file_path:
        sample = file_name.split('_')[0]
        file_path2 = file_path.replace("R1", "R2")
        print(sample + '\t' + file_path + '\t' + file_path2)
Output (not a proper table):
AG16 /home/user/folder1/AG16_R1.fastq /home/user/folder1/AG16_R2.fastq
AG13 /home/user/folder1/AG13_R1.fastq /home/user/folder1/AG13_R2.fastq
AG2 /home/user/folder1/AG2_R1.fastq /home/user/folder1/AG2_R2.fastq
...
What I would like to add to the code above:
import pandas as pd
df = pd.DataFrame()
data = [sample, file_path, file_path2]
df = pd.DataFrame(data, columns=['sample', 'r1', 'r2'])
print(df)
Thank you very much!
Try this:
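A minimal sketch, assuming every forward-read file has `_R1` in its name and its mate differs only by `_R2`, as in the sample output in the question. The function name `build_sample_table` is just an illustrative choice. The idea is to collect one dict per sample in a list and build the DataFrame once at the end, rather than passing a single flat list of three values to `pd.DataFrame`, which does not match three columns.

```python
import os

import pandas as pd


def build_sample_table(directory):
    """Return a DataFrame with one row per R1 file found in `directory`."""
    rows = []
    for file_name in sorted(os.listdir(directory)):
        # Test the file name rather than the full path, so an "R1" in a
        # parent directory name cannot cause a false match.
        if "_R1" not in file_name:
            continue
        sample = file_name.split("_")[0]
        r1 = os.path.join(directory, file_name)
        # Assumption: the R2 mate has the same name with "_R1" -> "_R2".
        r2 = os.path.join(directory, file_name.replace("_R1", "_R2"))
        rows.append({"sample": sample, "r1": r1, "r2": r2})
    # Passing columns explicitly keeps the headers even if no file matched.
    return pd.DataFrame(rows, columns=["sample", "r1", "r2"])


if __name__ == "__main__":
    df = build_sample_table(os.getcwd())
    print(df)
```

If you still want the tab-separated file, `df.to_csv("samples.tsv", sep="\t", index=False)` should write it with the headers included (the file name here is only an example). Note that this sketch does not check that the R2 file actually exists on disk.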