I have two files (CSV and text). File 1 has two columns: the first holds the gene ID and the second the gene name. File 2 has many columns, each containing a gene ID as part of a string, e.g. gene id(genome) or pseudo gene id(genome). I want to look up each gene ID in file 2 against the gene IDs in file 1, replace it with the corresponding gene name from file 1, and write the result to file 3.
File 1:
SPAR5_0024, coA binding domain protein
SPAR5_0025, hypothetical protein
SPAR5_0026, hypothetical protein
File 2:
SPAR5_0024(72.AFAX01.1.gb) SPAR5_0026(72.AFAX01.1.gbff) SPAR5_0025(72.AFAX01.1.gbff)
Desired output (file 3):
coA binding domain protein(72.AFAX01.1.gb) hypothetical protein(72.AFAX01.1.gbff) hypothetical protein(72.AFAX01.1.gbff)
With my code I am getting an empty file 3.
This is what I am running:
#!/usr/local/bin/perl -w
use strict;
use warnings;
my $file1 = "annot.txt";
my $file2 = "orthomcl.csv";
my $file3 = "combi.csv";
open (FILE1,"$file1") || die;
open (FILE2,"$file2") || die;
open (FILE3,">$file3") || die;
my @file1 = <FILE1>;
my @file2 = <FILE2>;
my %file1;
while ( my $value = <FILE1> ) {
    chomp $value;
    my @file1 = split /\s+/, $_;
    $file1{$value} = 1;
}
my %file2;
while (my $value = <FILE2>) {
    chomp $value;
    my @file2 = split /\s+/, $_;
    if ( $file1{ $value } ) {
        $file2 = $file1{ $file2 };
        print join( "\t" => @file2 ), $/;
    }
}
close (FILE1);
close (FILE2);
close (FILE3);
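The primary error is that the statements my @file1 = <FILE1>; and my @file2 = <FILE2>; consume all data from the files, so that there's nothing left to read for the while ( my $value = <FILE1> ) and while (my $value = <FILE2>) loops. Both loop bodies are therefore never executed. In addition, nothing is ever printed to FILE3: the only print statement writes to STDOUT, so file 3 stays empty either way.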
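A minimal sketch of one way to do the replacement, assuming file 1 is comma-separated as in the sample and that every gene ID in file 2 is a run of word characters immediately followed by an opening parenthesis. The subroutine names load_names and replace_ids are made up for this illustration, and the three file names are taken from the command line rather than hard-coded. Each file is read exactly once, and the output is printed to the file 3 handle.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build a hash mapping gene ID => gene name from file 1.
# Assumes one "ID, name" pair per line, split on the first comma only.
sub load_names {
    my ($path) = @_;
    my %name_for;
    open my $fh, '<', $path or die "Cannot open '$path': $!";
    while ( my $line = <$fh> ) {
        $line =~ s/^\s+|\s+$//g;    # trim whitespace and line ending
        next unless length $line;
        my ( $id, $name ) = split /\s*,\s*/, $line, 2;
        next unless defined $name;  # skip lines without a comma
        $name_for{$id} = $name;
    }
    close $fh;
    return \%name_for;
}

# Replace every known gene ID that directly precedes "(" with its name.
# IDs that are not in the hash are left unchanged.
sub replace_ids {
    my ( $line, $name_for ) = @_;
    $line =~ s/(\w+)(?=\()/exists $name_for->{$1} ? $name_for->{$1} : $1/ge;
    return $line;
}

# Runs only when three file names are given, e.g.
#   perl replace_ids.pl annot.txt orthomcl.csv combi.csv
if ( @ARGV == 3 ) {
    my ( $annot, $ortho, $combined ) = @ARGV;
    my $name_for = load_names($annot);
    open my $in,  '<', $ortho    or die "Cannot open '$ortho': $!";
    open my $out, '>', $combined or die "Cannot open '$combined': $!";
    while ( my $line = <$in> ) {
        print {$out} replace_ids( $line, $name_for );
    }
    close $in;
    close $out or die "Cannot close '$combined': $!";
}
```

Because the substitution only touches the ID itself, the original spacing and the (genome) suffixes in file 2 are carried over unchanged. If the IDs in file 2 can contain characters other than letters, digits and underscores, the \w+ pattern would need adjusting.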