I am trying to replace some patterns with other patterns in several files. For example, my input file looks like this:
>Genus_species_SRR13259292|ENSG00000000457_ENST00000367772
TACGCCGCGCACTTCACGCGAGAGCAGCTGCGCACTATCGTCCTGCCCCAGGTGCTGCTGGGCCTGCGAGACACCAGCACCCCCATCGTGGCCATCACCCTGCACAGCCTCGCCGTGCTGGTCTCCCTGCTCGGACCAGAGGTGGTTGTGGGCGGAGAAAGAACCAAGATCTTCAAACGCACTGCCCCCAGCTTTACAAAAACCACTGACCTCTCCCCAGAAGAC
and I want this output:
>Genus_species_Something_something|ENSG00000000457_ENST00000367772
TACGCCGCGCACTTCACGCGAGAGCAGCTGCGCACTATCGTCCTGCCCCAGGTGCTGCTGGGCCTGCGAGACACCAGCACCCCCATCGTGGCCATCACCCTGCACAGCCTCGCCGTGCTGGTCTCCCTGCTCGGACCAGAGGTGGTTGTGGGCGGAGAAAGAACCAAGATCTTCAAACGCACTGCCCCCAGCTTTACAAAAACCACTGACCTCTCCCCAGAAGAC
I have two list files. One contains my old patterns:
Genus_species_SRR13259292
and the other contains my new patterns:
Genus_species_Something_something
I tried to do this with sed. Here is my command:
while IFS= read -r line1 && IFS= read -r line2 <&3; do
    for f in *.fasta; do
        sed -e "s/${line1}/${line2}/g" "$f" > "${f%.fasta}_NewName.fasta"
    done
done < "List_oldpattern.txt" 3<"List_newpatterns.txt"
But this doesn't work. Maybe it is because the pattern is delimited by > and |?
If sed can't do this, might it be possible with awk?
Since the question has been tagged with awk, I propose we replace all of OP's current code with a single awk script ...

My sample .fasta files:

NOTE: files do not contain the comments

We'll make use of the paste command to append OP's old and new patterns into a single line; we'll use a | as the delimiter:

Now the awk script:

NOTE: assuming OP has more than one old/new pattern pair, this script has the added benefit of only scanning each *.fasta file once (as opposed to OP's current while/read/for/sed loop, which scans each .fasta file N times, where N is the number of old/new pattern pairs).

This generates:
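For reference, the paste-plus-awk approach described above could look roughly like the sketch below. This is not necessarily the exact script from the answer; it is a minimal reconstruction under a few assumptions: the old pattern is the whole first |-delimited field of a header line (minus the leading >), the intermediate file name pairs.txt and the sample file sample.fasta are hypothetical, and the sequence line is shortened for readability. The list file names and the _NewName.fasta suffix are taken from the question.

```shell
#!/bin/sh
# Sketch only: build sample inputs in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Old/new pattern lists, one entry per line (values from the question).
printf '%s\n' 'Genus_species_SRR13259292'         > List_oldpattern.txt
printf '%s\n' 'Genus_species_Something_something' > List_newpatterns.txt

# Sample input (hypothetical file name; sequence shortened for readability).
printf '%s\n' \
  '>Genus_species_SRR13259292|ENSG00000000457_ENST00000367772' \
  'TACGCCGCGCACTTCACGCG' > sample.fasta

# Join the two lists line by line as old|new (pairs.txt is a hypothetical name).
paste -d'|' List_oldpattern.txt List_newpatterns.txt > pairs.txt

awk '
BEGIN     { FS = OFS = "|" }
# First file (pairs.txt): remember each old header prefix and its replacement.
FNR == NR { map[">" $1] = ">" $2; next }
# First line of each .fasta file: close the previous output, derive the new name.
FNR == 1  { if (out) close(out)
            out = FILENAME
            sub(/\.fasta$/, "_NewName.fasta", out) }
# Header lines whose first field is a known old pattern get that field swapped.
/^>/ && ($1 in map) { $1 = map[$1] }
# Every line (changed or not) goes to the per-file output.
{ print > out }
' pairs.txt *.fasta

cat sample_NewName.fasta
```

With the sample data above, sample_NewName.fasta holds the renamed header followed by the unchanged sequence line. Note that this sketch matches the whole first field exactly, whereas the sed command in the question does a substring replacement; if the old patterns can appear elsewhere in a header, the lookup would need to change. As with the original loop, rerunning it in the same directory would also pick up the *_NewName.fasta files, since they match *.fasta. One likely reason the original loop appears not to work with several pairs is that every pass of the outer loop regenerates each _NewName.fasta from the unmodified input, so only the last pair's substitution survives.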