Unable to resolve symbol: >Chr01 in this context

I have a FASTA file:

$ cat test.fasta
>Chr01
TTGTTGGAGGCGGAAGATTCGTCTTCAACAGTGATATAATTGATATTCGACCTTCTTATGGAAATGCATACTAGTGACAC
TTGCTTGTATCCTAAAATTTCACTTGGTTGCCTTGTTGCAGCATTTGGTGGGAGATTGCCTAACTTTGATGGCTCATTGG
GATTGTATCCAACTTTTAGAAATATCTTGTAATCGTTAGGGTCAAAATCTTCATCTATTGGCTTTGTAGGGAGTGCCCCA
TTCGATAAAGGATTTTGGGCTACAAACCCTATAAATGGCTTAGGGGAAAACTTTACTGCCTCAATTCGTTTTACCTGAAG
>Chr02
AGTTAGATCCTTTAGCGTGTTAGTTTGGAGATTAGAAGATTCGCCCCCGTCTTTCTTCCTTTTAGTGACATATTGGAGCG
GGGAAACTTACTTTCTTTCCATAAGATACAATACCCCCTTTATGAGATTTATTCAGGTTGGTGTGTACCTCCTTAGTAAT
AGGTTTGGCTTCATCAGTAGCCACCTCTGCTCTTTTTGTTATGGGATCTTCATTATTGCTTTTCATGCTATCATCATCTT
TTAGCTCTTTCACAATGCAATTCTTTAAGTATAACTTTGCATCGGCGAAGTGTGACTCAGCCTCGGTGAATGGCTCATCA
TCAGCAAATATCTTCTTCTCGACTTTACCGTCGTAGTATTTAGACATTGATGGTAGGTAGATGGAACTACTTTATTCTCA
TGTATCCAAGGCCTTCCAAGAAAGACGTTGTAAGAAGACTTTGCATCGATCGCATGTAGCCATGCACTTGATTTCATATC
>Scaf1
TTCAATGGTAATTTCCAATCTGATCGCACCTATGGCCCTTTGGCCCCCCTTGGTTGAATCGTTGGATCATCATACGACTT
TCTGAGAGTTCGTTCATGGGAATGCAAACTTTTTTCACAGTTCGAATTGGCAAAATGTTCGCTGAGGATCCTCCATCAAC
CAAAATTTGATTTACCCTTTCATCACGCATATAGCCAACTAGGTAAAATGGGTGGTTACAAAGAGTGTCACCTAGCAGAA
GATCGTCATTTATGAACATGACTTTTTCTTCCCAACTATTAACTTGTTGAGGAGTGGAATCGATGAGCTTTTCTGGAGAT
>Scaf2
AGTTCTAATGGTAGGTCATCACTCTTTTCTTCCCTTTTGTCTTCATGGTAACAAGAGGCATCGATTCCCTCATGGGAAAT
CTTCGTGCGGAACCAAGATGGAAAGAAATGCTCCAAGGTTATTGTATGTCGTGGCTTTTGAGGGTGGTGCATCTCCACTT
TTTCTTTCTTCAAAAGCCTAACAAGATTCTTTTCTGTAGGTCTTTTTACTGTTATTTTCCTTGTTGGTTGTTCTATTGAT
TCTTTTTCTAGGATCCTTTTGTGGTGCCTACGACAAGTCACCAACGTCCAACCTTCATTATCATCCGATTAATCGTCTCC

I would like to make the above FASTA file compatible with the PanSN-spec naming specification:

[sample_name][delim][haplotype_id][delim][contig_or_scaffold_name]

with

sample_name := string
delim := #
haplotype_id := number
contig_or_scaffold_name := string

The expected PanSN-spec-compatible output FASTA file:

>test1#1#Chr01
TTGTTGGAGGCGGAAGATTCGTCTTCAACAGTGATATAATTGATATTCGACCTTCTTATGGAAATGCATACTAGTGACAC
TTGCTTGTATCCTAAAATTTCACTTGGTTGCCTTGTTGCAGCATTTGGTGGGAGATTGCCTAACTTTGATGGCTCATTGG
GATTGTATCCAACTTTTAGAAATATCTTGTAATCGTTAGGGTCAAAATCTTCATCTATTGGCTTTGTAGGGAGTGCCCCA
TTCGATAAAGGATTTTGGGCTACAAACCCTATAAATGGCTTAGGGGAAAACTTTACTGCCTCAATTCGTTTTACCTGAAG
>test1#1#Chr02
AGTTAGATCCTTTAGCGTGTTAGTTTGGAGATTAGAAGATTCGCCCCCGTCTTTCTTCCTTTTAGTGACATATTGGAGCG
GGGAAACTTACTTTCTTTCCATAAGATACAATACCCCCTTTATGAGATTTATTCAGGTTGGTGTGTACCTCCTTAGTAAT
AGGTTTGGCTTCATCAGTAGCCACCTCTGCTCTTTTTGTTATGGGATCTTCATTATTGCTTTTCATGCTATCATCATCTT
TTAGCTCTTTCACAATGCAATTCTTTAAGTATAACTTTGCATCGGCGAAGTGTGACTCAGCCTCGGTGAATGGCTCATCA
TCAGCAAATATCTTCTTCTCGACTTTACCGTCGTAGTATTTAGACATTGATGGTAGGTAGATGGAACTACTTTATTCTCA
TGTATCCAAGGCCTTCCAAGAAAGACGTTGTAAGAAGACTTTGCATCGATCGCATGTAGCCATGCACTTGATTTCATATC
>test1#1#Scaf1
TTCAATGGTAATTTCCAATCTGATCGCACCTATGGCCCTTTGGCCCCCCTTGGTTGAATCGTTGGATCATCATACGACTT
TCTGAGAGTTCGTTCATGGGAATGCAAACTTTTTTCACAGTTCGAATTGGCAAAATGTTCGCTGAGGATCCTCCATCAAC
CAAAATTTGATTTACCCTTTCATCACGCATATAGCCAACTAGGTAAAATGGGTGGTTACAAAGAGTGTCACCTAGCAGAA
GATCGTCATTTATGAACATGACTTTTTCTTCCCAACTATTAACTTGTTGAGGAGTGGAATCGATGAGCTTTTCTGGAGAT
>test1#1#Scaf2
AGTTCTAATGGTAGGTCATCACTCTTTTCTTCCCTTTTGTCTTCATGGTAACAAGAGGCATCGATTCCCTCATGGGAAAT
CTTCGTGCGGAACCAAGATGGAAAGAAATGCTCCAAGGTTATTGTATGTCGTGGCTTTTGAGGGTGGTGCATCTCCACTT
TTTCTTTCTTCAAAAGCCTAACAAGATTCTTTTCTGTAGGTCTTTTTACTGTTATTTTCCTTGTTGGTTGTTCTATTGAT
TCTTTTTCTAGGATCCTTTTGTGGTGCCTACGACAAGTCACCAACGTCCAACCTTCATTATCATCCGATTAATCGTCTCC

The Clojure script should take the following parameters:

  • the input FASTA file name,
  • the output FASTA file name,
  • the sample_name, for example test1, and
  • the haplotype_id, for example 1.

The following script:

(ns convertFASTA2PanSN
  (:require [clojure.java.io :as io]))

(defn transform-header [header sample-name haplotype-id]
  (str ">" sample-name "#" haplotype-id "#" (subs header 1)))

(defn process-fasta-file [input-file output-file sample-name haplotype-id]
  (with-open [rdr (io/reader input-file)
              wrt (io/writer output-file)]
    (doseq [line (line-seq rdr)]
      (if (.startsWith line ">")
        (let [new-header (transform-header line sample-name haplotype-id)]
          (.write wrt (str new-header "\n")))
        (.write wrt (str line "\n"))))))

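;; Takes the four positional command-line arguments.
(defn -main [input-file output-file sample-name haplotype-id]
  (process-fasta-file input-file output-file sample-name haplotype-id))

;; When run as a script (for example with clj or bb), the arguments
;; arrive in *command-line-args*.
(apply -main *command-line-args*)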

causes the error: Unable to resolve symbol: >Chr01 in this context

How can I fix it?

Answer by cfrick:

You are running your test.fasta file instead of your Clojure code. Since the file contains only tokens that the Clojure reader understands as symbols, there is no parse error. The reader reads >Chr01 as a symbol, and evaluation then fails because no such symbol is defined.
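
To see this for yourself, here is a small sketch you could paste into a REPL. It is only an illustration and not part of your script; the name form is made up for the example, and the exact wording of the exception message may vary between Clojure versions and runners.

(def form (read-string ">Chr01"))   ; the reader accepts ">Chr01" as a symbol

(println (symbol? form))            ; prints true: a valid symbol, not a syntax error

;; Evaluating the symbol is the step that fails, because nothing named
;; >Chr01 is defined in the current namespace.
(try
  (eval form)
  (catch Exception e
    (println "eval failed:" (.getMessage e))))

The sequence lines (TTGTTGG... and so on) are read as symbols in the same way, but evaluation already stops at the first form, which is why the error names >Chr01.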

Since you have not mentioned how you run it (clj, bb, lein, uberjar, ...), it's hard to give concrete advice, but it is roughly like this:

Assumed current call:

$ $CLOJURE test.fasta

Assumed correct call:

$ $CLOJURE script.clj test.fasta