I am running the msa package to create a DNA alignment for the phangorn package, and it crashes with the error shown below.
I am running RStudio with R v4.3.1 on an M1 MacBook Pro. The call is:
mult <- msa(seqs, method="Muscle", type="dna", order="input")
That results in:
*** ERROR *** MSA::SetIdCount: cannot increase count
Fatal error, exception caught.
Error in msaFun(inputSeqs = inputSeqs, cluster = cluster, gapOpening = gapOpening, :
MUSCLE finished by an unknown reason
The input file contains 6440 DNA sequences with an average length of 1500 bp.
When I run MUSCLE in standalone mode on a Linux virtual server, I have to use the -super5 option (command below), because without it I get this warning followed by a crash:
WARNING: >1k sequences, may be slow or use excessive memory, consider using -super5
Floating point exception (core dumped)
(base) ubuntu@jrmicl-clovr:~/muscle$ muscle5 -super5 seqs.fas -output seqs_aln.fas
Any ideas? Is this a memory issue?
I dug into the MSA source code a little and found that the error comes from a compare function in the msa.cpp file (lines 651-665).
As I read the code, the mUID (the MUSCLE UID?) does not match the "regular" UID, wherever that comes from. ClustalOmega does not give me this error, so I will probably use that instead.
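For reference, the ClustalOmega fallback mentioned above might look roughly like the sketch below. It only changes the method argument of the original call and then converts the result for phangorn with msaConvert(). The helper name align_with_clustalomega is made up for illustration, and the file name seqs.fas is assumed to be the same FASTA file used in the standalone run. I have not confirmed that this scales to 6440 sequences.

```r
# Sketch only: swaps MUSCLE for ClustalOmega in the call from the post.
# Assumes the msa, Biostrings and phangorn packages are installed.
# align_with_clustalomega is a hypothetical helper name, not part of msa.
align_with_clustalomega <- function(seqs) {
  # Same arguments as the failing call, with only the method changed.
  aln <- msa(seqs, method = "ClustalOmega", type = "dna", order = "input")
  # Convert the msa result into a phangorn phyDat object.
  msaConvert(aln, type = "phangorn::phyDat")
}

# Usage is guarded so the script does nothing if the package or file is missing.
# "seqs.fas" is an assumed file name taken from the standalone command.
if (requireNamespace("msa", quietly = TRUE) && file.exists("seqs.fas")) {
  library(msa)  # also attaches Biostrings, which provides readDNAStringSet()
  seqs <- readDNAStringSet("seqs.fas")
  phy <- align_with_clustalomega(seqs)
  print(phy)
}
```

The only difference from the failing call is method = "ClustalOmega", so if this also fails the problem is probably not specific to MUSCLE.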