What's the most efficient way to convert a factor vector (values may repeat) into a numeric vector in bash? The values in the numeric vector do not matter, as long as each one represents a unique level of the factor.
To illustrate, this would be the R equivalent of what I want to do in bash:
numeric <- seq_along(levels(factor))[factor]
For example:
factor
AV1019A
ABG1787
AV1019A
B77hhA
B77hhA
numeric
1
2
1
3
3
Many thanks.
This is probably not the most efficient approach, but it may be something to start with.
I initially wanted to use associative arrays, but they are a bash 4+ feature and are not available everywhere. If you have bash 4, you need one file fewer, which is obviously more efficient.
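For the bash 4+ case, a minimal sketch of the associative-array idea could look like the following. It is not the file-based version mentioned above; the function name `factor_to_numeric` is made up for illustration, and it assumes one value per line on standard input. IDs are handed out in order of first appearance, which matches the example in the question.

```bash
#!/usr/bin/env bash
# Sketch only: requires bash 4+ (associative arrays via "local -A").
# factor_to_numeric is a hypothetical helper name, not an existing tool.

factor_to_numeric() {
    local -A ids=()          # maps each distinct value to its numeric ID
    local next=1 value key
    # The "|| [[ -n $value ]]" part also handles a last line with no newline.
    while IFS= read -r value || [[ -n $value ]]; do
        key="_$value"        # prefix so an empty line is still a valid key
        # Assign the next free ID the first time a value is seen.
        if [[ -z ${ids[$key]+set} ]]; then
            ids[$key]=$((next++))
        fi
        printf '%s\n' "${ids[$key]}"
    done
}

# Usage with the data from the question:
printf '%s\n' AV1019A ABG1787 AV1019A B77hhA B77hhA | factor_to_numeric
```

With a real file you would run something like `factor_to_numeric < factor.txt`, where `factor.txt` is a placeholder name. The whole input is processed in a single pass with no temporary files.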