I am trying to remove NAs in R. I have tried to replicate a simple example I have found in multiple places online, but I am getting an unexpected output. I cannot find the error through searching online. What am I doing wrong?
I am using R version 4.3.2. I have restarted R and cleared the global variables (and restarted R again) and consistently get this result with anything I try.
a <- c(1,2,NA,3,4,NA,5,6)
b <- na.omit(a)
b
The output is
[1] 1 2 3 4 5 6
attr(,"na.action")
[1] 3 6
attr(,"class")
[1] "omit"
I was expecting to get the output 1 2 3 4 5 6
I have found I can instead use `b <- a[!(is.na(a))]`, but I am curious why the commonly suggested `na.omit` does not work.
You do get the intended values in the output. What I think you misunderstand is that `attr(,"na.action")` and `attr(,"class")` are simply attributes attached to the numeric vector with six non-`NA` numbers in it (the `"omit"` class sits on the `na.action` attribute itself, not on the vector). If you do `b + 1`, you'll get the values incremented.
If you really want to use `na.omit`, you can remove the attributes afterwards.
Ultimately, though, `a[!is.na(a)]` is much faster, and still should be safe. In a benchmark, the `itr/sec` field shows that `a[!is.na(a)]` is ~10x faster on this small vector. Even on a medium-large vector, it's still faster.
If it gets a lot larger, though, we start seeing some parity. Since we're talking on the order of 2-3 ms for a vector 800,000 long, the payoff might not be worth the squeeze.
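The points about attributes can be checked directly. The following is a minimal sketch in base R, not necessarily the code the answer originally had in mind: `as.vector()` and `attributes(x) <- NULL` are just two common ways to strip attributes, and the variable names (`b_stripped`, `b_nulled`, `b_indexed`, `big`) are made up for illustration. The timing at the end assumes the 800,000-element vector is built by repeating `a`; absolute timings will vary by machine, so no numbers are shown.

```r
a <- c(1, 2, NA, 3, 4, NA, 5, 6)
b <- na.omit(a)

# The data are the six non-NA values; na.action is only metadata
# recording which positions were dropped.
length(b)               # 6
attr(b, "na.action")    # positions 3 and 6, classed "omit"

# Arithmetic operates on the values; as.vector() drops the attributes
# that would otherwise be carried along and printed.
plus_one <- as.vector(b + 1)

# Two ways to strip the attributes from the na.omit() result.
b_stripped <- as.vector(b)
b_nulled <- b
attributes(b_nulled) <- NULL

# Logical indexing never attaches the attributes in the first place.
b_indexed <- a[!is.na(a)]

identical(b_stripped, b_indexed)   # TRUE
identical(b_nulled, b_indexed)     # TRUE

# Rough timing comparison on a larger vector (assumption: built with rep()).
big <- rep(a, 1e5)                 # 800,000 elements
system.time(for (i in 1:20) na.omit(big))
system.time(for (i in 1:20) big[!is.na(big)])
```

For more careful measurements than `system.time()` gives, a dedicated benchmarking package would be the better tool; the loop above is only meant to show the shape of the comparison.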