Calculating ECDF in R for time period (lubridate)


I am trying to calculate the ECDF (empirical cumulative distribution function) at specific values of my data. I have a data frame that looks something like this:

Name      Type             Value 
B         pace_20min_ms    6M 2S
A         pace_20min_ms    5M 32S

What I want to do is find the value of the ECDF for, say, A, and be able to state: A is faster than 65% of the people who have done the test. But I am struggling with the "Value" column, because it is in lubridate's minutes-and-seconds format.

What I figured out so far is how to calculate specific quantiles:

quantile(dat$Value, probs = c(0.1, 0.25, 0.5, 0.75, 0.9), type = 1)
[1] "3M 57S" "4M 25S" "4M 56S" "5M 32S" "6M 2S"

Maybe it's not that hard to go the other way around (from a given value to its cumulative proportion), but I don't know how to do it. Thank you so much!

jay.sf On BEST ANSWER

You could convert to seconds and back again, something like:

> r <- colSums(sapply(strsplit(gsub('[MS]', '', x), ' '), as.integer)*c(60, 1)) |> 
+   quantile(probs=c(0.1, 0.25, 0.5, 0.75, 0.9), type=1)
> sprintf('%sM %sS', r %/% 60, r %% 60) |> setNames(names(r))
     10%      25%      50%      75%      90% 
"0M 20S" "1M 23S" "3M 24S"  "5M 5S" "6M 31S" 

Not sure how your data is exactly formatted, but you get the idea.
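The same seconds conversion also covers the direction the question asks about, from a value to its cumulative proportion, via base R's `ecdf()`. The following is a minimal sketch under assumptions: `to_seconds` is a hypothetical helper, the example values are made up, and every string is assumed to contain both a minutes and a seconds part. If `Value` is already a lubridate Period object rather than a string, `lubridate::period_to_seconds()` could probably replace the helper.

```r
# Hypothetical helper: convert strings such as "5M 32S" to seconds.
# Assumes every value has both a minutes and a seconds part.
to_seconds <- function(v) {
  parts <- strsplit(gsub("[MS]", "", v), " ")
  vapply(parts, function(p) sum(as.integer(p) * c(60, 1)), numeric(1))
}

# Made-up example values, not the asker's data
value <- c("6M 2S", "5M 32S", "4M 56S", "4M 25S", "3M 57S")
secs  <- to_seconds(value)

# ecdf() returns a function: Fn(t) is the share of results with time <= t
Fn <- ecdf(secs)

# Share of results at least as fast as 5M 32S (a lower time is faster)
Fn(to_seconds("5M 32S"))      # 0.8 for these five values

# Share of results strictly slower than 5M 32S, i.e. "faster than ..."
1 - Fn(to_seconds("5M 32S"))
```

Because a lower pace time means a faster result, "A is faster than x% of people" corresponds to `1 - Fn(t)`, not `Fn(t)` itself.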


Data:

> n <- 100
> set.seed(42)
> x <- mapply(\(x, y) sprintf('%sM %sS', x, y), 
+             sample(0:7, n, replace=TRUE), 
+             sample(0:34, n, replace=TRUE))
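
Applied to a data frame in the layout shown in the question, this could look roughly as follows. It is only a sketch: the rows for C, D and E are invented, the `seconds` and `faster_than` columns are hypothetical names, and `Value` is assumed to be a character column of the form "xM yS". With more than one `Type` in the data, the ECDF would presumably need to be computed separately per type.

```r
# Made-up rows in the question's layout; C, D and E are invented
dat <- data.frame(
  Name  = c("A", "B", "C", "D", "E"),
  Type  = "pace_20min_ms",
  Value = c("5M 32S", "6M 2S", "4M 56S", "4M 25S", "3M 57S")
)

# Parse "xM yS" into seconds with regular expressions
mins <- as.integer(sub("M.*", "", dat$Value))
secs <- as.integer(sub(".* ([0-9]+)S", "\\1", dat$Value))
dat$seconds <- mins * 60 + secs

# Share of participants with a strictly slower (larger) time
dat$faster_than <- 1 - ecdf(dat$seconds)(dat$seconds)

# Fastest first
dat[order(dat$seconds), c("Name", "Value", "faster_than")]
```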