I have the following dataset, where y is the state a person jumps into at the given time and personID identifies the person. In this case there are three states, so I can have three rows for each person.
Is there an easy way, for instance using the survival package, to get it into occurrence/exposure form?
Ideally, I would want it in the form:
personID, time, exposure1, exposure2, exposure3, occ12, occ13, occ21, occ23, occ31, occ32.
For instance, from these rows for person 8:
  personID     time y
1        8 0.000000 1
2        8 2.972433 2
3        8 3.113432 3
I want to get the format:
personID, time, exposure1, exposure2, exposure3, occ12, occ13, occ21, occ23, occ31, occ32
8, 0.1, 0.1, 0, 0, 0, 0 ....
8, 0.2, 0.1, 0, 0, 0, 0 ....
...
8, 3, 0.072433, 0.027567, 0, 1, 0, 0, 0, 0, 0
8, 3.1, 0, 0.1, 0, 0, 0, 0, 0, 0, 0
8, 3.2, 0, 0.013432, 0.086568, 0, 0, 0, 1, 0, 0
8, 3.3, 0, 0, 0.1, 0, 0, 0, 0, 0, 0
8, 3.4, 0, 0, 0.1, 0, 0, 0, 0, 0, 0
...
8, ENDTIME, 0, 0, 0.1, 0, 0, 0, 0, 0, 0
Can this be done easily? I don't care about computation time, since it only needs to run a few times.
I tried a manual for-loop, but it felt like there should be a smarter way. The loop also didn't work: there were many cases it couldn't handle.
I Googled a lot and found some functions in the survival package, but I don't know how to adapt them to my situation.
Exposure is the amount of time spent in a state within an interval. That is, for person 8 from before, we split the timeline into intervals of length 0.1, labelled by their right endpoints (so the row for time 0.1 covers 0.0 to 0.1), and we measure how long the person was in e.g. state 1 during each interval.
So in this case, exposure1 at time 0.1 is 0.1, because the person was in state 1 (y = 1) for the whole interval.
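To make the definition concrete, the computation for a single person can be sketched in base R as below. This is only an illustration of the definition, not a survival-package solution. The function name `occ_exp_one`, its arguments `end_time` and `width`, and the value `end_time = 4` are hypothetical (the actual ENDTIME is not given above), and the sketch assumes that consecutive rows of a person always have different states.

```r
# Hypothetical helper: occurrence/exposure rows for ONE person.
# Assumptions: `end_time` (end of follow-up) and `width` are supplied by the
# caller, and consecutive rows of `d` never repeat the same state.
occ_exp_one <- function(d, end_time, width = 0.1, n_states = 3) {
  d <- d[order(d$time), ]
  n_int <- ceiling(round(end_time / width, 8))  # rounding guards against float noise
  right <- seq_len(n_int) * width               # right endpoints label the intervals
  left  <- right - width

  # Each row opens a spell that lasts until the next jump (or end_time).
  start <- d$time
  stop  <- c(d$time[-1], end_time)

  out <- data.frame(personID = d$personID[1], time = right)

  # Exposure: overlap between each spell in state s and each interval.
  for (s in seq_len(n_states)) {
    e <- numeric(n_int)
    for (k in which(d$y == s)) {
      e <- e + pmax(0, pmin(right, stop[k]) - pmax(left, start[k]))
    }
    out[[paste0("exposure", s)]] <- e
  }

  # Occurrence columns in the order occ12, occ13, occ21, occ23, occ31, occ32.
  for (a in seq_len(n_states)) for (b in setdiff(seq_len(n_states), a)) {
    out[[paste0("occ", a, b)]] <- 0L
  }

  # A jump at time t is counted in the interval (left, right] containing t.
  for (k in seq_len(nrow(d))[-1]) {
    i <- which(left < d$time[k] & d$time[k] <= right)
    if (length(i) > 0) {
      col <- paste0("occ", d$y[k - 1], d$y[k])
      out[i, col] <- out[i, col] + 1L
    }
  }
  out
}

# Person 8 from the example, with an arbitrary end of follow-up at time 4.
p8  <- data.frame(personID = 8L, time = c(0, 2.972433, 3.113432), y = 1:3)
res <- occ_exp_one(p8, end_time = 4)
res[30:32, ]  # the rows for times 3.0, 3.1 and 3.2
```

Because each spell is tracked by its own start and stop time, a person who returns to an earlier state simply contributes a second spell to that state's exposure column. For the full dataset, something like `do.call(rbind, lapply(split(data, data$personID), occ_exp_one, end_time = 4))` would stack the per-person results.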
Using data %>% tidyr::pivot_wider(names_from = y, values_from = c(time, y)) I get part of the way. From there I can cross join a grid of time points running from the start time to the end time. However, this cannot handle a person who jumps back to a state already visited, e.g. from state 3 back to state 2.
Representative sample:
structure(list(personID = c(4L, 5L, 5L, 6L, 7L, 7L, 8L, 8L, 8L),
time = c(0, 0, 0.541094619588305, 0, 0, 0.36904754187606,
0, 2.97243252930055, 3.11343245920189), y= c(1L, 1L,
2L, 1L, 1L, 2L, 1L, 2L, 3L)), row.names = 5:13, class = "data.frame")