I want to randomly change states in a sequence dataset for the purposes of simulation. The goal is to see how different measures of cluster quality behave with different degrees of structure in the data.
If I wanted to introduce missing values, there is the handy seqgen.missing() function in TraMineRextras, but it only adds missing states. How would I go about randomly picking a proportion p of sequences and inserting a randomly selected element of the alphabet into them, with probabilities p_g, p_l, and p_r for inserting it in the middle, on the left, and on the right?
Below is a seq.rand.chg function (adapted from seqgen.missing) that randomly applies state changes to a proportion p.cases of sequences. For each randomly selected sequence, the function randomly changes the state:

- when p.gaps > 0, at a proportion between 0 and p.gaps of the positions;
- when p.left > 0 and/or p.right > 0, at a proportion of at most p.left (p.right) of the leftmost (rightmost) positions.

As in the seqgen.missing function, p.gaps, p.left, and p.right are the maximum proportions of positions changed in each selected sequence. These are not exactly your probabilities p_g, p_l, and p_r, but it should be easy to adapt the function for that.

Here is the function:
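As a rough sketch of the behaviour described above (not the original seq.rand.chg code), the following base R function applies the same logic to a plain character matrix with one row per sequence and one column per position. The name rand_chg_sketch, the matrix input, and the uniform draw from the alphabet are all assumptions made for illustration. A TraMineR state sequence object would first have to be converted to such a matrix.

```r
# Hypothetical helper, NOT the original seq.rand.chg.
# seqmat:   character matrix, one row per sequence, one column per position
# alphabet: character vector of admissible states
rand_chg_sketch <- function(seqmat, alphabet, p.cases = 0.1,
                            p.left = 0, p.gaps = 0.1, p.right = 0) {
  n <- nrow(seqmat)
  len <- ncol(seqmat)
  # Randomly select a proportion p.cases of the sequences
  sel <- sample.int(n, round(p.cases * n))
  for (i in sel) {
    pos <- integer(0)
    if (p.gaps > 0) {
      # Random positions: a proportion between 0 and p.gaps of the sequence
      k <- round(runif(1, 0, p.gaps) * len)
      pos <- c(pos, sample.int(len, k))
    }
    if (p.left > 0) {
      # Leftmost block: at most a proportion p.left of the sequence
      k <- round(runif(1, 0, p.left) * len)
      pos <- c(pos, seq_len(k))
    }
    if (p.right > 0) {
      # Rightmost block: at most a proportion p.right of the sequence
      k <- round(runif(1, 0, p.right) * len)
      pos <- c(pos, rev(seq_len(len))[seq_len(k)])
    }
    pos <- unique(pos)
    if (length(pos) > 0) {
      # Assumption: new states are drawn uniformly from the alphabet, so a
      # draw can coincide with the state already at that position
      seqmat[i, pos] <- sample(alphabet, length(pos), replace = TRUE)
    }
  }
  seqmat
}

# Toy usage: 4 constant sequences of length 10
set.seed(1)
toy <- matrix("A", nrow = 4, ncol = 10)
changed <- rand_chg_sketch(toy, c("A", "B", "C"), p.cases = 0.5, p.gaps = 0.3)
```

Because the replacement state is drawn from the full alphabet, some selected positions may end up unchanged. Excluding the current state from the draw would force an actual change at every selected position.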
We illustrate the usage of the function with the first three sequences of the mvad data.

We observe that changes were applied to the randomly selected 3rd sequence.
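To get closer to the p_g, p_l, and p_r probabilities from the question, one possible adaptation is to draw, for each selected sequence, a single location (middle, left, or right) with those probabilities and to overwrite the state there. The sketch below assumes a plain character matrix as input and overwrites a state rather than shifting the sequence. The function name rand_insert_sketch and its arguments are hypothetical.

```r
# Hypothetical helper illustrating one reading of the question:
# p is the proportion of sequences to alter; p_g, p_l, p_r are the
# probabilities of altering a middle, the first, or the last position.
rand_insert_sketch <- function(seqmat, alphabet, p = 0.1,
                               p_g = 1/3, p_l = 1/3, p_r = 1/3) {
  len <- ncol(seqmat)
  stopifnot(len >= 3)  # needs at least one interior position
  sel <- sample.int(nrow(seqmat), round(p * nrow(seqmat)))
  for (i in sel) {
    # Choose where to alter the sequence (sample() rescales the weights)
    where <- sample(c("gap", "left", "right"), 1, prob = c(p_g, p_l, p_r))
    pos <- switch(where,
                  left  = 1L,
                  right = len,
                  gap   = 1L + sample.int(len - 2L, 1))  # interior: 2..len-1
    # Assumption: the state is overwritten, so sequence length is preserved
    seqmat[i, pos] <- sample(alphabet, 1)
  }
  seqmat
}

# Toy usage: force all alterations to the left end
set.seed(2)
toy2 <- matrix("A", nrow = 6, ncol = 8)
out2 <- rand_insert_sketch(toy2, c("A", "B", "C"), p = 0.5,
                           p_g = 0, p_l = 1, p_r = 0)
```

With p_l = 1, only the first column can differ from the input, which gives a quick sanity check of the location logic.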