I have two lists, each sorted by start_time, and within each list no item's time range overlaps with another's:
# (word, start_time, end_time)
words = [('i', 5.12, 5.23),
('like', 5.24, 5.36),
('you', 5.37, 5.71),
('really', 7.21, 7.51),
('yes', 8.32, 8.54)]
# (speaker, start_time, end_time)
segments = [('spk1', 0.0, 1.25),
('spk2', 4.75, 6.25),
('spk1', 6.75, 7.75),
('spk2', 8.25, 9.25)]
I want to group the items in words that fall within the start_time and end_time of each item in segments and produce something like this:
res = [('i', 'like', 'you'),
('really',),
('yes',)]
such that each item in res contains all the items of words with start_time and end_time falling between the start_time and end_time of the corresponding item in segments.
Ideally this would be done in a single loop, which should be fast.
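To make the single pass concrete, here is a minimal two-pointer sketch. It relies on both lists being sorted and non-overlapping, as stated above. The function name `group_words` is hypothetical, and two behaviours are assumptions rather than requirements from the question: segments that contain no words are skipped (which matches the three-item result above), and words that do not fit entirely inside any segment are dropped.

```python
words = [('i', 5.12, 5.23), ('like', 5.24, 5.36), ('you', 5.37, 5.71),
         ('really', 7.21, 7.51), ('yes', 8.32, 8.54)]
segments = [('spk1', 0.0, 1.25), ('spk2', 4.75, 6.25),
            ('spk1', 6.75, 7.75), ('spk2', 8.25, 9.25)]


def group_words(words, segments):
    """Group words by enclosing segment; each word is visited once."""
    res = []
    i = 0  # index of the next word not yet examined
    for _, seg_start, seg_end in segments:
        # Drop words that start before this segment begins: earlier
        # segments already declined them, so they fit nowhere.
        while i < len(words) and words[i][1] < seg_start:
            i += 1
        group = []
        # Collect words that end no later than the segment does.
        while i < len(words) and words[i][2] <= seg_end:
            group.append(words[i][0])
            i += 1
        # Assumption: segments with no words produce no entry.
        if group:
            res.append(tuple(group))
    return res


print(group_words(words, segments))
# [('i', 'like', 'you'), ('really',), ('yes',)]
```

The index `i` only ever moves forward, so the total work is proportional to len(words) + len(segments) even though there are nested while loops. If an entry per segment is wanted instead, removing the `if group:` check would append an empty tuple for segments without words.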
Happy Sunday!