I understand the concept of CMTime and what it does. In a nutshell: tiny fractions of a second represented as floating-point numbers accumulate error when added, and that error becomes significant as decoding/playback progresses. For example, adding 0.000001 to itself a million times gives 1.000000000007918. So far, CMTime sounds like a great idea.
```swift
let d: Double = 0.000001
var n: Double = 0
for _ in 0 ..< 1_000_000 { n += d }
print(n)
// 1.000000000007918
```
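That drift disappears if you accumulate time as an integer count of ticks over a fixed timescale, which is essentially what CMTime's value/timescale representation does. A minimal plain-Swift sketch (no CoreMedia, so it runs anywhere; the timescale of 1,000,000 is just an illustrative choice):

```swift
// Accumulate one microsecond a million times as an integer tick count.
let timescale = 1_000_000   // ticks per second (illustrative)
var ticks = 0               // elapsed time in ticks
for _ in 0 ..< 1_000_000 { ticks += 1 }

// Convert to seconds only at the end: exactly 1.0, no drift.
let seconds = Double(ticks) / Double(timescale)
print(seconds) // 1.0
```

Integer addition is exact, so however many durations you sum, the only rounding happens once, at the final conversion to Double.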
However, when converting an arbitrary Double to CMTime and back, the error above looks like a joke compared to the difference between the original Double and its CMTime round-trip. You can guess what that difference would look like after adding such CMTime values a million times!
```swift
import CoreMedia

print("Simple number after 1,000,000 additions and diff between random ")
print("number before/after converting to CMTime:")
print("add:", String(format: "%.20f", 1.000000000007918))
for _ in 0 ..< 10 {
    let seconds = Double.random(in: 0 ... 10)
    // Let's go with the max timescale!
    let time = CMTime(seconds: seconds, preferredTimescale: .max)
    print("dif:", String(format: "%.20f", seconds - time.seconds))
}

// Simple number after 1,000,000 additions and diff between random
// number before/after converting to CMTime:
// add: 1.00000000000791811061
// dif: 0.00000000025481305954
// dif: 0.00000000027779378797
// dif: 0.00000000000071231909
// dif: 0.00000000024774449159
// dif: 0.00000000028195579205
// dif: 0.00000000029723601358
// dif: 0.00000000029402880131
// dif: 0.00000000044737191729
// dif: 0.00000000036750824606
// dif: 0.00000000043562398133
```
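For what it's worth, the size of those diffs is exactly what tick quantization predicts. The conversion has to map `seconds * timescale` to an integer tick count, and since every diff above is positive, it appears to truncate toward zero, which bounds the round-trip error by one tick: `1 / Int32.max ≈ 4.66e-10` seconds. A plain-Swift emulation of that conversion (my assumption about CMTime's internal rounding, inferred from the output above, not confirmed from its source):

```swift
import Foundation

// Emulate Double -> (ticks, timescale) -> Double, assuming the
// conversion truncates seconds * timescale to an integer tick count
// (consistent with the all-positive diffs above).
let timescale = Double(Int32.max)
var worst = 0.0
for _ in 0 ..< 10_000 {
    let seconds = Double.random(in: 0 ... 10)
    let ticks = (seconds * timescale).rounded(.towardZero)
    let back = ticks / timescale
    worst = max(worst, seconds - back)
}
// One tick at this timescale is ~4.66e-10 s; the round-trip error
// stays below that, matching the diffs printed above.
print(worst, 1 / timescale)
```

In other words, the "error" isn't noise: it's the distance from your Double to the nearest representable tick, and it shrinks or vanishes when the timescale actually divides your time values.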
Granted, if any given Double could be converted to CMTime exactly, none of this would be an issue.
Question. I'm trying to figure out whether it makes sense to use CMTime on its own for time handling (a million additions aside, obviously), or whether it's only useful for working with APIs that take and return values in CMTime format. For context: I have a video editing app with a bespoke UI (player, tracks, timelines) that deals with playback-speed adjustments, track trimming and rearranging, and so on. Using Double to express time values works out great: it's clean, simple, and does the job. But CMTime feels like the "right" way to do it. However, seeing what happens to a Double after converting it back and forth makes me wonder whether CMTime's field of use is as narrow as encoding and decoding media.
Your intuition is correct. Using a screwdriver as a hammer may work most of the time, but it's not the best tool for the job. More importantly, you may hit non-obvious edge cases where it simply won't work, or where it costs extra work to hammer in the nail (such as double processing).
Also: what is your conversion method? Perhaps you are missing an edge case such as a varying timescale. I can't really give further guidance without a bit more information.
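One such edge case: if you pick a timescale that matches your media instead of `.max`, frame-aligned times round-trip exactly. A sketch in plain Swift emulating the value/timescale math (with CoreMedia you'd build the same thing via `CMTime(value:timescale:)`; the 30 fps figure is just an example):

```swift
// Frame-aligned times at a media-matched timescale round-trip exactly.
// 600 is the classic QuickTime movie timescale, divisible by 24, 25, 30, 60.
let timescale = 600
let fps = 30
let ticksPerFrame = timescale / fps // 20 ticks per frame

// Represent frame 7 directly as integer ticks rather than via Double.
let frame7 = 7 * ticksPerFrame                    // 140 ticks
let seconds = Double(frame7) / Double(timescale)  // closest Double to 7/30 s

// Converting those seconds back at the same timescale recovers the
// exact tick count, because 140 is exactly representable.
let back = Int((seconds * Double(timescale)).rounded())
print(back == frame7) // true
```

The general rule: keep times as integer ticks at one timescale for as long as possible, and treat Double as a display/interchange format, not the source of truth.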
CMTime is already frame-accurate with AVPlayer, no conversion needed; that's what it was made for. Just make sure you set toleranceBefore and toleranceAfter to zero when seeking.
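For reference, this is roughly what a frame-accurate seek looks like (a sketch requiring an Apple platform; the file path and frame numbers are placeholders):

```swift
import AVFoundation

// Hypothetical asset; the path is a placeholder.
let player = AVPlayer(url: URL(fileURLWithPath: "/path/to/clip.mov"))

// Seek to frame 120 of 30 fps material using integer ticks at
// timescale 600 (20 ticks per frame), never going through Double.
let target = CMTime(value: 120 * 20, timescale: 600) // exactly 4.0 s

// Zero tolerance forces AVPlayer to land on the exact requested time
// instead of snapping to the nearest convenient sync frame.
player.seek(to: target, toleranceBefore: .zero, toleranceAfter: .zero)
```

Building `target` from integer frame counts is the point: the CMTime is exact by construction, so there is no Double round-trip error for the tolerances to absorb.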
Note: I've been working with frame-accurate video/audio processing for over a decade.