AWS state machines emit log events like TaskStarted and TaskSucceeded which can be used to calculate the total duration of a particular task. However, this makes it challenging to obtain the duration using Log Insights or even by creating a Lambda-backed Cloudwatch subscription (since TaskStarted and TaskSucceeded log events can be in separate Lambda invocations).
Ideally, I would like to have this duration emitted as a metric. E.g. if the state machine task calls to a Sagemaker Endpoint "X" with parameter TargetModel=abc.tar.gz, I'd like to have a metric named Endpoint_X_abc with the total duration of this task.
What is the proper way to do this?