The traditional Viterbi algorithm used with HMMs takes a start probability matrix (viterbi algorithm wiki), but viterbi_decode in TensorFlow only takes a transition matrix and an emission (score) matrix as parameters. How should I understand this?
def viterbi_decode(score, transition_params):
    """Decode the highest scoring sequence of tags outside of
    TensorFlow.

    This should only be used at test time.

    Args:
      score: A [seq_len, num_tags] matrix of unary potentials.
      transition_params: A [num_tags, num_tags] matrix of binary potentials.

    Returns:
      viterbi: A [seq_len] list of integers containing the highest scoring tag
          indices.
      viterbi_score: A float containing the score for the Viterbi
          sequence.
    """
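viterbi_decode in TensorFlow doesn't take an initial probability matrix because it never applies a start prior. The scores it works with are additive potentials (log-space scores rather than probabilities), and decoding starts by setting the first row of the trellis to score[0].
That is the same as adding a start score of zero to every tag, i.e. an unnormalized uniform start distribution, so no tag is favoured at the first step. It does not mean that decoding starts at state 0. If your HMM has non-uniform start probabilities, you can add their logarithms to score[0] before calling viterbi_decode.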
You can check out the implementation here.
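To make the role of the start matrix concrete, here is a minimal plain-Python sketch of additive (log-space) Viterbi decoding in the style of viterbi_decode. It is not the TensorFlow source: the name viterbi_decode_sketch and the optional start_log_probs argument are made up for this illustration, and I am assuming transition_params[i][j] scores a move from tag i to tag j. Leaving start_log_probs out behaves like a uniform start, and passing log start probabilities is equivalent to adding them to the first row of score.

```python
import math


def viterbi_decode_sketch(score, transition_params, start_log_probs=None):
    """Plain-Python sketch of additive Viterbi decoding.

    score: seq_len x num_tags nested list of unary scores.
    transition_params: num_tags x num_tags nested list; [i][j] is assumed
        to be the score of moving from tag i to tag j.
    start_log_probs: hypothetical optional argument (NOT part of the
        TensorFlow API); log start probabilities added at the first step.
    """
    num_tags = len(score[0])
    # With no start prior, the first trellis row is just score[0], which is
    # the same as adding a start score of 0 to every tag.
    first = list(score[0])
    if start_log_probs is not None:
        first = [s + p for s, p in zip(first, start_log_probs)]
    trellis = [first]
    backpointers = []
    for t in range(1, len(score)):
        row, back = [], []
        for j in range(num_tags):
            # Best way to arrive at tag j from any previous tag i.
            candidates = [trellis[t - 1][i] + transition_params[i][j]
                          for i in range(num_tags)]
            best_prev = max(range(num_tags), key=candidates.__getitem__)
            row.append(score[t][j] + candidates[best_prev])
            back.append(best_prev)
        trellis.append(row)
        backpointers.append(back)
    # Follow the backpointers from the best final tag.
    last = max(range(num_tags), key=trellis[-1].__getitem__)
    path = [last]
    for back in reversed(backpointers):
        path.append(back[path[-1]])
    path.reverse()
    return path, trellis[-1][last]


score = [[1.0, 0.9], [0.0, 0.0]]    # tag 0 is slightly better at step 0
trans = [[0.0, -1.0], [-1.0, 0.0]]  # staying in the same tag is preferred

print(viterbi_decode_sketch(score, trans)[0])  # [0, 0]
skewed = [math.log(0.1), math.log(0.9)]        # start strongly prefers tag 1
print(viterbi_decode_sketch(score, trans, skewed)[0])  # [1, 1]
```

In this toy example a uniform start prior such as [log 0.5, log 0.5] shifts every path score by the same constant and leaves the decoded path unchanged, which is why omitting the start matrix amounts to assuming a uniform start.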