Viterbi Algorithm – NLP

Let’s try out the Viterbi Algorithm using the example given in the ritvikmath channel using R. It is about parts of speech tagging of a sentence,
“The Fans Watch The Race”.

The transition and emission probabilities are given in matrix forms.

hmm <- initHMM(c("DET","NOUN", "VERB"), c("THE","FANS", "WATCH", "RACE"), transProbs=matrix(c(0.0, 0.0, 0.5, 0.9, 0.5, 0.5, 0.1, 0.5, 0.0), nrow = 3),
	emissionProbs=matrix(c(0.2, 0.0, 0.0, 0.0, 0.1, 0.2, 0.0, 0.3, 0.15, 0.0, 0.1, 0.3), nrow = 3))
print(hmm)
$States
[1] "DET"  "NOUN" "VERB"

$Symbols
[1] "THE"   "FANS"  "WATCH" "RACE" 

$startProbs
      DET      NOUN      VERB 
0.3333333 0.3333333 0.3333333 

$transProbs
      to
from   DET NOUN VERB
  DET  0.0  0.9  0.1
  NOUN 0.0  0.5  0.5
  VERB 0.5  0.5  0.0

$emissionProbs
      symbols
states THE FANS WATCH RACE
  DET  0.2  0.0  0.00  0.0
  NOUN 0.0  0.1  0.30  0.1
  VERB 0.0  0.2  0.15  0.3

Now, write down the observations (The Fans Watch The Race) and run the following commands.

observations <- c("THE","FANS", "WATCH", "THE", "RACE")

vPath <- viterbi(hmm,observations)

vPath 
 "DET"  "NOUN" "VERB" "DET"  "NOUN"

References

The Viterbi Algorithm: ritvikmath