I am referring "Algorithms fourth Edition by Sedgewick & Wyane" String matching Chapter 5 .
The given algorithm is KMP substring search in which it build a DFA from pattern state . I understand the algorithm for building the DFA , code is as follows :
public KMP(String pat) {
this.R = 256;
this.pat = pat;
// build DFA from pattern
int m = pat.length();
dfa = new int[R][m];
dfa[pat.charAt(0)][0] = 1;
for (int x = 0, j = 1; j < m; j++) {
for (int c = 0; c < R; c++)
dfa[c][j] = dfa[c][x]; // Copy mismatch cases.
dfa[pat.charAt(j)][j] = j+1; // Set match case.
x = dfa[pat.charAt(j)][x]; // Update restart state.
}
}
I am not able to get the following line : x = dfa[pat.charAt(j)][x]; // Update restart state.
I understand that this value is achieved by feeding the pat[1..j-1] in partial build DFA but not able to get that the code,how it is achieving this.
I also understand that x is the length of longest prefix of pattern that the also suffix.
I have seen many other related question but those are related to understand the algorithm itself.
I need to understand that how x = dfa[pat.charAt(j)][x]; // Update restart state. simulating the restart state .
If we look carefully, X is initialized to state 0, and J is to state 1
Now, we just keep moving both forward based on next character visited, and since X is behind J he already knows which state is next, by default ALL ARE POINTING BACK TO 0 so that line will always maintain the prefix, if any otherwise restart at 0
dfa[c][j] = dfa[c][x]; // Copy mismatch cases.This line is just creatingfailureorback pointersx = dfa[pat.charAt(j)][x]; // Update restart state.And this line is moving the prefix ahead, to stay in sync with J, so it always point to a place where prefix == suffixperhaps this would help further https://labuladong.gitbook.io/algo-en/i.-dynamic-programming/kmpcharactermatchingalgorithmindynamicprogramming