Is there a way to optimize KMP algorithm to involve the character we are comparing?

212 Views Asked by At

I noticed that KMP algorithm simply consults failure function when it finds a mismatch at Haystack[a] and Needle[b] but it never look at what Needle[b] is. Is it possible to make KMP faster if we considered what Needle[b] is?

1

There are 1 best solutions below

0
templatetypedef On

You could in principle update KMP so that the failure table looks at the input character that caused the mismatch. However, there's a question of how much it would cost you to do this and how much benefit you'd get by doing so.

One of the advantages of KMP is that the preprocessing step takes time O(n), where n is the length of the pattern string. There's no dependence on how many characters the string is composed of (e.g. Unicode, ASCII, etc.) Simply building a table associating each character with what to do in some circumstance would negatively impact the runtime of the preprocessing step.

That being said, there are other algorithms that do indeed look at the mismatched character in the needle. The Boyer-Moore algorithm, for example, does decide where to search next based on which mismatched character is found, and it can run much faster than KMP as a result. You may want to look into that algorithm if you're curious how this works in practice.