Suppose we have 2 sequences(Arrays) A[1..m] and B[1..n] of integers.
How can we find the length of the biggest common contiguous subsequence of A and B in O(nlogn + mlogm + k^2) where k is the number of (x, y) tuples where A[x] = B[y]?
This is an example input:
A={1,2,3,4,5}
B={2,4,5,9,11,20}
output: 2
Note that it is not guaranteed that arrays are sorted:
A={2,7,1,8,11}
B={1,3,7,1,11,2}
output: 2
Let's refer to those "tuples (x,y) where
A[x] == B[y]as "correspondences".A common subsequence, then, corresponds to a list of correspondences
Cwhere there is are no two correspondencesC[i]andC[j]such thatC[i].x < C[j].x AND C[i].y > C[j].yor vice-versa.First, make a list of all correspondences, sorted by x then y. We can easily do this within the required complexity.
Then, we will iterate through this list and determine, for each correspondence, the length of longest subsequence that ends with that correspondence. The largest result is the problem solution.
We will maintain an array
BEST[y]which stores, for eachy, the best result so far where the final correspondencechasc.y == y. Initially this is all zeros.For each correspondence
c:BEST[y]such thaty <= c.y, and add 1 to get theresultforc. Remember the largest of these results, as it will be the solution to the problem.result > BEST[c.y], setBEST[c.y] = resultThat's it... but, in order to meet your complexity requirement, that search in step (1) needs to be done in O(k) time, even if k is much smaller than max(y). You can do that by keeping a list of the non-zero values instead of an array. There can be at most k of those, so you can search and update the list in O(k) time easily.
If you keep those entries in a binary search tree, and additionally remember the maximum value in each subtree, then you can reduce the complexity to O( nlogn + mlogm + klogk ).