'highlighting' a string according to a given pattern

109 Views Asked by At

I have a bunch of strings such as

1245046126856123
5293812332348977
1552724141123171
7992612370048696
6912394320472896

I give the program a pattern which in this case is '123' and the expected output is

1245046126856[123]
52938[123]32348977
1552724141[123]171
79926[123]70048696
69[123]94320472896

To do this I record the indices where the pattern occurs in an array and then I make an empty array of chars and put '[' and ']' according to the indices. So far it works fine but when I have a string such as

12312312312312312123

I should get

[123][123][123][123][123]12[123]

However I cannot find a way to record the indices according to such a case. I use rabin-karp algorithm for pattern matching and the section where I calculate the indeces as to where to put the brackets is as follows

if(j == M){
      index[k] = i; //beginning index
      index[k+1] = i+M+1; //finishing index
      if((k!=0)&&(index[k-1] == index[k])){
           index[k]++;
           index[k+1]++;
      }

      if((k!=0)&&(index[k] < index[k-1])){
           index[k] = index[k-2]+M+1;
           index[k+1] = i-index[k-1]+M+1;
      }
      k += 2;
}

i is the index where the pattern starts to occur, j is the index where the algorithm terminates the pattern check (last character of the given pattern), k is the index of the index array. M is the length of the pattern

This results in a string (where only the brackets are placed) like this

[   ][   ][   ][   ][   ][   ]

but as you can see, there should be two empty spaces between the last two sets of brackets. How can I adjust way I calculate the indexes so that I can properly place the brackets?

1

There are 1 best solutions below

0
Daniel On

EDIT

thought it was a python question at first so this is a pythonic answer, but it may help as a pseudo code.

this piece of code should help you find all the indexes in the string that holds the pattern.

    string = "12312312312312123"
    ptrn = "123"
    i = 0
    indexes = [] //create a dynamic array (it may also be constant size string length/pattern length or just the string length)
    while True:
        i = string.find(ptrn, i) //get the next index of the pattern in a substring that starts from the last index of last suffix of the pattern.
        if i == -1: //if such index inside the original string (the pattern exists).
            break
        indexes.append(i) //add the found index of the pattern occurrence into the array.
        i += len(ptrn) //get to the next position where the pattern may appear not inside another pattern.

    print(indexes)

if you would like to have it on every pattern match even if it's inside another match, you can remove the i+=len(ptrn) and replace the while statement with for i in range(0,len(string)): // run for every index of the string - for(int i=0; i<strlen(string); i++)