What is wrong with my logic in Burrows-Wheeler reverse transformation?

191 Views Asked by At

I'm studying a Burrows-Wheeler transformation and so far I can get it from some Text. Now it's time for the reverse process and that's what I have trouble with.

Here's the input: TTCCTAACG$A.

That's my way of thinking:

1) compute the number of As, Cs, Gs, Ts in the input: A: 3, C: 3, G: 1, T: 3

2) let's write down the First and the Last column of Burrows-Wheeler transformation. The last column is our input. So here it is:

      F    L

[0]   $    T
[1]   A    T
[2]   A    C
[3]   A    C
[4]   C    T
[5]   C    A
[6]   C    A
[7]   G    C
[8]   T    G
[9]   T    $
[10]  T    A

Here's my logic:

  1. Initially, output = '$'
  2. L[0] = 'T' => output = 'T$'
  3. The first T in F has the index 8 => we need L[8] => output = 'GT$'
  4. The first G in F has the index 7 => we need L[7] => output = 'CGT$'
  5. The first C in F has the index 4 => we need L[4] => output = 'TCGT$'
  6. It was our second T. The second T in F has the index 9, but L[9] = '$', thus
    we should stop.

Obviously, it's not over and something's wrong here. Could you please explain what?

2

There are 2 best solutions below

0
alekscooper On

I understood this method too simplistically. In step 4 since C was the third C, we need F[6].

0
dim On

The last column looks wrong - it should be the first column preceding symbol. You also do not use the special symbol for the BWT. This way the previous rule is broken and you disturb your lf mapping.

D.