How to make an accent insensitive palindrome checker in MIPS?


I am writing a palindrome checker in MIPS, and I am trying to make it accent-insensitive so that something like "ahà" is considered a palindrome too. However, it doesn't look as simple as the case-insensitive scenario, where there is a fixed difference between the code of a lowercase letter and that of its uppercase counterpart.

I asked my teacher about it, and she said that I could scan the entire string and replace every "è" with "e", then scan it again to replace every "é" with "e", and so on. She also told me there is a better solution and asked me to think about it. The only thing I have noticed so far is that the accented letters are in the extended ASCII range (codes above 127), but I can't work out what to do with that. Can someone help me? Even just a hint would be appreciated. Thank you in advance.

puppydrum64 On

You're going to have to hardcode this one with a lookup table like Alain Merigot suggested. How you do this depends on your string encoding scheme (ASCII vs. UTF-8, etc.)
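To illustrate why the encoding matters, here is a rough sketch of the UTF-8 case. In UTF-8, the code points U+00C0 to U+00FF are each stored as two bytes: the lead byte 0xC3 followed by a continuation byte in the range 0x80 to 0xBF. The continuation byte minus 0x80 can therefore serve as the table index. The names `utf8_table`, `src`, and `dst` are made up for this example, a zero table entry is my own convention for "no base letter, copy unchanged", and the code assumes MARS or SPIM with delayed branching turned off. Because the result is shorter than the input, it writes to a separate buffer.

```mips
.data
utf8_table:                    # hypothetical: entry i maps U+00C0+i (bytes 0xC3, 0x80+i)
.ascii "AAAAAA"                # U+00C0..U+00C5
.byte 0                        # U+00C6 (AE ligature): no single base letter
.ascii "CEEEEIIII"             # U+00C7..U+00CF
.ascii "DNOOOOO"               # U+00D0..U+00D6
.byte 0                        # U+00D7 (multiplication sign)
.ascii "OUUUUY"                # U+00D8..U+00DD
.byte 0,0                      # U+00DE, U+00DF
.ascii "aaaaaa"                # U+00E0..U+00E5
.byte 0                        # U+00E6
.ascii "ceeeeiiii"             # U+00E7..U+00EF
.ascii "dnooooo"               # U+00F0..U+00F6
.byte 0                        # U+00F7 (division sign)
.ascii "ouuuuy"                # U+00F8..U+00FD
.byte 0                        # U+00FE
.ascii "y"                     # U+00FF

src: .byte 0x50,0x6F,0x6B,0xC3,0xA9,0x6D,0x6F,0x6E,0   # "Pokémon" as UTF-8 bytes
dst: .space 16                 # output buffer, large enough for this input

.text
main:
la $a0,src                     # read pointer
la $a1,dst                     # write pointer
li $t1,0xC3                    # lead byte for U+00C0..U+00FF

copy:
lbu $t0,0($a0)                 # current byte
addiu $a0,$a0,1
bne $t0,$t1,store              # not the 0xC3 lead byte: copy unchanged
lbu $t2,0($a0)                 # peek at the continuation byte
addiu $t2,$t2,-128             # table index, 0..63 if well-formed
sltiu $t3,$t2,64
beqz $t3,store                 # not a continuation byte: copy 0xC3 unchanged
la $t3,utf8_table
addu $t3,$t3,$t2
lbu $t4,0($t3)                 # base letter, or 0 if there is none
beqz $t4,store                 # no base letter: both bytes pass through as-is
move $t0,$t4                   # emit the base letter instead
addiu $a0,$a0,1                # consume the continuation byte
store:
sb $t0,0($a1)
addiu $a1,$a1,1
bnez $t0,copy                  # stop once the terminating zero has been copied

la $a0,dst
li $v0,4                       # print_string
syscall                        # prints "Pokemon"
li $v0,10                      # exit
syscall
```

Accented letters outside U+00C0 to U+00FF (for example "ő" or "ž") use other lead bytes and would need additional tables; this sketch passes them through unchanged.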

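For extended ASCII (a single-byte encoding such as ISO-8859-1, where each accented letter is one byte with a value above 127), I whipped this up and it should work: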

.data

ascii_strip_accent_table:
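# index: character code (U+0080..U+00FF) minus 128
.space 0x40 # zero padding for 0x80..0xBF; the table doesn't really start until U+00C0
.ascii "AAAAAA" # 0xC0..0xC5: there are six accented capital As, not five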
.byte 0xC6
.ascii "C"
.ascii "EEEE"
.ascii "IIII"
.ascii "D"
.ascii "N"
.ascii "OOOOO" # these are capital Os, not zeroes
.byte 0xD7
.ascii "O"  # this is a capital O, not a zero
.ascii "UUUU"
.ascii "Y"
.byte 0xDE,0xDF
.ascii "aaaaaa" # 0xE0..0xE5: six accented lowercase as
.byte 0xE6
.ascii "c"
.ascii "eeee"
.ascii "iiii"
.ascii "d"
.ascii "n"
.ascii "ooooo"
.byte 0xF7
.ascii "o"  
.ascii "uuuu"
.ascii "y"
.byte 0xFE
.ascii "y"

MyString:
.asciiz "Pokémon"
.text

la $a0,ascii_strip_accent_table
la $a1,MyString
li $t2,128

loop:
lbu $t0,($a1)             # read from string
beqz $t0,done           
bltu $t0,$t2,continue     # if char < 128, skip
   subu $t0,$t0,$t2       # subtract 128 to get array index
   move $a2,$a0           # backup table base
   addu $a2,$a2,$t0       # add array index to table base 
   lbu $t1,($a2)          # load replacement from table
   beqz $t1,continue      # zero entry (0x80..0xBF padding): leave the byte alone
   sb $t1,($a1)           # store in string
continue:
addiu $a1,$a1,1           # advance the string pointer (not the table base)
j loop

done:
li $v0,10
syscall

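EDIT: Now if you're like me and can't stand unnecessary padding, you can actually remove that .space 0x40 at the beginning if you use la $a0,ascii_strip_accent_table-64 instead. The catch is that any byte in the range 0x80 to 0xBF will then index whatever memory happens to sit just before the table. Whether you're willing to take that risk is up to you.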
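To tie this back to the original question, here is a minimal sketch of a palindrome check that normalizes each byte on the fly instead of rewriting the string first. It is only one way to do it, under these assumptions: the string is ISO-8859-1, the simulator is MARS or SPIM with delayed branching turned off, and the names `fold_table`, `test_str`, and `normalize` are invented for the example. The table here starts directly at 0xC0, so no padding is needed, and case is folded after the accent is stripped.

```mips
.data
fold_table:                    # hypothetical: entry i maps the Latin-1 byte 0xC0+i
.ascii "AAAAAA"                # 0xC0..0xC5
.byte 0xC6                     # no base letter: maps to itself
.ascii "CEEEEIIII"             # 0xC7..0xCF
.ascii "DNOOOOO"               # 0xD0..0xD6
.byte 0xD7
.ascii "OUUUUY"                # 0xD8..0xDD
.byte 0xDE,0xDF
.ascii "aaaaaa"                # 0xE0..0xE5
.byte 0xE6
.ascii "ceeeeiiii"             # 0xE7..0xEF
.ascii "dnooooo"               # 0xF0..0xF6
.byte 0xF7
.ascii "ouuuuy"                # 0xF8..0xFD
.byte 0xFE
.ascii "y"                     # 0xFF

test_str: .byte 0x61,0x68,0xE0,0   # "ahà" as ISO-8859-1 bytes

.text
main:
la $s0,test_str                # left pointer
move $s1,$s0                   # right pointer, walked to the end below
find_end:
lbu $t0,0($s1)
beqz $t0,found_end
addiu $s1,$s1,1
j find_end
found_end:
addiu $s1,$s1,-1               # step back onto the last character

compare:
bgeu $s0,$s1,is_pal            # pointers met or crossed: every pair matched
lbu $a0,0($s0)
jal normalize
move $t4,$v0                   # normalized left character
lbu $a0,0($s1)
jal normalize
bne $t4,$v0,not_pal
addiu $s0,$s0,1
addiu $s1,$s1,-1
j compare

is_pal:
li $a0,1
j print
not_pal:
li $a0,0
print:
li $v0,1                       # print_int: 1 = palindrome, 0 = not
syscall                        # prints 1 for "ahà"
li $v0,10                      # exit
syscall

# normalize: $a0 = input byte, $v0 = accent-stripped, lowercased byte
normalize:
move $v0,$a0
li $t1,0xC0
bltu $v0,$t1,fold_case         # below 0xC0: nothing to strip
subu $t2,$v0,$t1               # table index
la $t3,fold_table
addu $t3,$t3,$t2
lbu $v0,0($t3)
fold_case:
li $t1,0x41                    # 'A'
bltu $v0,$t1,norm_done
li $t1,0x5A                    # 'Z'
bgtu $v0,$t1,norm_done
addiu $v0,$v0,32               # 'A'..'Z' -> 'a'..'z'
norm_done:
jr $ra
```

Comparing normalized bytes leaves the original string untouched, which matters if you still need to print it afterwards. Spaces and punctuation are not skipped here, so a phrase-level palindrome check would need an extra filtering step.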