I have a task:
Given an array of 7 bytes, treat it as an array of eight 7-bit words and count the number of words that contain an odd number of zeros.
Here's my code so far. I have a code sample that checks whether the number of 1's is odd, but I can't work out how to check for an odd number of zeros in a word. How can I do it?
data segment
result dw ?
NB dw 04h, 07h, 14h, 23h, 04h, 38h, 3Fh, 2Ah
data ends
code segment
assume cs: code, ds: data
start: mov ax, data
mov ds, ax
lea bx, NB
mov cx, 7
BEG:
mov al, [bx]
test al, 0ffh
jp OK
add result, 1
OK: loop BEG
quit: mov ax, 4C00h
int 21h
exit: ret
code ends
end start
Your current solution misinterprets the task description!
As it stands, you have given the array 8 word-sized elements, whereas the task wants you to deal with 8 elements that are 7 bits wide and, importantly, those 8 elements have to be packed together in just 7 bytes of memory!
The array in effect is a bitstring that has 56 bits (8*7).
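To make the layout concrete, here is how the eight values from your question could be packed into 7 bytes. The task does not state a bit order, so this is only a sketch under one assumption: word 0 sits in the lowest 7 bits of the bitstring (LSB-first, which is also how bt numbers bits).

```asm
; Assumption: 7-bit word k occupies bits 7*k .. 7*k+6 of the bitstring,
; counted from bit 0 of the first byte (LSB-first).
; Words to pack: 04h, 07h, 14h, 23h, 04h, 38h, 3Fh, 2Ah
data segment
result dw ?
NB  db 84h      ; word0 (7 bits)          + low 1 bit  of word1
    db 03h      ; high 6 bits of word1    + low 2 bits of word2
    db 65h      ; high 5 bits of word2    + low 3 bits of word3
    db 44h      ; high 4 bits of word3    + low 4 bits of word4
    db 0C0h     ; high 3 bits of word4    + low 5 bits of word5
    db 0FDh     ; high 2 bits of word5    + low 6 bits of word6
    db 54h      ; high 1 bit  of word6    + word7 (7 bits)
data ends
```

With this data, only 14h (five zeros) and 3Fh (one zero) have an odd number of zeros, so a correct program should produce 2.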
What does it mean to 'count the number of words with an odd number of zeros in them'?
The 'word' here refers to a group of 7 bits; it's confusing, I know.
A first approach uses the bt (BitTest) instruction to test if a certain bit in a string of bits is ON (CF=1) or OFF (CF=0), followed by cmc to invert the carry.
That code can easily be written without using the cmc instruction (see Peter's comment). Doing so yields a substantial speed gain. On my computer, executing the code a thousand times made the execution time drop from 597 µsec to 527 µsec (11% better).
The cmc-free code tallies based on the inverse condition and subtracts (the lowest bit of) the tally from the maximum achievable result.
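A minimal sketch of such a cmc-free variant. It assumes a 386 or later (for bt), MASM-style syntax, and the LSB-first packing of the eight values from the question. The label names NextWord and NextBit are my own.

```asm
.386                            ; bt needs a 386 or later
data segment use16
NB      db 84h, 03h, 65h, 44h, 0C0h, 0FDh, 54h  ; assumed LSB-first packing
result  dw ?                    ; also keeps the byte after NB addressable
data ends
code segment use16
assume cs:code, ds:data
start:  mov  ax, data
        mov  ds, ax
        mov  dx, 8              ; maximum achievable result
        xor  bx, bx             ; bit index 0..55
NextWord:
        xor  al, al             ; tally of ONE bits in this 7-bit word
        mov  cx, 7
NextBit:
        bt   word ptr NB, bx    ; CF = bit BX of the bitstring
        adc  al, 0              ; tally += CF
        inc  bx
        dec  cx
        jnz  NextBit
        and  ax, 1              ; 1 = odd number of ones = even number of zeros
        sub  dx, ax             ; such a word does not count
        cmp  bx, 56
        jb   NextWord
        mov  result, dx         ; 2 for the data above
        mov  ax, 4C00h
        int  21h
code ends
end start
```

Because each word has 7 bits, an odd tally of ones always means an even number of zeros, which is why subtracting the lowest bit of the tally from 8 gives the wanted count.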
An alternative solution would extract 7 consecutive bits from the bitstring into a byte and (just like you did) inspect the parity flag PF. An odd number of zeros in a group of 7 bits means that there will be a complementary even number of ones (PF=1). This solution is more than 4 times faster than the bt-based code!
Next is an implementation of this:
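A minimal sketch of that idea, under the following assumptions: a 386 or later (for setp), MASM-style syntax, the LSB-first packing of the eight values from the question, and a padding byte on either side of NB. The label NextWord is my own.

```asm
.386                            ; setp needs a 386 or later
data segment use16
        db 0                    ; (*) addressable byte before the array
NB      db 84h, 03h, 65h, 44h, 0C0h, 0FDh, 54h  ; assumed LSB-first packing
        db 0                    ; (*) addressable byte after the array
result  dw ?
data ends
code segment use16
assume cs:code, ds:data
start:  mov  ax, data
        mov  ds, ax
        xor  dx, dx             ; result
        mov  si, offset NB - 1  ; (*) starts one byte before the array
        mov  cl, 8              ; shift count 8, 7, ..., 1
NextWord:
        mov  ax, [si]           ; (*) two bytes that hold the current word
        shr  ax, cl             ; move the 7-bit word to bits 0..6 of AL
        and  al, 7Fh            ; drop the neighbour's bit; PF=1 if even ones
        setp al                 ; AL = 1 when the word has an odd number of zeros
        add  dl, al
        inc  si                 ; next word starts 7 bits on: +1 byte, -1 shift
        dec  cl
        jnz  NextWord
        mov  result, dx         ; 2 for the data above
        mov  ax, 4C00h
        int  21h
code ends
end start
```

Word k starts at bit 7k, which is bit 8-k of the byte at NB-1+k, so stepping the pointer by one byte while decreasing the shift count by one keeps the loop body uniform.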
(*) The above code reads one byte before the array and one byte after the array. We have to make sure that these are addressable.
At the expense of increased code size, we could split off the first and last iterations and not read outside of the array at all.
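One way that split could look, as a sketch only. It assumes a 386 or later (for setp), MASM-style syntax, and the LSB-first packing of the eight values from the question. The label NextWord is my own.

```asm
.386                            ; setp needs a 386 or later
data segment use16
NB      db 84h, 03h, 65h, 44h, 0C0h, 0FDh, 54h  ; assumed LSB-first packing
result  dw ?
data ends
code segment use16
assume cs:code, ds:data
start:  mov  ax, data
        mov  ds, ax
        xor  dx, dx             ; result
        ; First word: bits 0..6 of the first byte
        mov  al, NB
        and  al, 7Fh            ; PF=1 if even number of ones
        setp dl
        ; Words 1..6: each one straddles two bytes inside the array
        mov  si, offset NB
        mov  cl, 7              ; shift count 7, 6, ..., 2
NextWord:
        mov  ax, [si]
        shr  ax, cl
        and  al, 7Fh
        setp al
        add  dl, al
        inc  si
        dec  cl
        cmp  cl, 1
        ja   NextWord           ; leaves the loop with SI = NB+6
        ; Last word: bits 1..7 of the last byte
        mov  al, [si]
        shr  al, 1              ; bit 7 becomes 0; PF reflects the 7-bit word
        setp al
        add  dl, al
        mov  result, dx         ; 2 for the data above
        mov  ax, 4C00h
        int  21h
code ends
end start
```

The first and last words each live entirely inside one byte, so they need a byte-sized read only, and the six middle reads stay within the 7 bytes.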
In case this should be 8086 code, you cannot use the bt or setp instructions. The parity-based solution will then still work with one change: replace the setp-based tally with a jnp that jumps over an increment of the result.
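A sketch of that change in context, with the setp version left as a comment above its replacement. It assumes MASM-style syntax, the LSB-first packing of the eight values from the question, and a padding byte on either side of NB. The labels NextWord and Skip are my own.

```asm
data segment
        db 0                    ; addressable byte before the array
NB      db 84h, 03h, 65h, 44h, 0C0h, 0FDh, 54h  ; assumed LSB-first packing
        db 0                    ; addressable byte after the array
result  dw ?
data ends
code segment
assume cs:code, ds:data
start:  mov  ax, data
        mov  ds, ax
        xor  dx, dx             ; result
        mov  si, offset NB - 1
        mov  cl, 8              ; shift count 8, 7, ..., 1
NextWord:
        mov  ax, [si]
        shr  ax, cl             ; 7-bit word to bits 0..6 of AL
        and  al, 7Fh            ; PF=1 if even number of ones
        ; 386+ version was:  setp al
        ;                    add  dl, al
        jnp  Skip               ; PF=0: even number of zeros, do not count
        inc  dx
Skip:   inc  si
        dec  cl
        jnz  NextWord
        mov  result, dx         ; 2 for the data above
        mov  ax, 4C00h
        int  21h
code ends
end start
```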
PS: Don't use this jnp on anything better than 8086, as it will make the loop more than 10 times slower!