Bits: Search&Replace bit sequences

90 Views Asked by At

Being a high-level programmer myself, I struggle a lot with bitwise operations. I hope what I'm trying to achieve is doable?

Let's say I have one unsigned 8-bits integer - it can be any value. Let's work on two examples: 0xe4 (11100100) and 0xa5 (10100101).

The hardware interprets those as 4 chunks: 11 10 01 00 and 10 10 01 01.

I'm trying to write 3 methods in pure C:

  • Replace all 00 chunks with 01 chunks. (result: 11 10 01 01 and 10 10 01 01)
  • Replace all 01 chunks with 10 chunks. (result: 11 10 10 10 and 10 10 10 10)
  • Replace all 10 chunks with 11 chunks. (result: 11 11 11 11 and 11 11 11 11)

Tried to search for bit-replacing methods, but couldn't bend any to solve this particular requirement. Where should I start?

3

There are 3 best solutions below

4
Tom Karzes On BEST ANSWER

Here are some fast, non-looping one-liners that do what you want:

unsigned set_00_to_01(unsigned x) {
    return x + ((~x >> 1) & ~x & 0x55);
}

unsigned set_01_to_10(unsigned x) {
    return x + ((~x >> 1) & x & 0x55);
}

unsigned set_10_to_11(unsigned x) {
    return x + ((x >> 1) & ~x & 0x55);
}

Note that they only operate on the low-order 8 bits, but they could easily be changed to operate on more by changing the 0x55 values to 0x5555 or 0x55555555 for instance.

Each function is hard-wired to its specific task, so you can't pass in arbitrary values. Their main advantage is their speed.

0
0___________ On
unsigned replace2Bits(unsigned haystack, unsigned search, unsigned replace, unsigned numchunks)
{
    unsigned mask = 3;
    while(numchunks--)
    {
        if((haystack & mask) == search)
        {
            haystack &= ~mask;
            haystack |= replace;
        }
        mask <<= 2;
        search <<= 2;
        replace <<= 2;
    }
    return haystack;
}

void printbin(unsigned x)
{
    for(unsigned bit = 0x80; bit; bit >>= 1)
    {
        putchar((x & bit) ? '1':'0' );
    }
}

int main()
{
    unsigned char hs1 = 0xe4;
    unsigned char hs2 = 0xa5;

    printbin(hs1);putchar(' ');printbin(hs1 = replace2Bits(hs1,0b00, 0b01, 4)); putchar(' ');
    putchar(' ');printbin(hs1 = replace2Bits(hs1,0b01, 0b10, 4)); putchar(' ');
    putchar(' ');printbin(replace2Bits(hs1,0b10, 0b11, 4)); putchar('\n');

    printbin(hs2);putchar(' ');printbin(hs2 = replace2Bits(hs2,0b00, 0b01, 4)); putchar(' ');
    putchar(' ');printbin(hs2 = replace2Bits(hs2,0b01, 0b10, 4)); putchar(' ');
    putchar(' ');printbin(replace2Bits(hs2,0b10, 0b11, 4)); putchar('\n');
}

https://godbolt.org/z/dcsuuD

0
Jerry Coffin On

First of all, I'm going to assume you're not planning to carry those operations out consecutively in a single operation. If you really do that, the result will always be 0xff, regardless of the input, so you don't need to deal with the input, breaking it into fields, or much of anything else. So I'm going to assume that if a pair of bits in the input started as 00, then final result for that field should be 01, likewise if it was 01, it should end up as 10, and if it was 10 it should end up as 11 (then later, you might call it again so each takes the next step, and so on).

Second, I'm going to note that those operations are the same as simply incrementing that 2-bit field. That is: 0->1, 1->2, 2->3 (and 3, we presumably leave alone, though you didn't specify).

One easy way to get to a few bits in a C data type is to use its bitfield support. In this case, we'll use it in conjunction with a union, so we have one member of the union that gives us access to individual fields, and the other to the entire byte as a unit.

#include <stdio.h>

// depending on how new it is, your compiler may already define this:
typedef unsigned char uint8_t;

struct Foo {
    uint8_t a : 2;
    uint8_t b : 2;
    uint8_t c : 2;
    uint8_t d : 2;
};

typedef union { 
    unsigned char whole;
    struct Foo parts;
} val;

So now that we have access to the pieces, let's write a little code to demonstrate putting it to use:

int main(void) {
    val a = {.whole = 0xe4};

    // replacing 00 with 01, 01 with 10 and 10 with 11 is incrementing, except for 11
    if (a.parts.a < 3)
        ++a.parts.a;
    if (a.parts.b < 3)
        ++a.parts.b;
    if (a.parts.c < 3)
        ++a.parts.c;
    if (a.parts.d < 3)
        ++a.parts.d;

    printf("%2.2x\n", a.whole);
}

Warning: in theory, writing to one field of a union, then reading from another doesn't give defined results. In reality, however, code that interfaces to hardware does things like this fairly routinely. What varies is the ordering--whether (for example) a.parts.a is the least significant or more significant bits in a.whole. Unfortunately, that can vary not only with the platform, but between different compilers on the same platform. On the other hand, in your case it simply doesn't matter.