Can combination of multiplication, addition, or,xor, not ... induce permutation action on binary digits of integer?

169 Views Asked by At

Setup: Consider integers say int8 (or int16, int32, int64).
Each integer is sequence of 8 binary digits e.g. 7 = 00000111.
There is simple natural action of the permutations of 8 letters on these digits.
The most simple example is cyclic shift - and cyclic shift is supported in most of the programming languages.

Question: Are there examples of other permutations which can be implemented via one or combination of instructions like multiplication by something, addition with something, xor with smth, or with smth , not, etc.. (i.e. those instructions which are supported in most programming languages and even in assembler level)?

To clarify: I am searching for one or combination of commands such that it will act as permutation on ALL binary sequence. I.e. when you apply it to ANY(!) int8 "x" - resulting sequence will be permutation of digits "x" - i.e. the number of "1" and "0" will be the same as in input "x".

Negative example Operation - multiply by say "3" (when overflow - take remainder) - acts as bijection on say int8, but NOT as permutation, because it does not preserve the number of "1" and "0", that means input "x" and output "3x" are not permutations of each other. (For example 3 * 1 = 3 = (00000111) - here are three "1", while input contained only one "1": (00000001).

Positive example Operation - cyclic shift. Input "x", output "shift-x" - is clearly permutation of digits binary digits.

Relaxed versions of the question: It would be interesting if there exists such examples which act as permutations on ONLY subset(s) of numbers containing say 4 - digits "1" (i.e. subsets - preserved by permutations). For examples there are 28 = C²₈ numbers with two digits equal to "1" - is there any instruction (instructions sequence) which acts on that subset of integers as permutation ?

Generalization of the question: combining 2-subsequent digits in one symbol int8 - will be a sequence of 4-symbols ABCD, where each symbol can be 00, 01, 10, 11. We can ask the same question on action realization of action of S₄ on such symbols. And so on.

Motivation: In research math sometimes we need to make computational experiments with permutations, but since n! grows fast we are limited. If there would be some quick way to works with at least S₈, S₁₆, S₃₂, S₆₄: int8, int16, int32, int64 - that might be helpful in some questions.

More context: https://www.kaggle.com/competitions/santa-2023/discussion/473074

3

There are 3 best solutions below

7
Stef On BEST ANSWER

At the very least you can express any permutation

(i8, i7, i6, i5, i4, i3, i2, i1)

by writing:

y = ((x & 1) << i1) | ((x & 2) << (i2-1))
  | ((x & 4) << (i3-2))| ((x & 8) << (i4-3))
  | ((x & 16) << (i5-4)) | ((x & 32) << (i6-5))
  | ((x & 64) << (i7-6))| ((x & 128) << (i8-7));

The (x & poweroftwo) part selects a bit of x, and the << (i - j) part moves it from its old position j to its new position i.

The bitwise or | operators combine all the individual bits. In fact a simple + would do the same thing as | in this case.

I'm abusing notation a little bit: in the formula above, I wrote only left-shifts, but some of the numbers can be negative. In C that would be undefined behaviour, so you have to replace << (i5-4) with >> (4-i5), etc., where necessary.

If a particular permutation has a bit more "logic" to it then you can probably describe this permutation more succinctly than by moving each bit separately.

For instance if the permutation is the union of two disjoint cycles, then you can perform two cyclic shifts and recombine them and this will be shorter to write (and execute) than combining 8 bit shifts.

In fact, all permutations can be decomposed as a union of disjoint cycles, and in most cases this should lead to a slightly more expressive formula than the one above.

However there are 8! = 40320 possible permutations of 8 elements, and the average number of cycles in a random permutation of 8 elements is about 3. And when the cycles consist in adjacent bits, that's super helpful, but when the cycles are intermingled with one cycle of bits 1,2,7 and one cycle of bits 3,5,8 and one cycle of bits 4,6, it's still going to be hard to describe succinctly.

So for most permutations you shouldn't expect the formula to be too compact.

Note that "most" in the previous sentence kinda assumes that you are dealing with a random permutation, selected among all possible permutations. In practical applications it's possible that you deal with "simple/logical" permutations more often than with "complex" permutations, even though the total number of "complex" permutations is higher.

You might be interested in reading about Kolmogorov complexity, a concept invented precisely to describe "how logical / easy to describe" is a mathematical object.

0
gary On

You might be interested in the snoob() function, which permutes the bits of the argument, returning the next higher integer with the "same number of one-bits" (hence the name).

https://github.com/hcs0/Hackers-Delight/blob/master/snoob.c.txt

0
harold On

combination of instructions like multiplication by something, addition with something, xor with smth, or with smth , not, etc..

These are not very powerful, but you can implement for example the steps of a butterfly network and their inverse, which you can compose into a Beneš network, which can implement any permutation (the steering bits are a bit annoying to calculate at runtime, in case you want an arbitrary bit-permuter that you can configure at runtime). That is very general, but not very efficient (though note that this scales much better to eg 64-bit integers than moving each bit individually). For many permutations of practical interest, you can find a special sequence of operations.

You can find more information about that topic (and a tool that searches for specialized sequences of operations for specific permutations) here: http://programming.sirrida.de/bit_perm.html

Going beyond the common set of operations, using AVX512 you can implement any 64-bit bit-permutation in 3 instructions:

uint64_t faster_bit_shuffle(uint64_t w, uint8_t indexes[64]) {
  __m512i as_vec_register = _mm512_set1_epi64(w);
  __mmask64 as_mask = _mm512_bitshuffle_epi64_mask(as_vec_register,
     _mm512_loadu_si512(indexes));
  return _cvtmask64_u64(as_mask);
}

It's very easy to use, you don't even need to calculate any "steering bits", just put in an array of indexes.

At only modestly higher cost, you can apply the same bit-permutation to 8 64-bit integers simultaneously: (heck of a lot of code, but most of it is just gigantic 64-byte constants)

__m512i Transpose8x64(__m512i x)
{
    x = _mm512_permutexvar_epi8(_mm512_setr_epi8(
        56, 48, 40, 32, 24, 16, 8, 0,
        57, 49, 41, 33, 25, 17, 9, 1,
        58, 50, 42, 34, 26, 18, 10, 2,
        59, 51, 43, 35, 27, 19, 11, 3,
        60, 52, 44, 36, 28, 20, 12, 4,
        61, 53, 45, 37, 29, 21, 13, 5,
        62, 54, 46, 38, 30, 22, 14, 6,
        63, 55, 47, 39, 31, 23, 15, 7), x);
    x = _mm512_gf2p8affine_epi64_epi8(_mm512_set1_epi64(0x8040201008040201), x, 0);
    return x;
}

__m512i Transpose64x8(__m512i x)
{
    x = _mm512_shuffle_epi8(x, _mm512_setr_epi8(
        7, 6, 5, 4, 3, 2, 1, 0,
        15, 14, 13, 12, 11, 10, 9, 8,
        23, 22, 21, 20, 19, 18, 17, 16,
        31, 30, 29, 28, 27, 26, 25, 24,
        39, 38, 37, 36, 35, 34, 33, 32,
        47, 46, 45, 44, 43, 42, 41, 40,
        55, 54, 53, 52, 51, 50, 49, 48,
        63, 62, 61, 60, 59, 58, 57, 56));
    x = _mm512_gf2p8affine_epi64_epi8(_mm512_set1_epi64(0x0408201002014080), x, 0);
    x = _mm512_permutexvar_epi8(_mm512_setr_epi8(
        2, 10, 18, 26, 34, 42, 50, 58,
        3, 11, 19, 27, 35, 43, 51, 59,
        7, 15, 23, 31, 39, 47, 55, 63,
        6, 14, 22, 30, 38, 46, 54, 62,
        4, 12, 20, 28, 36, 44, 52, 60,
        5, 13, 21, 29, 37, 45, 53, 61,
        1, 9, 17, 25, 33, 41, 49, 57,
        0, 8, 16, 24, 32, 40, 48, 56), x);
    return x;
}

// bits = 8 x uint64
// permutation = array of bytes specifying source indices of bits
// result = each uint64 bit-permuted according to the permutation
__m512i Any64BitPermute(__m512i bits, __m512i permutation)
{
    bits = Transpose8x64(bits);
    bits = _mm512_permutexvar_epi8(permutation, bits);
    return Transpose64x8(bits);
}

Sadly, at this time, most CPUs do not support AVX512.