Is this fixup to FNV-1a hash a good idea or a bad idea?

89 Views Asked by At
algorithm fnv-1 is
    hash := FNV_offset_basis

    for each byte_of_data to be hashed do
        hash := hash XOR byte_of_data
        hash := hash × FNV_prime
        hash := hash XOR multiply_carry

    return hash 

Where multiply_carry is the high half of the multiplication result that doesn't fit in the variable hash?

Basically there's two problems with FNV; 1) the tendency to increase the number of set bits due to multiplicative bias, and 2) the low bit problem where the low bit is the XOR of only the low bits of the input.

My experience with lesser hashes (namely those that depend on MOD prime number at the end) says that this reliably fixes the bit bias but I'm not actually sure this works here.

Lifting the code from recursive's answer and expanding out the definition of FNV1a yields:

const uint FnvOffsetBasis32 = 2166136261;
const uint FnvPrime32 = 0x01000193;
long totalPop = 0;

for (long i = 0; i <= uint.MaxValue; i++)
{
    uint hash = FnvOffsetBasis32;
    hash ^= (uint)(i) & 255;
    hash *= FnvPrime32;
    hash ^= (uint)(i >> 8) & 255;
    hash *= FnvPrime32;
    hash ^= (uint)(i >> 16) & 255;
    hash *= FnvPrime32;
    hash ^= (uint)(i >> 24) & 255;
    hash *= FnvPrime32;
    //totalPop += uint.PopCount((uint)i * FnvPrime32);
    totalPop += uint.PopCount(hash);
}

const long totalInts = 0x1_0000_0000;

// Expected population: 68,719,476,736
Console.WriteLine("Expected population: {0:0,0}", 16 * totalInts);

// Actual population: 68,719,435,411
Console.WriteLine("Actual population: {0:0,0}", totalPop);

Note despite the fact we're kicking around C# demo code, this question isn't in C#. The actual usage is for page-aligned hashtables.

2

There are 2 best solutions below

0
Joshua On BEST ANSWER

For the next person coming by here; the immediate answer, is no, this doesn't work.

The bit bias in FNV-1a is real. The bit bias in the low 16 bits is worse than in the upper bits.

However, the proposed fix makes it worse. The simple transform that works wonders on mod-prime hashes just doesn't apply; must be something to with the FNV primes.

I'm still looking for really good ides, but they aren't easy to come by.

3
recursive On

There is no bit bias. I've written some test code in C# to do the multiplication step across all possible values of hash for the 32 bit variant of the function. The cumulative result shows an average of precisely 16 bits set per output across all possible inputs.

const int FnvPrime32 = 0x01000193;
long totalPop = 0;

for (long i = int.MinValue; i <= int.MaxValue; i++) {
    totalPop += int.PopCount((int)i * FnvPrime32);
}

const long totalInts = 0x1_0000_0000;

// Expected population: 68,719,476,736
Console.WriteLine("Expected population: {0:0,0}", 16 * totalInts);

// Actual population: 68,719,476,736
Console.WriteLine("Actual population: {0:0,0}", totalPop);

Since there is no bit-bias in the algorithm, no fix-up could be advantageous.