I dont understand why the interger Value "hash" is getting lower in/after the 3 loop.
I would guess this happen because the uint limitation is 2,147,483,647.
BUT... when i try to go step by step the value is equal to 2146134658?.
I´m not that good in math but it should be lower than the limitation.
#define FNV_PRIME_32 16777619
#define FNV_OFFSET_32 2166136261U
unsigned int hash_function(const char *string, unsigned int size)
{
unsigned int str_len = strlen(string);
if (str_len == 0) exit(0);
unsigned int hash = FNV_OFFSET_32;
for (unsigned int i = 0; i < str_len; i++)
{
hash = hash ^ string[i];
// Multiply by prime number found to work well
hash = hash * FNV_PRIME_32;
if (hash > 765010506)
printf("YO!\n");
else
printf("NOO!\n");
}
return hash % size;
}
If you are wondering this if statement is only for me.
if (hash > 765010506)
printf("YO!\n");
else
printf("NOO!\n");
765010506 is the value for hash after the next run through the loop.
All unsigned integer arithmetic in C is modular arithmetic. For
unsigned int, it is moduloUINT_MAX + 1; forunsigned long, moduloULONG_MAX + 1, and so on.(
amodulommeans the remainder ofadivided bym; in C,a % mif bothaandmare unsigned integer types.)On many current architectures,
unsigned intis a 32-bit unsigned integer type, withUINT_MAX == 4294967295.Let's look at what this means in practice, for multiplication (by 65520, which happens to be an interesting value; 216 - 16):
The output is
Sure it can. In fact, you can show mathematically that it happens eventually whenever the multiplier is even, and the modulo is with respect to a power of two (232, here).
Your particular multiplier is odd, however; so, it does not suffer from the above. However, it still wraps around due to the modulo operation. If we retry the same with your multiplier,
16777619, and a bit longer sequence,we get
In fact, it turns out that this sequence is 1,073,741,824 iterations long (before it repeats itself), and will never yield 0, 2, 4, 5, 6, 7, 8, 10, 12, 13, 14, or 15, for example -- that is, if it starts from 1. It even takes 380 iterations to get a result smaller than 16,777,619 (16,689,137).
For a hash function, that is okay. Each new nonzero input changes the state, so the sequence is not "locked". But, there is no reason to expect the hash value increases monotonically as the length of the hashed data increases; it is much better to assume it is "roughly random" instead: not really random, as it depends on the input only, but also not obviously regular-looking.