How can a function that computes Hamming distance be accelerated for larger datasets in PostgreSQL?


I have a PostgreSQL database with more than 10,000 rows, and each row has a bit array of size 10,000. Is there any way to accelerate the Hamming distance calculation over the bit arrays for the whole table? Thanks.

I tried different data types (bytea, text, and numeric) for storing the bit array, and for the Hamming distance I tried XOR operations, text comparison, and numeric addition respectively for each type. But I could not make the function fast enough: it currently takes almost 2 seconds, and the target is 200 milliseconds.
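For reference, the XOR-plus-popcount idea can be sketched in Python (a hedged illustration of the technique, not the actual PostgreSQL function). In PostgreSQL 14 or later, the equivalent for `bit(n)` columns is `bit_count(a # b)`, since `#` is bitwise XOR on bit strings and `bit_count` counts the set bits.

```python
# Illustration only: pack a 10,000-element bit array into one Python integer,
# XOR two rows, and count the differing bits. The names here are made up for
# the sketch, not taken from the question's code.
import random

N_BITS = 10_000

def pack(bits):
    """Pack a list of 0/1 values into a single Python integer."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value

def hamming(a, b):
    """Hamming distance of two equal-length bit arrays packed as integers."""
    return bin(a ^ b).count("1")

random.seed(0)
row1 = [random.randint(0, 1) for _ in range(N_BITS)]
row2 = [random.randint(0, 1) for _ in range(N_BITS)]

d = hamming(pack(row1), pack(row2))
# d equals the number of positions where row1 and row2 differ
assert d == sum(x != y for x, y in zip(row1, row2))
```

Per comparison this is O(n) in the number of bits, so a whole-table scan is still 10,000 rows × 10,000 bits; the per-row constant factor (one XOR plus one popcount over packed words) is about as small as it gets.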

1 Answer

Answered by SQLpro:

It is hard to get good performance for exact Hamming distance on a large table: every comparison has to touch all 10,000 bits of each row, the computation cannot use an ordinary B-tree index, and a full-table scan has a very high memory footprint. https://www.cs.swarthmore.edu/~brody/papers/random14-hamming-distance.pdf

So it is not practical to apply it directly to large datasets inside an RDBMS.

Other comparison techniques exist with lower complexity and a minimal memory footprint. They are not as accurate as the Hamming distance, but they can do a good job, such as the one I wrote:

See "inférence basique" ("basic inference").

You can combine the two: first use the inférence basique filter to reduce the candidate set, then compute the exact Hamming distance on only the few remaining rows.
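The filter-then-refine idea can be sketched as follows. The prefilter used here is an assumption for illustration (it is not the answer's "inférence basique"): since |popcount(a) − popcount(b)| is a lower bound on hamming(a, b), rows whose precomputed popcount differs from the query's by more than the threshold can be skipped without touching their bits.

```python
# Two-stage search sketch: a cheap integer comparison on a precomputed
# popcount column (stage 1), then exact Hamming distance on the survivors
# (stage 2). All names and the threshold are hypothetical.
import random

def popcount(x):
    return bin(x).count("1")

def hamming(a, b):
    return bin(a ^ b).count("1")

random.seed(1)
rows = [random.getrandbits(10_000) for _ in range(1000)]
counts = [popcount(r) for r in rows]   # precomputed once, e.g. a stored column

query = random.getrandbits(10_000)
q_count = popcount(query)
threshold = 4800

# Stage 1: cheap prefilter, no bit arrays touched.
candidates = [r for r, c in zip(rows, counts) if abs(c - q_count) <= threshold]

# Stage 2: exact Hamming distance only on the candidates.
matches = [r for r in candidates if hamming(query, r) <= threshold]
```

The stage-1 bound is safe (it can only discard rows whose exact distance is guaranteed to exceed the threshold), so the two-stage result is identical to scanning the whole table, just cheaper when the filter is selective.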