Database (Postgres) indexing columns containing regex patterns

246 Views Asked by At

Consider a table that stores regular expression patters.

One could query such table passing a given text for records containing patters which match given text.

This can be achieved using inverse regexp match operator ~ (by inverse I mean that text value comes first and then we specify a field containing regexp pattern like in the following example:

DROP TABLE IF EXISTS public.patterntable;
CREATE TABLE IF NOT EXISTS public.patterntable
(
    id bigint NOT NULL,
    pattern text COLLATE pg_catalog."default" NOT NULL
);

INSERT INTO patterntable (id, pattern) VALUES (1, '.*');
INSERT INTO patterntable (id, pattern) VALUES (2, '^dog');
INSERT INTO patterntable (id, pattern) VALUES (3, 'dog$');

SELECT * FROM patterntable WHERE 'x' ~ pattern;

In order to get the results the database engine runs a sequential scan which might be costly - the table can contain lots of records + many fields storing such regex patterns

My question: is there a way to index columns storing regex patterns for such lookups.

  1. in postgres (I am using last version of postgres (15.1))
  2. in any other database engine that is capable of indexing regexp patterns
1

There are 1 best solutions below

4
Erwin Brandstetter On

There is no way to index patterns like that in Postgres. And I seriously doubt there is one in any RDBMS.

An index for the reverse operation, i.e. the common case of indexing strings to be searched, is easily possible with a trigram index, at least for basic patterns. See:

Indexes only work with immutable values. That is not possible for patterns that have to be evaluated against the concrete string to determine whether it matches.