Top N Records Per Group Ordered By Created_At

I have tables of users, posts, and subscriptions, where users can subscribe to other users and see their most recent posts.

This is the query:

SELECT "blog_posts".* 
FROM "blog_posts" 
WHERE (user_id IN (SELECT subscribed_to_id FROM subscriptions WHERE subscriber_id = ?)) 
ORDER BY "blog_posts"."created_at" LIMIT 50

The query is grabbing the users that the current user is subscribed to (subscribed_to_id) and then loading up the 50 most recent posts across those users.

This seems like a variation on the greatest n per group problem, but I haven't had much luck and feel like I must be missing something obvious.

Schema:

Posts:

CREATE TABLE blog_posts (
    id bigint DEFAULT nextval('blog_posts_id_seq'::regclass) PRIMARY KEY,
    message character varying(280) NOT NULL,
    user_id bigint NOT NULL REFERENCES users(id),
    created_at timestamp(6) without time zone NOT NULL,
    updated_at timestamp(6) without time zone NOT NULL
);

-- Indices -------------------------------------------------------

CREATE UNIQUE INDEX blog_posts_pkey ON blog_posts(id int8_ops);
CREATE INDEX index_posts_on_user_id ON blog_posts(user_id int8_ops);
CREATE INDEX idx_posts_user_created ON blog_posts(user_id int8_ops, created_at timestamp_ops DESC);

Users:

CREATE TABLE users (
    id BIGSERIAL PRIMARY KEY,
    email text NOT NULL,
    created_at timestamp(6) without time zone NOT NULL,
    updated_at timestamp(6) without time zone NOT NULL
);

-- Indices -------------------------------------------------------

CREATE UNIQUE INDEX users_pkey ON users(id int8_ops);

Subscriptions:

CREATE TABLE subscriptions (
    id BIGSERIAL PRIMARY KEY,
    subscriber_id bigint REFERENCES users(id),
    subscribed_to_id bigint REFERENCES users(id),
    created_at timestamp(6) without time zone NOT NULL,
    updated_at timestamp(6) without time zone NOT NULL
);

-- Indices -------------------------------------------------------

CREATE UNIQUE INDEX subscriptions_pkey ON subscriptions(id int8_ops);
CREATE INDEX index_subscriptions_on_subscribed_to_id ON subscriptions(subscribed_to_id int8_ops);
CREATE UNIQUE INDEX index_subscriptions_on_subscriber_id_and_subscribed_to_id ON subscriptions(subscriber_id int8_ops,subscribed_to_id int8_ops);
CREATE INDEX index_subscriptions_on_subscriber_id ON subscriptions(subscriber_id int8_ops);

As the number of posts grows this query becomes slow. Right now in my test database I have 10,000 users, and each user has 4,000 posts. When a user is subscribed to 5 other users, this means the query planner will load up all posts for each user using my index_posts_on_user_id index, and then do the ORDER BY in memory to get the most recent 50 results.

As the number of posts per user grows, or the number of subscriptions grows, this will continue to get slower.

Here's an example query plan: https://explain.dalibo.com/plan/gad3g247adbc7gf5

Limit  (cost=74084.78..74084.90 rows=50 width=57) (actual time=119.681..119.696 rows=50 loops=1)
  Buffers: shared hit=16 read=28029 written=3
  ->  Sort  (cost=74084.78..74134.70 rows=19971 width=57) (actual time=119.679..119.687 rows=50 loops=1)
        Sort Key: blog_posts.created_at
        Sort Method: top-N heapsort Memory: 37kB
        Buffers: shared hit=16 read=28029 written=3
        ->  Nested Loop  (cost=51.68..73421.35 rows=19971 width=57) (actual time=1.389..112.820 rows=28000 loops=1)
              Buffers: shared hit=16 read=28029 written=3
              ->  Index Only Scan using index_subscriptions_on_subscriber_id_and_subscribed_to_id on subscriptions subscriptions  (cost=0.29..4.38 rows=5 width=8) (actual time=0.535..0.549 rows=7 loops=1)
                    Index Cond: (subscriptions.subscriber_id = 9999)
                    Buffers: shared hit=1 read=2
              ->  Bitmap Heap Scan on blog_posts blog_posts  (cost=51.39..14643.46 rows=3994 width=57) (actual time=1.295..15.097 rows=4000 loops=7)
                    Recheck Cond: (blog_posts.user_id = subscriptions.subscribed_to_id)
                    Heap Blocks: exact=28000
                    Buffers: shared hit=15 read=28027 written=3
                    ->  Bitmap Index Scan on index_posts_on_user_id  (cost=0.00..50.39 rows=3994 width=0) (actual time=0.693..0.693 rows=4000 loops=7)
                          Index Cond: (blog_posts.user_id = subscriptions.subscribed_to_id)
                          Buffers: shared hit=7 read=35

Planning:
  Buffers: shared hit=24 read=14
Execution time: 119.728 ms

For the user in question there were 7 subscribed-to users, so the bitmap scan step loaded 28,000 rows (7 × 4,000).

I tried expanding the index to include created_at (the idx_posts_user_created index above), but it isn't used: Postgres still loads all of the rows for each user and sorts them to determine which ones are newest.

I then tried a lateral join (CROSS JOIN LATERAL):

SELECT b.*
FROM   subscriptions s
CROSS JOIN LATERAL (
   SELECT *
   FROM blog_posts b
   WHERE  s.subscriber_id = 100 
   AND b.user_id = s.subscribed_to_id
   ORDER  BY created_at DESC LIMIT 50
   ) b
ORDER BY created_at DESC LIMIT 50

Plan: https://explain.dalibo.com/plan/d0c7789a11hbh434

Limit  (cost=10163257.73..10163257.85 rows=50 width=57) (actual time=31.224..31.238 rows=50 loops=1)
  Buffers: shared hit=741
  ->  Sort  (cost=10163257.73..10169483.10 rows=2490150 width=57) (actual time=31.222..31.230 rows=50 loops=1)
        Sort Key: b.created_at DESC
        Sort Method: top-N heapsort Memory: 35kB
        Buffers: shared hit=741
        ->  Nested Loop  (cost=0.56..10080536.74 rows=2490150 width=57) (actual time=0.296..31.139 rows=300 loops=1)
              Buffers: shared hit=741
              ->  Seq Scan on subscriptions s  (cost=0.00..914.03 rows=49803 width=16) (actual time=0.007..4.748 rows=49803 loops=1)
                    Buffers: shared hit=416
              ->  Limit  (cost=0.56..201.39 rows=50 width=57) (actual time=0.000..0.000 rows=0 loops=49803)
                    Buffers: shared hit=325
                    ->  Result  (cost=0.56..16042.46 rows=3994 width=57) (actual time=0.000..0.000 rows=0 loops=49803)
                          Buffers: shared hit=325
                          ->  Index Scan using idx_posts_user_created on blog_posts b  (cost=0.56..16042.46 rows=3994 width=57) (actual time=0.007..0.041 rows=50 loops=6)
                                Index Cond: (b.user_id = s.subscribed_to_id)
                                Buffers: shared hit=325
Execution time: 31.267 ms

This decreases the number of rows read from the blog_posts table and seems quite a bit faster, but because the subscriber filter sits inside the lateral subquery it reads all ~50k rows from the subscriptions table instead, and the planner assigns it a very high estimated cost.
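One variant (a sketch only, not the query behind either plan above) moves the subscriber filter out of the lateral subquery, so the planner can use the subscriber index on subscriptions rather than scanning the whole table; each lateral branch still reads up to 50 rows per subscribed-to user:

SELECT b.*
FROM   subscriptions s
CROSS JOIN LATERAL (
   SELECT *
   FROM   blog_posts bp
   WHERE  bp.user_id = s.subscribed_to_id
   ORDER  BY bp.created_at DESC
   LIMIT  50
   ) b
WHERE  s.subscriber_id = 100
ORDER  BY b.created_at DESC
LIMIT  50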

Is there a way to optimize this solely with SQL so that the number of rows read remains somewhat bounded while still getting the most recent 50 posts?

There is 1 answer below.

Answer from jjanes:

You specified 'solely with SQL', and I think the answer to that is probably 'no'. There is an optimization that can work, but just doing the lateral join won't reach it, as it requires a more verbose way of writing the query. You would need to first run a query against the subscriptions table, then use the results to construct something like this:

SELECT "blog_posts".* FROM (
   select * from "blog_posts" where user_id=17 order by created_at limit 50
   union_all 
   select * from "blog_posts" where user_id=23 order by created_at limit 50
   union_all
   select * from "blog_posts" where user_id=42 order by created_at limit 50
   -- union all ... so on
) order by created_at limit 50

This would require an index consisting of (or leading with) the columns (user_id, created_at); your existing idx_posts_user_created qualifies. Postgres would then execute each branch of the UNION ALL in an interleaved fashion and "merge append" the results, stopping as soon as it had obtained the needed number of rows.

You could define a set-returning function that runs the "inner" query against subscriptions and uses the results to dynamically concatenate this monster query (the format() function can help do that safely), then executes it with dynamic SQL and returns the result. It might be your best solution, but it does kind of seem like cheating to call that solely SQL.
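A minimal PL/pgSQL sketch of that idea follows. The function name, parameter names, and the default limit of 50 are illustrative assumptions, not from the original answer, and it orders by created_at DESC to return the newest posts first:

-- Builds one "top N for this author" branch per subscription, then takes the
-- overall top N across all branches via dynamic SQL.
CREATE OR REPLACE FUNCTION recent_posts_for_subscriber(p_subscriber_id bigint, p_limit int DEFAULT 50)
RETURNS SETOF blog_posts
LANGUAGE plpgsql AS $$
DECLARE
    v_sql text;
BEGIN
    SELECT string_agg(
               format('(SELECT * FROM blog_posts WHERE user_id = %s ORDER BY created_at DESC LIMIT %s)',
                      subscribed_to_id, p_limit),
               ' UNION ALL ')
      INTO v_sql
      FROM subscriptions
     WHERE subscriber_id = p_subscriber_id
       AND subscribed_to_id IS NOT NULL;   -- skip NULLs so the generated SQL stays valid

    IF v_sql IS NULL THEN
        RETURN;  -- no subscriptions: empty result
    END IF;

    RETURN QUERY EXECUTE format(
        'SELECT * FROM (%s) p ORDER BY created_at DESC LIMIT %s', v_sql, p_limit);
END;
$$;

-- Usage: SELECT * FROM recent_posts_for_subscriber(9999);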

For the last plan you linked to in the comments, you could probably improve it further by getting the final index scan to be an index-only scan. That would entail selecting only the columns you need (not using *) and including all of those columns in the index. It would still not scale as well as the more complicated dynamically constructed query, and it might be awkward to have an index with that many columns and to keep the table's visibility map well-maintained enough for index-only scans to be effective.
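For example, a hypothetical covering index (this assumes PostgreSQL 11+ for INCLUDE, and the column list is a guess at what the feed actually needs):

-- Index-only scans also depend on the visibility map being kept current
-- (regular autovacuum/VACUUM on blog_posts).
CREATE INDEX idx_posts_user_created_covering
    ON blog_posts (user_id, created_at DESC)
    INCLUDE (id, message);

-- The lateral query would then select only those columns, e.g.:
-- SELECT bp.id, bp.user_id, bp.created_at, bp.message ...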