postgres: partial refresh on materialized view?

337 Views Asked by At

I have this materialized view:

                                   Materialized view "public.kwh_tag"
   Column    |       Type       | Collation | Nullable | Default | Storage | Stats target | Description
-------------+------------------+-----------+----------+---------+---------+--------------+-------------
 timest      | date             |           |          |         | plain   |              |
 Zählerstand | double precision |           |          |         | plain   |              |
 kWh         | double precision |           |          |         | plain   |              |
Indexes:
    "kwh_tag_timest_idx" UNIQUE, btree (timest)
View definition:
 SELECT s.timest::date AS timest,
    s.kwh_180 AS "Zählerstand",
    s.kwh_180 - lag(s.kwh_180, 1) OVER (ORDER BY s.timest) AS "kWh"
   FROM ( SELECT DISTINCT ON ((smartmeter.timest::date)) smartmeter.timest,
            smartmeter.kwh_180
           FROM smartmeter
          ORDER BY (smartmeter.timest::date) DESC, smartmeter.timest DESC) s;

Output:

 2023-06-10 |  35965.0285 | 22.8133000000016
 2023-06-11 |   35985.919 | 20.8905000000013
 2023-06-12 |  36012.7307 | 26.8116999999984
 2023-06-13 |  36030.2164 | 17.4856999999975

The reason is, that this query takes a lot of time, as the table smartmeter stores data every second (or 2 if the device takes a bit longer).

REFRESH MATERIALIZED VIEW
Time: 94290.866 ms (01:34.291)

However, with each day the refresh of the view also takes more time, is there some better way to handle this?

I tried using CONCURRENTLY but this takes even longer, although only a little bit:

REFRESH MATERIALIZED VIEW CONCURRENTLY kwh_tag;
REFRESH MATERIALIZED VIEW
Time: 111461.192 ms (01:51.461)

Ideally only the most recent day needs to be computed, as the old values are not going to change. I could do that outside of postgres and save it back in, a in-postgres solution would be nicer however.

2

There are 2 best solutions below

0
jonatasdp On

One alternative for this scenario is use timescaledb extension with the continuous aggregates feature.

You can create a continuous aggregation policy that will run only in the fresh data as it uses a watermark to know what was the latest run.

0
Laurenz Albe On

Instead of a materialized view, you could create a partitioned table. Each partition is for the values in a particular time range. Every day, you would use TRUNCATE and INSERT INTO ... SELECT ... to rebuild the latest partition.