Improve performance to get min & max datetime for given time range

45 Views Asked by At

I have following timeseries data table,

Id Object_Id Datetime Sensor1 Sensor2
1 175 '2022-03-24 22:01:00' 0 0
2 175 '2022-03-24 22:02:00' 0 0
3 175 '2022-03-24 22:03:00' 5.5699381666667 81.342836833333
4 175 '2022-03-24 22:04:00' 5.5668366666667 81.281143
... ... ... ... ...
48 175 '2022-03-24 22:48:00' NULL NULL
... ... ... ... ...

in above table, I have sensor1, sensor2 upto sensor1000 columns and each has millions of time series records.

Scenario1:

I want to retrieve starttime(min datetime) and endtime(max datetime) records for multiple sensors eg. sensor1 & sensor2.

I have sensors array like below,

sensors = ["Sensor1", "Sensor2", ..., "sensorN"];

I loop through above array and construct below queries for Sensors eg for Sensor1,

(SELECT   datetime,id,sensor1
 FROM     timeseries
 where datetime between '2021-01-20 22:01:00' and '2024-01-20 00:17:00' limit 1)

Union ALL

(SELECT   datetime,id,sensor1
 FROM     timeseries
 where datetime between '2021-01-20 22:01:00' and '2024-01-20 00:17:00' ORDER BY datetime DESC limit 1)

for Sensor2,

(SELECT   datetime,id,sensor2
 FROM     timeseries
 where datetime between '2021-01-20 22:01:00' and '2024-01-20 00:17:00' limit 1)

Union ALL

(SELECT   datetime,id,sensor2
 FROM     timeseries
 where datetime between '2021-01-20 22:01:00' and '2024-01-20 00:17:00' ORDER BY datetime DESC limit 1)

This query works as expected. Observe that I created individual query for sensor1 and sensor2 as I want to handle failure scenario for individual sensor.

The problem is in real life scenario, I have millions of records, around 50 to 70 GB data.

When I run above query (for just sensor1 & sensor2), it takes significant amount of time to return the result. What if user is trying with multiple sensors (for functional requirement I can select up to 35 sensors in one go). Trust me it takes time and we can observe it in real implementation.

In this demo it works smoothly as It has only 150 records.

How can I improve/optimize the performance of this query?

FYI, for each requested sensor, query is prepared in Node/express app and then app connects to MySQL server where I execute each query using for loop:

const queryResults = await Promise.all(
      queries.map(async (query) => {
        return new Promise((resolve, reject) =>
          db.query(query, [], (err, result) => {
            if (err) {

              return resolve([]);                // If individual query fails then return empty array
            } else {
    
              return resolve(result);
            }
          })
        );
      })
    );  
0

There are 0 best solutions below