MySQL Faster Select Query | C# Faster Load DataTable

835 views

I am using MariaDB. I have a table that I create for every IoT device at the time of its first insertion, via a stored procedure. In case anyone wonders why I create a new table per device: the devices publish data every 5 seconds, and it is impossible for me to store all of it in a single table.

So, my table structure is like below:

CREATE TABLE IF NOT EXISTS `mqttpacket_',device_serial_number,'`(
`data_type_id` int(11) DEFAULT NULL, 
`data_value` int(11) DEFAULT NULL,
`inserted_date` DATE DEFAULT NULL, 
`inserted_time` TIME DEFAULT NULL, 
FOREIGN KEY(data_type_id) REFERENCES datatypes(id), 
INDEX `index_mqttpacket`(`data_type_id`,`inserted_date`)) ENGINE = INNODB;

I have a fairly long SELECT query, like the one below, to fetch the data for the selected types within a date/time range.

SELECT  mqttpacket_123.data_value, datatypes.data_name, datatypes.value_mult,
        CONCAT(mqttpacket_123.inserted_date, ' ',
               mqttpacket_123.inserted_time) AS 'inserted_date_time'
    FROM  mqttpacket_123
    JOIN  datatypes  ON mqttpacket_123.data_type_id = datatypes.id
    WHERE  mqttpacket_123.data_type_id IN(1,2,3,4,5,6)
      AND  CASE WHEN mqttpacket_123.inserted_date = '2021-11-08'
                    THEN mqttpacket_123.inserted_time > '12:25:00'
                WHEN mqttpacket_123.inserted_date = '2021-11-15'
                    THEN mqttpacket_123.inserted_time < '12:25:00'
                ELSE (mqttpacket_123.inserted_date BETWEEN '2021-11-08'
                                                       AND '2021-11-15')
                END;

and this returns around 500K rows, a sample of which is below:

|   data_value  |       data_name       |   value_mult  |   inserted_date_time  |
--------------------------------------------------------------------------------
|       271     |       name_1          |       0.1     |   2021-11-08 12:25:04 |
|       106     |       name_2          |       0.1     |   2021-11-08 12:25:04 |
|       66      |       name_3          |       0.1     |   2021-11-08 12:25:04 |
|       285     |       name_4          |       0.1     |   2021-11-08 12:25:04 |
|       61      |       name_5          |       0.1     |   2021-11-08 12:25:04 |
|       454     |       name_6          |       0.1     |   2021-11-08 12:25:04 |
|       299     |       name_7          |       0.1     |   2021-11-08 12:25:04 |

Affected rows: 0 Found rows: 395,332 Warnings: 0 Duration for 1 query: 0.734 sec. (+ 7.547 sec. network)

I keep only the last 2 weeks' data in my tables and clean up the previous data as I have a backup system.

However, loading the query result into a DataTable takes ~30 sec., which is about four times slower than the query itself in MySQL.

Do you have any suggestions to improve this performance?

PS. I call this query from C# with the following method, through a stored procedure, SP_RunQuery, which takes the query string and executes it as-is.

public DataTable CallStoredProcedureRunQuery(string QueryString)
{
    DataTable dt = new DataTable();

    try
    {
        using (var conn = new MySqlConnection(_connectionString))
        {
            conn.Open();
            using (var cmd = new MySqlCommand("SP_RunQuery", conn))
            {
                cmd.CommandType = CommandType.StoredProcedure;

                cmd.Parameters.Add("@query_string", MySqlDbType.VarChar).Value = QueryString;

                using (MySqlDataAdapter sda = new MySqlDataAdapter(cmd))
                {
                    sda.Fill(dt);
                }
            }
        }
    }
    catch (Exception ex)
    {
        IoTemplariLogger.tLogger.EXC("Call Stored Procedure for RunQuery failed.", ex);
    }
    return dt;
}

EDIT: My sensors push a single MQTT packet that contains ~50 different data points. There are twelve 5-second intervals in a minute, so I receive ~600 rows per minute per device.

Data insertion is done asynchronously in a stored procedure. I push the JSON content along with the device_id, and I iterate over the JSON to parse the values and insert them into the table.

PS. The following code is just for clarification. It works fine.

/*Dynamic SQL -- if the device is registered to the system but has no table, create it.*/
    SET create_table_query = CONCAT('CREATE TABLE IF NOT EXISTS `mqttpacket_',device_serial_number,'`(
        `data_type_id` int(11) DEFAULT NULL,
        `data_value` int(11) DEFAULT NULL,
        `inserted_date` DATE DEFAULT NULL,
        `inserted_time` TIME DEFAULT NULL,
        FOREIGN KEY(data_type_id) REFERENCES datatypes(id),
        INDEX `index_mqttpacket`(`data_type_id`,`inserted_date`)) ENGINE = InnoDB;');
    PREPARE stmt FROM create_table_query;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
    
    /*Loop over the incoming value array. It looks like: $.type_1,$.type_2,$.type_3, so we can iterate over the JSON and reach each value via its path, e.g. $.type_1*/
    WHILE (LOCATE(',', value_array) > 0)
    DO
        SET arr_data_type_name = SUBSTRING_INDEX(value_array,',',1); /*pick first item of value array*/
        SET value_array = SUBSTRING(value_array, LOCATE(',',value_array) + 1); /*remove picked first item from the value_array*/
        
        SELECT JSON_EXTRACT(incoming_data, arr_data_type_name) INTO value_iteration; /*extract value of first item. $.type_1*/
        
        SET arr_data_type_name := SUBSTRING_INDEX(arr_data_type_name, ".", -1); /*Remove the $ and the . to get pure data type name*/
        
        /*Check whether the data type name exists in the table; if not, insert it and assign its id to lcl_data_type_id*/
            IF (SELECT COUNT(id) FROM datatypes WHERE datatypes.data_name = arr_data_type_name) > 0 THEN 
                SELECT id INTO lcl_data_type_id FROM datatypes WHERE datatypes.data_name = arr_data_type_name LIMIT 1;
            ELSE
                SELECT devices.device_type_id INTO lcl_device_type FROM devices WHERE devices.id = lcl_device_id LIMIT 1;
                INSERT INTO datatypes (datatypes.data_name,datatypes.description,datatypes.device_type_id,datatypes.value_mult ,datatypes.inserted_time) VALUES(arr_data_type_name,arr_data_type_name,lcl_device_type,0.1,NOW());
                SELECT id INTO lcl_data_type_id FROM datatypes WHERE datatypes.data_name = arr_data_type_name LIMIT 1;
            END IF;
            
            /*Record which datatypes each device has inserted, so that we don't retrieve datatypes unnecessarily for the selected device*/
            IF (SELECT COUNT(device_id) FROM devicedatatypes WHERE devicedatatypes.device_id = lcl_device_id AND devicedatatypes.datatype_id = lcl_data_type_id) < 1 THEN
                INSERT INTO devicedatatypes (devicedatatypes.device_id, devicedatatypes.datatype_id) VALUES(lcl_device_id,lcl_data_type_id);
            END IF;
            
            SET lcl_insert_mqtt_query = CONCAT('INSERT INTO mqttpacket_',device_serial_number,'(data_type_id,data_value,inserted_date,inserted_time) VALUES(',lcl_data_type_id,',',value_iteration,',''',data_date,''',''',data_time,''');');
            PREPARE stmt FROM lcl_insert_mqtt_query;
            EXECUTE stmt;
            SET affected_data_row_count = affected_data_row_count + 1;
                        
    END WHILE;

Here and here is some additional information about the server and the database, in response to the comments. The server has an SSD. Nothing else of significance runs on it other than my dotnet application and the database.


Answer by Rick James (accepted):
  • It is usually better to have a DATETIME column instead of splitting it into two (DATE and TIME) columns. That might simplify the WHERE clause.

  • Having one table per device is usually a bad idea. Instead, add a column for the device_id.

  • Not having a PRIMARY KEY is a bad idea. Do you ever get two readings in the same second for a specific device? Probably not.

Rolling those together plus some other likely changes, start by changing the table to

CREATE TABLE IF NOT EXISTS `mqttpacket`(
`device_serial_number` SMALLINT UNSIGNED NOT NULL,
`data_type_id` TINYINT UNSIGNED NOT NULL,
`data_value` SMALLINT NOT NULL,
`inserted_at` DATETIME NOT NULL, 
FOREIGN KEY(data_type_id) REFERENCES datatypes(id), 
PRIMARY KEY(device_serial_number, `data_type_id`,`inserted_at`)
) ENGINE = INNODB;

That PK will make the query faster.

This may be what you are looking for after the change to DATETIME:

  AND inserted_at >= '2021-11-08 12:25:00'
  AND inserted_at  < '2021-11-08 12:25:00' + INTERVAL 7 DAY
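Putting the schema change and the simplified range predicate together, the original CASE-based query could collapse into something like this (a sketch; it assumes the single-table schema above, with `device_serial_number = 123` standing in for the old per-device table name):

```sql
SELECT  p.data_value, d.data_name, d.value_mult,
        p.inserted_at AS inserted_date_time
    FROM  mqttpacket AS p
    JOIN  datatypes AS d  ON p.data_type_id = d.id
    WHERE  p.device_serial_number = 123
      AND  p.data_type_id IN (1,2,3,4,5,6)
      AND  p.inserted_at >= '2021-11-08 12:25:00'
      AND  p.inserted_at  < '2021-11-08 12:25:00' + INTERVAL 7 DAY;
```

With PRIMARY KEY(device_serial_number, data_type_id, inserted_at), each value in the IN list becomes a tight index range scan rather than a filter over the whole two-week window.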

To keep 2 weeks' worth of data, DROP PARTITION is an efficient way to do the delete. I would use PARTITION BY RANGE(TO_DAYS(inserted_at)) and have 16 partitions, as discussed in http://mysql.rjweb.org/doc.php/partitionmaint
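A sketch of what that could look like (partition names and dates are illustrative; note that MySQL/MariaDB does not support foreign keys on partitioned tables, so the FK on data_type_id would have to be dropped first):

```sql
ALTER TABLE mqttpacket
PARTITION BY RANGE (TO_DAYS(inserted_at)) (
    PARTITION p20211101 VALUES LESS THAN (TO_DAYS('2021-11-02')),
    PARTITION p20211102 VALUES LESS THAN (TO_DAYS('2021-11-03')),
    -- ... one partition per day, ~16 in total ...
    PARTITION pfuture   VALUES LESS THAN MAXVALUE
);

-- Nightly maintenance: drop the oldest day and carve tomorrow out of pfuture.
ALTER TABLE mqttpacket DROP PARTITION p20211101;
ALTER TABLE mqttpacket REORGANIZE PARTITION pfuture INTO (
    PARTITION p20211116 VALUES LESS THAN (TO_DAYS('2021-11-17')),
    PARTITION pfuture   VALUES LESS THAN MAXVALUE
);
```

DROP PARTITION is effectively instantaneous, whereas a DELETE of a day's worth of rows would churn the buffer pool and the undo log.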

If you are inserting a thousand rows every 5 seconds -- with table-per-device, you would need a thousand threads, each doing one insert; that would be a nightmare for the architecture. With a single table (as I suggest), if you can gather those 1000 rows together in one process at the same time, do one multi-row INSERT every 5 seconds. I discuss other high-speed ingestion techniques elsewhere.
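For illustration, a batched insert under the single-table schema could look like this (the values are made up; in practice the application would accumulate one 5-second window of packets and send them in a single statement):

```sql
INSERT INTO mqttpacket
        (device_serial_number, data_type_id, data_value, inserted_at)
VALUES  (123, 1, 271, '2021-11-08 12:25:04'),
        (123, 2, 106, '2021-11-08 12:25:04'),
        -- ... one row per reading, hundreds per statement ...
        (124, 1, 285, '2021-11-08 12:25:05');
```

One round trip and one transaction commit per batch, instead of one per row, is where most of the speedup comes from.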

Answer by Wilson Hauck:

Rate Per Second = RPS

Suggestions to consider for your instance [mysqld] section

innodb_io_capacity=500  # from 200 to use more of available SSD IOPS
innodb_log_file_size=256M  # from 48M to reduce log rotation frequency
innodb_log_buffer_size=128M  # from 16M to reduce log rotation avg 25 minutes
innodb_lru_scan_depth=100  # from 1024 to conserve 90% CPU cycles used for function
innodb_buffer_pool_size=10G  # from 128M to reduce innodb_data_reads 85 RPS
innodb_change_buffer_max_size=50  # from 25 percent to expedite pages created 590 RPhr

Observation: innodb_flush_method=O_DIRECT  # from fsync; O_DIRECT is the method typically used on Linux systems

You should find these significantly improve task completion performance. View my profile for free downloadable utility scripts to assist with performance tuning.
There are additional opportunities to tune global variables.