I am using MariaDB. I have a table that I create for every IoT device at the time of the first insertion with a stored procedure. If anyone wonders Why I create a new table for every device is devices publish data every 5 seconds and it is impossible for me to store all of them in a single table.
So, my table structure is like below:
CREATE TABLE IF NOT EXISTS `mqttpacket_',device_serial_number,'`(
`data_type_id` int(11) DEFAULT NULL,
`data_value` int(11) DEFAULT NULL,
`inserted_date` DATE DEFAULT NULL,
`inserted_time` TIME DEFAULT NULL,
FOREIGN KEY(data_type_id) REFERENCES datatypes(id),
INDEX `index_mqttpacket`(`data_type_id`,`inserted_date`)) ENGINE = INNODB;
I have a very long SELECT query like below to fetch the data between selected type, date, and time.
SELECT mqttpacket_123.data_value, datatypes.data_name, datatypes.value_mult,
CONCAT(mqttpacket_123.inserted_date, ' ',
mqttpacket_123.inserted_time) AS 'inserted_date_time'
FROM mqttpacket_123
JOIN datatypes ON mqttpacket_123.data_type_id = datatypes.id
WHERE mqttpacket_123.data_type_id IN(1,2,3,4,5,6)
AND CASE WHEN mqttpacket_123.inserted_date = '2021-11-08'
THEN mqttpacket_123.inserted_time > '12:25:00'
WHEN mqttpacket_123.inserted_date = '2021-11-15'
THEN mqttpacket_123.inserted_time< '12:25:00'
ELSE (mqttpacket_123.inserted_date BETWEEN '2021-11-08'
AND '2021-11-15')
END;
and this returns around 500k records of the sample below:
| data_value | data_name | value_mult | inserted_date_time |
--------------------------------------------------------------------------------
| 271 | name_1 | 0.1 | 2021-11-08 12:25:04 |
| 106 | name_2 | 0.1 | 2021-11-08 12:25:04 |
| 66 | name_3 | 0.1 | 2021-11-08 12:25:04 |
| 285 | name_4 | 0.1 | 2021-11-08 12:25:04 |
| 61 | name_5 | 0.1 | 2021-11-08 12:25:04 |
| 454 | name_6 | 0.1 | 2021-11-08 12:25:04 |
| 299 | name_7 | 0.1 | 2021-11-08 12:25:04 |
Affected rows: 0 Found rows: 395,332 Warnings: 0 Duration for 1 query: 0.734 sec. (+ 7.547 sec. network)
I keep only the last 2 weeks' data in my tables and clean up the previous data as I have a backup system.
However, Loading the query result to DataTable also takes ~30sec. which is 4 times slower than MySQL.
Do you have any suggestions to improve this performance?
PS. I call this query from C# by the following statement in a Stored Procedure of RunQuery which takes the query and performs it as it is.
public DataTable CallStoredProcedureRunQuery(string QueryString)
{
DataTable dt = new DataTable();
try
{
using (var conn = new MySqlConnection(_connectionString))
{
conn.Open();
using (var cmd = new MySqlCommand("SP_RunQuery", conn))
{
cmd.CommandType = CommandType.StoredProcedure;
cmd.Parameters.Add("@query_string", MySqlDbType.VarChar).Value = QueryString;
using (MySqlDataAdapter sda = new MySqlDataAdapter(cmd))
{
sda.Fill(dt);
}
}
}
}
catch (Exception ex)
{
IoTemplariLogger.tLogger.EXC("Call Stored Procedure for RunQuery failed.", ex);
}
return dt;
}
EDIT: My sensors push a single MQTT packet which contains ~50 different data. There are 12 times
5secondsin a minute. So, basically, I receive ~600 rows per minute per device.
Data insertion is done in a Stored Procedure async. I push the JSON content along with the device_id and I iterate on the JSON to parse and insert into the table.
PS. The following code is just for clarification. It works fine.
/*Dynamic SQL -- IF they are registered to the system but have notable, create it.*/
SET create_table_query = CONCAT('CREATE TABLE IF NOT EXISTS `mqttpacket_',device_serial_number,'`(`data_type_id` int(11) DEFAULT NULL, `data_value` int(11) DEFAULT NULL,`inserted_date` DATE DEFAULT NULL, `inserted_time` TIME DEFAULT NULL, FOREIGN KEY(data_type_id) REFERENCES datatypes(id), INDEX `index_mqttpacket`(`data_type_id`,`inserted_date`)) ENGINE = InnoDB;');
PREPARE stmt FROM create_table_query;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
/*Loop into coming value array. It is like: $.type_1,$.type_2,$.type_3, to iterate in the JSON. We reach each value like $.type_1*/
WHILE (LOCATE(',', value_array) > 0)
DO
SET arr_data_type_name = SUBSTRING_INDEX(value_array,',',1); /*pick first item of value array*/
SET value_array = SUBSTRING(value_array, LOCATE(',',value_array) + 1); /*remove picked first item from the value_array*/
SELECT JSON_EXTRACT(incoming_data, arr_data_type_name) INTO value_iteration; /*extract value of first item. $.type_1*/
SET arr_data_type_name := SUBSTRING_INDEX(arr_data_type_name, ".", -1); /*Remove the $ and the . to get pure data type name*/
/*Check the data type name exists or not in the table, if not insert and assign it's id to lcl_data_type_id*/
IF (SELECT COUNT(id) FROM datatypes WHERE datatypes.data_name = arr_data_type_name) > 0 THEN
SELECT id INTO lcl_data_type_id FROM datatypes WHERE datatypes.data_name = arr_data_type_name LIMIT 1;
ELSE
SELECT devices.device_type_id INTO lcl_device_type FROM devices WHERE devices.id = lcl_device_id LIMIT 1;
INSERT INTO datatypes (datatypes.data_name,datatypes.description,datatypes.device_type_id,datatypes.value_mult ,datatypes.inserted_time) VALUES(arr_data_type_name,arr_data_type_name,lcl_device_type,0.1,NOW());
SELECT id INTO lcl_data_type_id FROM datatypes WHERE datatypes.data_name = arr_data_type_name LIMIT 1;
END IF;
/*To retrieve the table of which device has which datatypes inserted, this is to not to retrieve the datatypes unneccesseraly for the selected device*/
IF (SELECT COUNT(device_id) FROM devicedatatypes WHERE devicedatatypes.device_id = lcl_device_id AND devicedatatypes.datatype_id = lcl_data_type_id) < 1 THEN
INSERT INTO devicedatatypes (devicedatatypes.device_id, devicedatatypes.datatype_id) VALUES(lcl_device_id,lcl_data_type_id);
END IF;
SET lcl_insert_mqtt_query = CONCAT('INSERT INTO mqttpacket_',device_serial_number,'(data_type_id,data_value,inserted_date,inserted_time) VALUES(',lcl_data_type_id,',',value_iteration,',''',data_date,''',''',data_time,''');');
PREPARE stmt FROM lcl_insert_mqtt_query;
EXECUTE stmt;
SET affected_data_row_count = affected_data_row_count + 1;
END WHILE;
Here and here are also extra information that can be found of the server and database regarding the comments.
I have an SSD on the server. There is nothing important else that works other than my dotnet application and database.
It is usually better to have a
DATETIMEcolumn instead of splitting it into two (DATEandTIME) columns. That might simplify theWHEREclause.Having one table per device is usually a bad idea. Instead, add a column for the device_id.
Not having a
PRIMARY KEYis a bad idea. Do you ever get two readings in the same second for a specific device? Probably not.Rolling those together plus some other likely changes, start by changing the table to
That PK will make the query faster.
This may be what you are looking for after the change to
DATETIME:To keep 2 weeks' worth of data,
DROP PARTITIONis an efficient way to do the delete. I would usePARTITION BY RANGE(TO_DAYS(inserted_at))and have 16 partitions, as discussed in http://mysql.rjweb.org/doc.php/partitionmaintIf you are inserting a thousand rows every 5 seconds -- With table-per-device, you would need a thousand threads each doing one insert. This would be a nightmare for the architecture. With a single table (as I suggest), and if you can get the 1000 rows together in a process at the same, time, do one multi-row
INSERTevery 5 seconds. I discuss other high speed ingestion.