I'm trying to find duplicate transmissions in our database, in this case, to find where a transmitter transmitted the same data multiple times in a row (3 or more)
I've tried this query, which gets me close, but returns one record per period of duplicate readings per transmitter, when I only want the most recent duplicate reading period
select transmitter_id,
count(*) as number_of_duplicate_readings,
total_reading,
max(transmissions.created_at) as latest_duplicate_reading
from transmissions
group by transmitter_id, total_reading
having count(*) > 3
order by latest_duplicate_reading desc;
+--------------+----------------------------+-------------+------------------------+
|transmitter_id|number_of_duplicate_readings|total_reading|latest_duplicate_reading|
+--------------+----------------------------+-------------+------------------------+
|205 |50 |0 |2020-08-17 13:44:03 |
|204 |27 |0 |2020-08-17 13:44:01 |
|202 |33 |0 |2020-08-17 13:43:56 |
|200 |27 |0 |2020-08-17 13:43:52 |
|197 |27 |0 |2020-08-17 13:43:45 |
|196 |42 |0 |2020-08-17 13:43:43 |
|194 |43 |0 |2020-08-17 13:43:39 |
|193 |11 |0 |2020-08-17 13:43:36 |
|192 |27 |0 |2020-08-17 13:43:35 |
|505 |37 |830 |2020-08-05 23:59:28 |
+--------------+----------------------------+-------------+------------------------+
If I order by transmitter ID, the issue becomes aparent:
+--------------+----------------------------+-------------+------------------------+
|transmitter_id|number_of_duplicate_readings|total_reading|latest_duplicate_reading|
+--------------+----------------------------+-------------+------------------------+
|2 |25 |300 |2019-09-25 23:09:04 |
|2 |4 |310 |2019-09-29 23:18:58 |
|2 |5 |320 |2019-10-04 23:25:27 |
|3 |7 |15670 |2020-02-06 23:25:14 |
|3 |4 |16990 |2020-02-24 23:08:26 |
|3 |24 |19810 |2020-04-13 23:30:59 |
|3 |15 |19830 |2020-04-28 23:50:35 |
|3 |12 |20160 |2020-05-19 23:33:27 |
|3 |8 |20170 |2020-05-27 15:54:51 |
|3 |6 |20180 |2020-06-21 23:45:22 |
+--------------+----------------------------+-------------+------------------------+
Does anyone know of a way to accomplish what I want in MySQL / MariaDB, or will I need to filter this further in other code to accomplish what I want?
Thanks!
With MySQL 5.1 you have no analytic function (
MAX OVERetc.) available, nor even aWITHclause. The only option I see there is copy & paste. Use about the same query twice, once to find all duplicates, once to check for later duplicates.