I have a large dataset that I am trying to quality control. The parameter I am currently working on is salinity. I am trying to write code that will apply a flag if the salinity drops by more than a certain value (say 1.0 PSU) and returns to normal on the next line of data. My dataset looks something like this:
| datetime | Salinity (PSU) | flag_sal |
|---|---|---|
| 2017-09-01 10:30 | 33.50 | 0 |
| 2017-09-01 10:34 | 32.00 | 0 |
| 2017-09-01 10:36 | 33.50 | 0 |
As you can see, the salinity drops in the second row, but the next row is normal again, so I want to change the flag in the second row from 0 to 1. My dataset is 108106 lines long, so it would be nice to write something that can go through the whole dataset and set 'flag_sal' to 1 for all of these 'bad' data points.
So far, to apply these flags, I have been plotting the data with plotly and zooming in to find all the bad data points, which is very time consuming!
Thanks!
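The rule described above (a drop of more than 1.0 PSU that recovers on the very next row) could be sketched with pandas roughly as follows. This is only a sketch under some assumptions: the data is in a pandas DataFrame with the column names from the table, "returns to normal" is taken to mean the next value is also more than the threshold above the dropped value, and the function name `flag_salinity_spikes` and its `threshold` parameter are made up for illustration.

```python
import pandas as pd


def flag_salinity_spikes(df, col="Salinity (PSU)", flag_col="flag_sal", threshold=1.0):
    """Return a copy of df with flag_col set to 1 on single-row salinity drops.

    Hypothetical helper: a row is flagged when it is more than `threshold`
    below BOTH the previous row and the next row.
    """
    sal = df[col]
    drop = sal.shift(1) - sal       # how far this row fell below the previous row
    recovery = sal.shift(-1) - sal  # how far the next row rises above this row
    # The first and last rows compare against NaN, which evaluates to False,
    # so they are never flagged by this rule.
    is_spike = (drop > threshold) & (recovery > threshold)

    out = df.copy()
    out.loc[is_spike, flag_col] = 1
    return out


# Small example built from the three rows shown in the table.
df = pd.DataFrame({
    "datetime": pd.to_datetime(
        ["2017-09-01 10:30", "2017-09-01 10:34", "2017-09-01 10:36"]
    ),
    "Salinity (PSU)": [33.50, 32.00, 33.50],
    "flag_sal": [0, 0, 0],
})
flagged = flag_salinity_spikes(df)
print(flagged)
```

Because `shift` works on the whole column at once, this should handle all 108106 rows without an explicit loop. It only catches drops that last exactly one row; a dip spanning two or more rows would need a different rule.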