I am trying to implement a data structure to parse data and do data analysis
My data format is as follows
DATE-TIME DATA DATA DATA
31/3/1999 20:40 6 130 19.95
31/3/1999 23:50 3 440 17.95
31/3/1999 23:20 4 300 18.81
31/3/1999 23:30 5 313 18.48
20/7/1999 23:40 20 1000 18.19
31/7/1999 23:40 5 110 18.23
25/5/2023 23:40 0 130 26
22/2/2023 23:20 2 110 33.3
15/2/2023 23:30 3 110 22.2
20/2/2023 23:40 12 110 30.1
31/3/2023 23:40 34 100 0
The requirements is that will have to access the data either by year OR by month for each year.
Usage of MAPS and BST are a MUST, vector is not necessary.
I have already implemeted an algorithm with MAP and BST so currently its like
MAP[1999] = BST<THE DATA RELATED TO 1999>
MAP[2023] = BST<THE DATA RELATED TO 2023>
The issue I am facing is if i want to get example JULY i have to go through the whole tree to find where data.Month == june
I have tried going with something like that
MAP[2023] = BST< VECTOR <JAN> <FEB> <MAR> <APR> <MAY> <JUN> <JUL> >
My rational is that I will store a vector of each month of record inside the BST. But I realised that the BST wouldnt know what record it is and how the store the vector properly. Another issue is that I wouldnt have a way to properly access the MONTH vector from the BST.
Using a map or multiple data structures in this case is not necessary at all. The structure of the data is very predictable so making a
vector<monthly_data>is good enough.Because you know which year and month the first entry in the vector is, calculating the range you are interested in is trivial.
To fetch the data related to a specific year you just calculate the offset to January of that year and retrieve a pointer to that data. The following 11 entries will be the rest of the data that year.
To fetch the data related to a specific month every year you just calculate the offset to the first entry of that month and retrieve a pointer to that data. Every 12th entry after that entry will be data related to that month in the following years.
Note that the above code is just a conceptual example and has not been tested. I have also omitted checks for the edge cases.