Time complexity and a better approach for storing login timestamps


I have a problem: the input contains a logged-in user's id and the timestamp of the login. When a user logs in again, I need to find the number of times that user has logged in within the last x seconds.

input = {(P1, 0), (P2, 1), (P3, 2), (P1,3), (P1,4), (P2,5), (P1,6)}

Q1: In the last 4 seconds, how many times has P1 logged in?

output: P1,2

Q2: In the last 6 seconds for P1:

output: P1,4

To solve this, I initially used a HashMap with person ids as keys and a sorted set of that person's timestamps as each value.

Map<String, TreeSet<Integer>> personMap = new HashMap<>();

personMap.put("P1", new TreeSet<>(List.of(0, 3, 4)));
personMap.put("P2", new TreeSet<>(List.of(1, 5)));
personMap.put("P3", new TreeSet<>(List.of(2)));

Then, on each new entry, we retrieve the TreeSet and count the values that fall within the last 4 seconds of the given time. This works, but I am unsure about the time complexity: what is the complexity of insertion and retrieval, given that the values are TreeSets?
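A minimal sketch of this approach (the class and method names are illustrative, and it assumes the window is inclusive at both ends, since the question's expected outputs leave the boundary semantics ambiguous):

```java
import java.util.*;

class LoginTracker {
    // person id -> sorted set of login timestamps
    private final Map<String, TreeSet<Integer>> personMap = new HashMap<>();

    void login(String person, int time) {
        personMap.computeIfAbsent(person, k -> new TreeSet<>()).add(time);
    }

    // logins in the window [t - x, t]; sizing the tail view walks it, so this is O(k)
    int countRecent(String person, int t, int x) {
        TreeSet<Integer> times = personMap.get(person);
        if (times == null) return 0;
        // tailSet(t - x, true) is a view of all timestamps >= t - x
        return times.tailSet(t - x, true).size();
    }
}
```

With the sample input, countRecent("P1", 6, 4) counts the timestamps {3, 4, 6} and returns 3; whether the window boundaries should be inclusive is exactly the detail the expected outputs above leave unclear.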

Is there a better data structure for storing the timestamps in personMap?


There are 2 answers below.

AudioBubble:

The underlying data structure of TreeSet is a red-black tree, so queries are O(log n).

Compared with HashSet, TreeSet has slightly lower performance: insert, query, and other operations are O(log n) rather than expected O(1).

If you don't need the timestamps kept in sorted order, a hash-based structure is recommended.

Nicolas125841:

Time Complexity:

tldr: I'm pretty sure this takes O(k) per query and O(log k) per insert, where k is the number of timestamps stored for the user.

For each query, you first look up the user's TreeSet in the HashMap, which is expected O(1).

Once you have the user's set, you find the first timestamp within your bound (again O(log k)) and then iterate through the rest of the timestamps in the bound. This part is O(k), because a query reaching far enough back in time ends up iterating through the whole set.

Assuming your queries are random, the per-query cost is dominated by that iteration: O(log(K) + K/2) -> O(K), where K is the size of the user's timestamp set.

Solution:

tldr: Use a prefix sum.

The one way I can think of to optimize this from O(K) to O(log(K)) is to keep a running total of the logins performed by the user and store that total with each timestamp.

Something like:

P1@Time0: 1 total login

P1@Time5: 2 total logins

P1@Time7: 3 total logins

Then, for each query, you look up the latest timestamp outside your range and subtract its login count from the current login count.

Example: You login at time = 9 and query for all logins within 4 seconds prior.

You have:

P1@Time0: 1 total login

P1@Time5: 2 total logins

P1@Time7: 3 total logins

P1@Time9: 4 total logins (just now added)

The query finds Time0, the latest timestamp strictly outside the window, in O(log k):

P1@Time0: 1 total login <- closest time outside 4 seconds

P1@Time5: 2 total logins

P1@Time7: 3 total logins

P1@Time9: 4 total logins

Take 4 - 1 = 3 (from their login counts): you have logged in three times between 4 seconds ago and your current login. (This method counts your current login as well; subtract one from the result if you don't want that.)

Look up a prefix sum for more info on this.
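A sketch of this prefix-sum idea, using a TreeMap from timestamp to running total per user (the class and method names are illustrative; it relies on timestamps arriving in increasing order, per the caveat in the next paragraph):

```java
import java.util.*;

class PrefixLoginTracker {
    // person -> (timestamp -> total logins up to and including that timestamp)
    private final Map<String, TreeMap<Integer, Integer>> counts = new HashMap<>();

    // assumes each new timestamp is later than all previous ones for this person
    void login(String person, int time) {
        TreeMap<Integer, Integer> m = counts.computeIfAbsent(person, k -> new TreeMap<>());
        int total = m.isEmpty() ? 0 : m.lastEntry().getValue();
        m.put(time, total + 1);
    }

    // logins in the window [t - x, t], current login included: O(log k)
    int countRecent(String person, int t, int x) {
        TreeMap<Integer, Integer> m = counts.get(person);
        if (m == null || m.floorEntry(t) == null) return 0;
        int current = m.floorEntry(t).getValue();
        // latest entry strictly before the window start ("closest time outside x seconds")
        Map.Entry<Integer, Integer> outside = m.lowerEntry(t - x);
        int prior = (outside == null) ? 0 : outside.getValue();
        return current - prior;
    }
}
```

Replaying the walkthrough above (logins at times 0, 5, 7, 9), countRecent("P1", 9, 4) takes the running total at Time9 (4), subtracts the total at Time0 (1), and returns 3.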

One note about this: it assumes that any newly added entry has a later timestamp than all previous ones, or that you have the entire set of timestamps up front. Your question sounds a lot like one from LeetCode, and if that's the case, they most likely include a condition like this. Otherwise, this solution might not work for what you need.