I have a dictionary of counts like this:
{1:2, 2:1, 3:1}
I need to calculate q1, median, and q3 from this. It is pretty straight forward for odd numbered arrays but for even cases, I can't seem to figure it out. I want to do it without using any libraries like numpy.
Example:
counts = {
"4": 1,
"1": 2,
"5": 1
}
results = {
"q1": 1,
"median": 2.5,
"q3": 4,
}
I have something along these lines so far but this doesn't handle all cases.
def get_ratings_stats(counts):
""""This function will return min, q1, median, q3 and max value from list of ratings."""
cumulative_sum = 0
cumulative_dict = {}
for key, value in sorted(counts.items()):
cumulative_sum += value
cumulative_dict[key] = cumulative_sum
q1_index = math.floor(cumulative_sum * 0.25)
q3_index = math.ceil(cumulative_sum * 0.75)
median_index = cumulative_sum * 0.5
q1, q3, median = None, None, None
print('indexes: ', q1_index, median_index, q3_index)
for key, sum in cumulative_dict.items():
if not q1 and sum >= q1_index:
q1 = key
if not q3 and sum >= q3_index:
q3 = key
if not median and sum >= median_index:
median = key
OP's code is almost finished as it is, just problems with the final part. Different implementations are exposed and measured the different execution's times.
Timing with the following dataset
Output
Remark on the definition of quartiles: the way quartiles are implemented (as in OP) maybe not be consistent: