I was trying to solve Top K Frequent Words, specifically in O(n Log k) time. Apparently the solution is:
from collections import Counter
from heapq import heappush, heappop
class Pair:
def __init__(self, word, freq):
self.word = word
self.freq = freq
def __lt__(self, p):
return self.freq < p.freq or (self.freq == p.freq and self.word > p.word)
class Solution:
def topKFrequent(self, words: List[str], k: int) -> List[str]:
cnt = Counter(words)
h = []
for word, freq in cnt.items():
heappush(h, Pair(word, freq))
if len(h) > k:
heappop(h)
return [p.word for p in sorted(h, reverse=True)]
I understand most of what's going on here. The only thing that I can't figure out is the __lt__ method. I understand that we are overriding the __lt__ method, but how is it being implemented here? How is it being invoked?
My best guess is most of the invocation is done "under the cover". That is, when new Pair objects are pushed onto the heap, __lt__ will be invoked to compare frequencies and since it returns a boolean, depending on whether the return value is True or False, the heap will adjust accordingly?
But then, is __lt__ being invoked again in the last return statement when we use sorted()?
The problem requires that we return the top K words with the highest frequencies and if there is a tie (words have the same frequency), then sort those words lexicographically. I'm trying to understand where the sorting is exactly done? It seems like both the frequency sorting and lexicographically sorting is done within the __lt__ method?
I walked through the code and used the debugger to see the order of execution of the code. I can't seem to figure out how/when exactly the magic method (__lt__) comes into play and when the lexicographical sorting is take care of.
I've found some previous threads that discuss this very problem but could not find an explanation/answer that sheds some light on the __lt__ method.
ltis lowercaseLTand notIt. lt stands for "less than". the__lt__method is invoked whenever a<comparison is made. For example,x < ywill callx.__lt__(y)under the hood. I am not totally sure, but my guess is that thesorted()function uses__lt__to compare and order items in a list.