top-k-frequent-words

692. Top K Frequent Words

Given an array of strings words and an integer k, return the k most frequent strings.

Return the answer sorted by the frequency from highest to lowest. Sort the words with the same frequency by their lexicographical order.

Example 1:

Input: words = ["i","love","leetcode","i","love","coding"], k = 2 Output: ["i","love"] Explanation: "i" and "love" are the two most frequent words. Note that "i" comes before "love" due to a lower alphabetical order.

Example 2:

Input: words = ["the","day","is","sunny","the","the","the","sunny","is","is"], k = 4 Output: ["the","is","sunny","day"] Explanation: "the", "is", "sunny" and "day" are the four most frequent words, with the number of occurrence being 4, 3, 2 and 1 respectively.

Constraints:

1 <= words.length <= 500
1 <= words[i].length <= 10
words[i] consists of lowercase English letters.
k is in the range [1, The number of **unique** words[i]]

Follow-up: Could you solve it in O(n log(k)) time and O(n) extra space?

class Solution {
public:
    vector<string> topKFrequent(vector<string>& words, int k) {
        // res 最多就 k 的長度
        vector<string> res(k);

        // 計算每個單字的頻率
        unordered_map<string, int> freq;
        for (auto word : words) ++freq[word];

        auto cmp = [](pair<string, int>& a, pair<string, int>& b) {
            // a.second > b.second 比較頻率
            // priority_queue 和 sort custom compare 的順序相反
            // 如果頻率是由小排到大的話，會是 a.second > b.second

            // 如果出現頻率一樣的話，a.second == b.second && a.first < b.first
            // 比較字串
            return a.second > b.second || (a.second == b.second && a.first < b.first);
        };
        priority_queue<pair<string, int>, vector<pair<string, int>>, decltype(cmp)> pq(cmp);

        for (auto f : freq) {
            pq.push(f);
            // 維持 k 的長度，如果太長就 pop 掉，從頻率少的開始 popup
            if (pq.size() > k) pq.pop();
        }

        // 因為這時候的 priority queue 是從小排到大
        // 所以從 res 的最後面開始裝
        for (int i = res.size() - 1; i >= 0; --i) {
            res[i] = pq.top().first; pq.pop();
        }
        return res;
    }
};

T: $O(N)$
S: $O(N)$

692. Top K Frequent Words​

692. Top K Frequent Words