You may have encountered the classic interview question to find the top k frequent items in an array, particularly in O(N)
time, which utilizes a bucket sort.
In this post, we will explore a few ways to implement the solution. We will discuss imperative, functional, and hybrid approaches, as well as linear and logarithmic runtime complexities.
Here is a typical Java implementation that an interviewer would rejoice seeing from the candidate:
public static List<Integer> topKFrequentOrderN(int[] nums, int k) {
final var freqMap = new HashMap<Integer, Integer>();
for (int num : nums) {
freqMap.put(num, freqMap.getOrDefault(num, 0) + 1);
}
@SuppressWarnings("unchecked")
final List<Integer>[] buckets = new List[nums.length + 1];
for (int i = 0; i < buckets.length; ++i) {
buckets[i] = new ArrayList<>();
}
for (var entry : freqMap.entrySet()) {
buckets[entry.getValue()].add(entry.getKey());
}
final var result = new ArrayList<Integer>();
for (int i = buckets.length - 1; i >= 0 && result.size() < k; --i) {
for (var num : buckets[i]) {
result.add(num);
if (result.size() == k) {
break;
}
}
}
return result;
}
Here is the Scala equivalent, which is relatively succinct:
def topKFrequentOrderN(nums: Array[Int], k: Int): Array[Int] = {
val freqMap = mutable.Map[Int, Int]()
nums.foreach { num =>
freqMap.put(num, freqMap.getOrElse(num, 0) + 1)
}
val buckets = ArrayBuffer.fill(nums.length + 1)(ArrayBuffer[Int]())
freqMap.foreach { entry =>
buckets(entry._2).addOne(entry._1)
}
buckets.indices.view.reverse
.flatMap(i => buckets(i))
.take(k)
.toArray
}
Although the above imperative solution runs in O(n)
time, it can be somewhat tricky to understand, especially in Java. Let us try to implement using a functional approach.
Here is a more functional style implementation in Scala, which is pleasing to my eyes:
def topKFrequentNLogN(nums: Array[Int], k: Int): Array[Int] =
nums
.groupBy(identity)
.view
.mapValues(_.length)
.toArray // The frequency map
.sortBy(-_._2) // Instead of creating buckets, sort by the count and take the top k
.take(k)
.map(_._1)
We can implement the same in Java using the Streams API.
import java.util.*;
import java.util.stream.Collectors;
public static List<Integer> topKFrequentNLogN(int[] nums, int k) {
return Arrays.stream(nums)
.boxed()
.collect(Collectors.groupingBy(num -> num, Collectors.counting())) // Create frequency map
.entrySet()
.stream()
.sorted(Map.Entry.<Integer, Long>comparingByValue().reversed()) // Reverse sort by count
.limit(k) // Take top k
.map(Map.Entry::getKey) // Extract the keys
.collect(Collectors.toList()); // Collect to List
}
While the functional implementation is elegant, concise, and comprehensible, it has a time complexity of O(n log n)
, which may not be accepted. The interviewer is looking for linear time complexity. We do too, but are also looking for elegant code.
A hybrid approach of functional and imperative should yield some elegant code with linear time complexity.
The following is a linear-time complexity implementation using a hybrid approach in Scala. While buckets
is mutated, the rest of the code brings in the functional flavor to the mix.
def topKFrequentOrderN(nums: Array[Int], k: Int): Array[Int] = {
val buckets = Array.fill[List[Int]](nums.length + 1)(List.empty[Int])
nums
.groupBy(identity)
.view
.mapValues(_.length)
.toMap
.foreach { case t @ (num, count) =>
buckets(count) = num :: buckets(count)
}
buckets.reverse.flatten.take(k)
}
Here is the equivalent hybrid implementation in Java with linear time complexity:
public static int[] topKFrequentOrderN2(int[] nums, int k) {
@SuppressWarnings("unchecked")
List<Integer>[] buckets =
IntStream
.range(0, nums.length + 1)
.mapToObj(i -> new ArrayList<Integer>())
.toArray(List[]::new);
Arrays
.stream(nums)
.boxed()
.collect(Collectors.groupingBy(num1 -> num1, Collectors.counting())) // Frequency map
.forEach((num, count) -> buckets[count.intValue()].add(num)); // Fill the buckets
return Arrays.stream(buckets)
.flatMap(List::stream)
.limit(k)
.mapToInt(Integer::intValue)
.toArray();
}
Unfortunately, interviewers will not encourage a functional style of implementation as a solution to this problem. In an interview setting, an interviewer expects to see the mechanics of iterating over the array and other low-level concepts. However, I bet you will see the functional implementation in a typical large codebase where readability is paramount compared to the imperative style. Of course, it depends! And it’s all about trade-offs.
P.S: This post was co-authored by Morgen Peschke.