In this quick tutorial, we’ll look at ways in which we can create a word frequency map in Java 8.
Let’s say we have been given a list of names:
List<String> names = Arrays.asList("Sam", "James", "Selena", "James", "Joe", "Sam", "James");
We wish to print the frequency map specifying the frequency count of each name in the list:
{Joe=1, James=3, Selena=1, Sam=2}
Java 8 Streams give us a simple and straightforward solution to this problem. Our code would look similar to:
Map<String, Long> frequencyMap = names.stream()
    .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
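The idea here is to group the names by their own value, using Function.identity() as the classifier, and then count the elements in each group with Collectors.counting().
If you’re new to Java 8 Streams, we recommend checking out our article on Java 8 Streams.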
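To make the snippet above easier to try out, here is a minimal, self-contained sketch of the same approach. The class name FrequencyMapExample and the method name countNames are hypothetical, chosen only for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical wrapper class, used only to make the snippet runnable.
class FrequencyMapExample {

    // Groups equal names together and counts the size of each group.
    static Map<String, Long> countNames(List<String> names) {
        return names.stream()
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> names =
            Arrays.asList("Sam", "James", "Selena", "James", "Joe", "Sam", "James");

        // The returned map makes no ordering guarantee, so entries may print in any order.
        System.out.println(countNames(names));
    }
}
```

Since groupingBy() makes no promise about the iteration order of the resulting map, it is safer to check individual counts than to rely on the printed order.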
Also, if we specifically want a Map<String, Integer>, i.e., with the counts stored as Integer values, we can use:
Map<String, Integer> frequencyMap = names.stream()
    .collect(Collectors.groupingBy(Function.identity(), Collectors.summingInt(val -> 1)));
Collectors.summingInt() sums the int values produced by a given mapper function. Our mapper function simply returns 1 for every element, so the sum for each group equals the number of occurrences of that name.
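The Integer-valued variant can be sketched as a small runnable program in the same way. Again, the class name IntegerFrequencyMapExample and the method name countNames are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical wrapper class, used only to make the snippet runnable.
class IntegerFrequencyMapExample {

    // Maps every element to 1 and sums those ones per group,
    // which yields an Integer count for each distinct name.
    static Map<String, Integer> countNames(List<String> names) {
        return names.stream()
            .collect(Collectors.groupingBy(Function.identity(), Collectors.summingInt(name -> 1)));
    }

    public static void main(String[] args) {
        List<String> names =
            Arrays.asList("Sam", "James", "Selena", "James", "Joe", "Sam", "James");
        System.out.println(countNames(names));
    }
}
```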
Let’s further improve our solution and create a frequency map that ignores letter case. So, for a list of names:
List<String> names = Arrays.asList("Sam", "james", "Selena", "JAMes", "Joe", "sam", "JamES");
Our solution should now ignore letter-casing and return:
{joe=1, selena=1, james=3, sam=2}
We can achieve this with a simple tweak to the above solution: we transform each word in the stream to its lowercase version before performing the grouping operation:
Map<String, Long> frequencyMap = names.stream()
    .map(String::toLowerCase)
    .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
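A runnable sketch of the case-insensitive version might look like the following. The class and method names are hypothetical. It also passes Locale.ROOT to toLowerCase(), which is an assumption added here so that the result does not depend on the JVM's default locale; the snippet above uses the plain String::toLowerCase.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical wrapper class, used only to make the snippet runnable.
class CaseInsensitiveFrequencyExample {

    // Lowercases every name first, so "JAMes" and "james" fall into the same group.
    static Map<String, Long> countNamesIgnoringCase(List<String> names) {
        return names.stream()
            // Assumption: Locale.ROOT keeps lowercasing independent of the default locale.
            .map(name -> name.toLowerCase(Locale.ROOT))
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> names =
            Arrays.asList("Sam", "james", "Selena", "JAMes", "Joe", "sam", "JamES");
        System.out.println(countNamesIgnoringCase(names));
    }
}
```

Note that the keys of the resulting map are the lowercased names, so lookups must use lowercase keys as well.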
We can further tweak the solution based on our specific requirements.
In this mini-tutorial, we learned how to create a word frequency map in Java 8.
Prior to Java 8, writing a method for calculating a word frequency count usually required us to write around 5-7 lines of code. The idea was to insert new keys into the map and keep on incrementing the counters for any word repetitions.
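For comparison, the pre-Java 8 approach described above could be sketched roughly as follows. This is only one possible way to write it, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical wrapper class, used only to make the snippet runnable.
class LegacyFrequencyExample {

    // Inserts a new key with count 1 on first sight and
    // increments the stored counter on every repetition.
    static Map<String, Integer> countNames(List<String> names) {
        Map<String, Integer> frequencyMap = new HashMap<String, Integer>();
        for (String name : names) {
            Integer count = frequencyMap.get(name);
            if (count == null) {
                frequencyMap.put(name, 1);
            } else {
                frequencyMap.put(name, count + 1);
            }
        }
        return frequencyMap;
    }

    public static void main(String[] args) {
        List<String> names =
            Arrays.asList("Sam", "James", "Selena", "James", "Joe", "Sam", "James");
        System.out.println(countNames(names));
    }
}
```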
The Java 8 Streams API makes this solution far more concise, reducing it to practically a one-liner. We hope this helps us appreciate the beauty and power of the Java 8 Streams API.