Java 8 groupingBy Collector

Introduction:

The groupingBy collector, introduced in Java 8, offers functionality similar to the GROUP BY clause in SQL. It groups the elements of a stream by a chosen property and returns the result as a Map.

In this article, we’ll look at various example usages of the groupingBy collector in Java 8. To follow along with this tutorial, we assume you have a basic understanding of Java 8 Stream API and Java 8 Stream Collectors.

GroupingBy Collector Signatures:

Let’s look at available overloaded method signatures for groupingBy():

static <T,K> Collector<T,?,Map<K,List<T>>> 
    groupingBy(Function<? super T,? extends K> classifier)

static <T,K,A,D> Collector<T,?,Map<K,D>>
    groupingBy(Function<? super T,? extends K> classifier,
                Collector<? super T,A,D> downstream)

static <T,K,D,A,M extends Map<K,D>> Collector<T,?,M>
    groupingBy(Function<? super T,? extends K> classifier,
                 Supplier<M> mapFactory,
                 Collector<? super T,A,D> downstream)

Here, we have:

  1. classifier: Classification function used for grouping the elements
  2. downstream: Collector used to perform further reduction over the values associated with a given key
  3. mapFactory: Supplier factory function to be used to create the output Map
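
To make the relationship between the three overloads concrete, here is a minimal sketch that groups a few strings by length, once with the one-argument form and once with the three-argument form. The class and method names are illustrative only. In OpenJDK the shorter overloads delegate to the three-argument one, with HashMap::new and Collectors.toList() as the defaults, although the Javadoc makes no guarantee about the concrete Map or List type returned.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingByOverloads {

    // One-argument form: the values of each group are collected into a List.
    static Map<Integer, List<String>> byLength(List<String> words) {
        return words.stream()
            .collect(Collectors.groupingBy(String::length));
    }

    // Three-argument form with the map factory and downstream collector spelled out.
    static Map<Integer, List<String>> byLengthExplicit(List<String> words) {
        return words.stream()
            .collect(Collectors.groupingBy(String::length,
                     HashMap::new,
                     Collectors.toList()));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("ant", "bee", "wasp", "moth", "hornet");
        // Both calls produce maps with equal contents.
        System.out.println(byLength(words).equals(byLengthExplicit(words))); // true
    }
}
```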

Example Setup:

Let’s assume we have a Student object:

class Student {
    private int id;
    private String name;
    private int age;
    private String country;

    //constructors, getters, setters

    public String toString() {
        return "{"+id+","+name+","+age+","+country+"}";
    }
}

And we maintain a List<Student>:

List<Student> students = new ArrayList<>();
students.add(new Student(1, "Sam", 22, "USA"));
students.add(new Student(2, "Sierra", 21, "Canada"));
students.add(new Student(3, "Ross", 20, "USA"));
students.add(new Student(4, "John", 22, "Canada"));
students.add(new Student(5, "Hannah", 22, "Australia"));

In the examples that follow, we’ll open our stream over this List of students.
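
For readers who want to compile the examples, the following is one possible complete version of the setup. The all-arguments constructor, the getter names, and the StudentSetup wrapper class are assumptions, since the article only hints at them with a comment.

```java
import java.util.Arrays;
import java.util.List;

class StudentSetup {

    // Assumed shape of the Student class: one constructor plus plain getters.
    static class Student {
        private final int id;
        private final String name;
        private final int age;
        private final String country;

        Student(int id, String name, int age, String country) {
            this.id = id;
            this.name = name;
            this.age = age;
            this.country = country;
        }

        int getId() { return id; }
        String getName() { return name; }
        int getAge() { return age; }
        String getCountry() { return country; }

        @Override
        public String toString() {
            return "{" + id + "," + name + "," + age + "," + country + "}";
        }
    }

    // The five students used throughout the article.
    static List<Student> sampleStudents() {
        return Arrays.asList(
            new Student(1, "Sam", 22, "USA"),
            new Student(2, "Sierra", 21, "Canada"),
            new Student(3, "Ross", 20, "USA"),
            new Student(4, "John", 22, "Canada"),
            new Student(5, "Hannah", 22, "Australia"));
    }

    public static void main(String[] args) {
        sampleStudents().forEach(System.out::println);
    }
}
```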

Grouping By A Single Field:

Let’s first try the simplest groupingBy() overload, which accepts only a classifier function. Each distinct value returned by the classifier becomes a key in the resulting Map:

Map<String, List<Student>> studentsByCountry =
  students.stream()
    .collect(Collectors.groupingBy(Student::getCountry));

If we print studentsByCountry to the console, we’ll get output like the following (the key order of the returned Map is not guaranteed):

{Canada=[{2,Sierra,21,Canada}, {4,John,22,Canada}], USA=[{1,Sam,22,USA}, {3,Ross,20,USA}],
 Australia=[{5,Hannah,22,Australia}]}

Collecting Values In A Set:

By default, all the values being mapped to a key are collected into a List<T>. What if we want them to be stored in a Set<T>?

We can achieve this with the second overload, passing Collectors.toSet() as the downstream collector:

Map<String, Set<Student>> studentsByCountry =
 students.stream() 
   .collect(Collectors.groupingBy(Student::getCountry, Collectors.toSet()));
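
One thing worth keeping in mind is that a Set removes duplicates according to equals() and hashCode(). The Student class shown earlier does not override them, so two Student objects with identical fields would still count as distinct. The sketch below uses plain strings, which do compare by value, to show duplicates collapsing within a group. The class name and sample words are illustrative only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class GroupingIntoSets {

    // Groups words by their first character; equal words in a group collapse into one.
    static Map<Character, Set<String>> byInitial(List<String> words) {
        return words.stream()
            .collect(Collectors.groupingBy(w -> w.charAt(0), Collectors.toSet()));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "avocado", "apple", "banana");
        Map<Character, Set<String>> groups = byInitial(words);
        System.out.println(groups.get('a').size()); // 2, the second "apple" is dropped
        System.out.println(groups.get('b').size()); // 1
    }
}
```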

Getting Count Of Grouped Items:

As we now know, the second argument of groupingBy() is another Collector, so it’s pretty easy to get a count map:

//holds {Canada=2, USA=2, Australia=1}
Map<String, Long> studentsByCountryCount =
 students.stream() 
   .collect(Collectors.groupingBy(Student::getCountry, Collectors.counting()));
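
As a self-contained sketch of the same idea, the example below counts plain country names. It also shows two details that are easy to miss: counting() produces a Long rather than an Integer, and a key that never occurred is simply absent from the Map. The class and method names are illustrative.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class CountingGroups {

    // Counts how often each value occurs; counting() always yields a Long.
    static Map<String, Long> frequencies(List<String> values) {
        return values.stream()
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> countries = Arrays.asList("USA", "Canada", "USA", "Canada", "Australia");
        Map<String, Long> counts = frequencies(countries);
        System.out.println(counts.get("Canada"));              // 2
        // Absent keys are not mapped to zero, so supply a default when reading.
        System.out.println(counts.getOrDefault("France", 0L)); // 0
    }
}
```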

Averaging the Grouped Result:

What if we want to find the average age of the students in each country? We can use Collectors.averagingInt() as the downstream collector:

//holds {Canada=21.5, USA=21.0, Australia=22.0}
Map<String, Double> studentsByCountryAvgAge = 
  students.stream()
   .collect(Collectors.groupingBy(Student::getCountry,
             Collectors.averagingInt(Student::getAge)));
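
The following runnable sketch repeats the averaging example with a reduced, hypothetical Student class that holds only the two fields needed here. It also shows Collectors.summarizingInt(), a related JDK collector that could be used when the minimum, maximum, and average are all wanted from one pass. Note that averagingInt() returns a Double, so the result is not truncated to an int.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class AveragingGroups {

    // Hypothetical stand-in for the article's Student, reduced to two fields.
    static class Student {
        final String country;
        final int age;
        Student(String country, int age) { this.country = country; this.age = age; }
        String getCountry() { return country; }
        int getAge() { return age; }
    }

    static List<Student> sample() {
        return Arrays.asList(new Student("USA", 22), new Student("Canada", 21),
            new Student("USA", 20), new Student("Canada", 22), new Student("Australia", 22));
    }

    // Average age per country, computed as a Double.
    static Map<String, Double> averageAge(List<Student> students) {
        return students.stream()
            .collect(Collectors.groupingBy(Student::getCountry,
                     Collectors.averagingInt(Student::getAge)));
    }

    // Count, sum, min, max and average per country gathered together.
    static Map<String, IntSummaryStatistics> ageStats(List<Student> students) {
        return students.stream()
            .collect(Collectors.groupingBy(Student::getCountry,
                     Collectors.summarizingInt(Student::getAge)));
    }

    public static void main(String[] args) {
        System.out.println(averageAge(sample()).get("Canada"));      // 21.5
        System.out.println(ageStats(sample()).get("USA").getMin());  // 20
    }
}
```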

Finding Max/Min Element In Grouped Result:

Collectors.maxBy()/minBy() help us find the maximum/minimum values among the various items in each group:

/*holds {Canada=Optional[{4,John,22,Canada}], USA=Optional[{1,Sam,22,USA}],
 Australia=Optional[{5,Hannah,22,Australia}]}*/
Map<String, Optional<Student>> studentWithMaxAgePerCountry = 
  students.stream() 
    .collect(Collectors.groupingBy(Student::getCountry, 
              Collectors.maxBy(Comparator.comparingInt(Student::getAge))));

studentWithMaxAgePerCountry is a Map that holds the oldest student for each country. Note that each value is an Optional<Student>: maxBy() is a general-purpose collector that must also handle an empty input, even though a group produced by groupingBy() always contains at least one element.

Along similar lines, we can use the minBy() collector to find the minimum element in each group.
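
As a sketch of the minBy() variant, the example below finds the youngest student per country using a reduced, hypothetical Student class. The second method shows one possible way to drop the Optional wrapper by combining minBy() with Collectors.collectingAndThen(). Calling Optional::get is reasonable here only because groupingBy() never creates an empty group.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

class YoungestPerCountry {

    // Hypothetical stand-in for the article's Student class.
    static class Student {
        final String name;
        final int age;
        final String country;
        Student(String name, int age, String country) {
            this.name = name; this.age = age; this.country = country;
        }
        String getName() { return name; }
        int getAge() { return age; }
        String getCountry() { return country; }
    }

    static List<Student> sample() {
        return Arrays.asList(new Student("Sam", 22, "USA"), new Student("Sierra", 21, "Canada"),
            new Student("Ross", 20, "USA"), new Student("John", 22, "Canada"),
            new Student("Hannah", 22, "Australia"));
    }

    // minBy mirrors maxBy: each value is an Optional holding the youngest student.
    static Map<String, Optional<Student>> youngest(List<Student> students) {
        return students.stream()
            .collect(Collectors.groupingBy(Student::getCountry,
                     Collectors.minBy(Comparator.comparingInt(Student::getAge))));
    }

    // Same grouping, with the Optional unwrapped after each group is reduced.
    static Map<String, Student> youngestUnwrapped(List<Student> students) {
        return students.stream()
            .collect(Collectors.groupingBy(Student::getCountry,
                     Collectors.collectingAndThen(
                         Collectors.minBy(Comparator.comparingInt(Student::getAge)),
                         Optional::get)));
    }

    public static void main(String[] args) {
        System.out.println(youngest(sample()).get("USA").get().getName());     // Ross
        System.out.println(youngestUnwrapped(sample()).get("Canada").getName()); // Sierra
    }
}
```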

Mapping Grouped Results:

Collectors.mapping() can also be used along with groupingBy(). Suppose we want to find the maximum age per country. To do so, we’ll write:

//holds {Canada=Optional[22], USA=Optional[22], Australia=Optional[22]}
Map<String, Optional<Integer>> maxAgePerCountry = 
  students.stream()
    .collect(Collectors.groupingBy(Student::getCountry, 
                                     Collectors.mapping(Student::getAge, 
                                        Collectors.maxBy(Integer::compare))));

The mapping(mapper, downstream) collector applies the mapper function to every element it receives and then passes the transformed element to the downstream collector.
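
Another common use of mapping() is to keep only one property of each grouped element. The sketch below, which again relies on a reduced, hypothetical Student class, collects the student names per country instead of the whole objects.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class NamesPerCountry {

    // Hypothetical stand-in for the article's Student, reduced to two fields.
    static class Student {
        final String name;
        final String country;
        Student(String name, String country) { this.name = name; this.country = country; }
        String getName() { return name; }
        String getCountry() { return country; }
    }

    static List<Student> sample() {
        return Arrays.asList(new Student("Sam", "USA"), new Student("Sierra", "Canada"),
            new Student("Ross", "USA"), new Student("John", "Canada"),
            new Student("Hannah", "Australia"));
    }

    // The mapper extracts the name; the downstream toList() gathers names per group.
    static Map<String, List<String>> namesByCountry(List<Student> students) {
        return students.stream()
            .collect(Collectors.groupingBy(Student::getCountry,
                     Collectors.mapping(Student::getName, Collectors.toList())));
    }

    public static void main(String[] args) {
        System.out.println(namesByCountry(sample()).get("USA"));    // [Sam, Ross]
        System.out.println(namesByCountry(sample()).get("Canada")); // [Sierra, John]
    }
}
```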

Grouping By Multiple Fields:

Grouping by more than one field is a slightly more involved operation. Let’s try to group our students by country and then by age:

Map<String, Map<Integer, List<Student>>> studentsByCountryAndAge =
 students.stream()
   .collect(Collectors.groupingBy(Student::getCountry,
              Collectors.groupingBy(Student::getAge)));

If we print the above Map, we’ll have:

{Canada={21=[{2,Sierra,21,Canada}], 22=[{4,John,22,Canada}]},
 USA={20=[{3,Ross,20,USA}], 22=[{1,Sam,22,USA}]}, Australia={22=[{5,Hannah,22,Australia}]}}

Here, we first grouped the students by country and then further grouped each country’s students by age.
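
When a flat Map is preferred over a nested one, an alternative is to combine the fields into a single composite key. The sketch below is one possible approach, not the only one: it uses a List of the two field values as the key, because lists compare by content. The key() helper and the reduced Student class are hypothetical, and a small dedicated key class with equals() and hashCode() would work equally well.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CompositeKeyGrouping {

    // Hypothetical stand-in for the article's Student class.
    static class Student {
        final String name;
        final int age;
        final String country;
        Student(String name, int age, String country) {
            this.name = name; this.age = age; this.country = country;
        }
        String getName() { return name; }
        int getAge() { return age; }
        String getCountry() { return country; }
    }

    static List<Student> sample() {
        return Arrays.asList(new Student("Sam", 22, "USA"), new Student("Sierra", 21, "Canada"),
            new Student("Ross", 20, "USA"), new Student("John", 22, "Canada"),
            new Student("Hannah", 22, "Australia"));
    }

    // Builds the composite key; two keys are equal when country and age both match.
    static List<Object> key(Student s) {
        return Arrays.asList(s.getCountry(), s.getAge());
    }

    // A single-level Map keyed by (country, age), holding the matching names.
    static Map<List<Object>, List<String>> namesByCountryAndAge(List<Student> students) {
        return students.stream()
            .collect(Collectors.groupingBy(CompositeKeyGrouping::key,
                     Collectors.mapping(Student::getName, Collectors.toList())));
    }

    public static void main(String[] args) {
        Map<List<Object>, List<String>> grouped = namesByCountryAndAge(sample());
        System.out.println(grouped.get(Arrays.asList("Canada", 22))); // [John]
        System.out.println(grouped.size());                           // 5
    }
}
```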

Defining Output Map Type:

It’s also possible for us to define the type of Map we would like to be produced as an outcome:

HashMap<String, List<Student>> studentsByCountry =
  students.stream()
    .collect(Collectors.groupingBy(Student::getCountry, 
              HashMap::new,
               Collectors.toList()));
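
A typical reason to supply a map factory is to control key ordering. As a small sketch, passing TreeMap::new keeps the keys sorted, which also makes the printed output deterministic. The class and method names are illustrative.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

class SortedGrouping {

    // TreeMap::new as the map factory yields keys in natural (alphabetical) order.
    static TreeMap<String, Long> sortedCounts(List<String> values) {
        return values.stream()
            .collect(Collectors.groupingBy(Function.identity(),
                     TreeMap::new,
                     Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> countries = Arrays.asList("USA", "Canada", "USA", "Canada", "Australia");
        System.out.println(sortedCounts(countries)); // {Australia=1, Canada=2, USA=2}
    }
}
```

Swapping in LinkedHashMap::new instead would keep the keys in the order in which they were first encountered in the stream.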

Conclusion:

In this tutorial, we learned how to use the Java 8 groupingBy collector to group elements by different attributes.
