I have a Person table in database and the associated domain class looks like below:
public class Person {
private String firstName;
private String secondName;
private String city;
private String country;
private int age;
private int hits;
// accessors
}
Database table will have multiple rows for same person and I need to aggregate the hits for each using java 8.
A person is uniquely identified by following fields: firstName, secondName, city, country
I have written the below logic and it gives me the expected result
public void generateReport(List<PersonDTO> persons) {
Map<String, Integer> personHitCount =
persons
.stream()
.collect(Collectors.groupingBy(l -> getPersonKey(l), Collectors.summingInt(Person::getHits)));
List<Person> reportRecords =
persons
.stream().collect(Collectors.toMap(l -> getPersonKey(l), Function.identity(), (o1, o2) -> o1))
.entrySet()
.stream()
.filter(e -> personHitCount.containsKey(e.getKey()))
.map(e -> transform(e.getValue(), personHitCount.get(e.getKey())))
.collect(Collectors.toList());
reportRecords.stream().forEach(System.out::println);
}
private Person transform(Person personDTO, int count) {
return new Person(
personDTO.getFirstName(),
personDTO.getSecondName(),
personDTO.getCity(),
personDTO.getCountry(),
personDTO.getAge(),
count);
}
private String getPersonKey(Person person) {
return new StringJoiner("-")
.add(person.getFirstName())
.add(person.getSecondName())
.add(person.getCity())
.add(person.getCountry())
.toString();
}
I am not sure, if this is a good and performant way of doing it, as I am looping twice over persons list. Please can you suggest any improvements to this code or a better way of doing it.
Solution with Double Iteration and Same Return Type
You could improve your stream by removing some unnecessary operations, like converting a DTO into a
Stringand filtering (you're filtering for the keys coming from the very map you're streaming...). That would give you a more compact and readable stream.As @daniu suggested in the comment section, a better approach would be to get the total number of hits from SQL, as the language provides aggregate functions for this purpose. However, I assume that perhaps you didn't go this way because your SQL query would look something like this
SELECT firstName, secondName, city, country, SUM(hits) FROM ...leaving behind other fields that are not necessary for the group by but that you need to construct aPersoninstance (this is just a guess).In this solution, I assume that your
PersonDTOclass honors the equals and hashcode contract by providing an implementation based on the attributes:firstName,secondName,cityandcountry.Solution with Single Iteration and Different Return Type
I'm updating my answer to provide a second solution with a more substantial improvement where the elements are iterated only once. However, it requires changing the method's return type from
List<Person>toCollection<Person>(I know that the method was void, but I assume it was printing the results instead of returning just for demonstration’s sake).Changing the return type from
List<Person>toCollection<Person>would imply generalizing to a higher interface in the class hierarchy, asCollectionis a parent ofList, which might not be doable if you've already agreed on a specific contract/interface with other developers. This is why I'm posting this second solution as an update and not as the actual answer.In short, the total numbers of hits are computed using
Personinstances, instead of Integers. Basically, each DTO is used as a key, and whenever two DTOs collide (they're equal) the hits of their correspondingPersonare combined, keeping only the resultingPersoninstance.Demo Version
Here is a demo version at OneCompiler. In the demo, I've assumed that both
PersonandPersonDTOpossess the same set of attributes, since there is no implementation of the DTO class attached. I've also updated the demo with the second solution.