Java arbitrary comparator on Object

240 Views Asked by At

Let's say I am building a TreeSet of an object for which the ordering depends on only one value.

I cannot do

TreeSet<Foo> tree = new TreeSet<>(Comparator.comparingInt(Foo::getX));

because if I add two different Foo objects with the same x then one would replace the other (i.e. if I do tree.add(foo1) and tree.add(foo2), tree.size() will be 1 instead of 2).

I could compare every field of Foo, but I would like two instances of Foo to be considered distinct, even if every field is the same.

One solution that "almost works" would be

TreeSet<Foo> tree = new TreeSet<>(Comparator.comparingInt(Foo::getX).thenComparing(Foo::hashCode));

But this would fail when there is a hash collision.

In summary, I am looking to something similar to

TreeSet<Foo> tree = new TreeSet<>(Comparator.comparingInt(Foo::getX).thenComparing(Foo::getInternalAddress));

But of course we don't have access to such method.

Workarounds

I know there are workarounds:

  • if I don't care about the objects themselves but just how many are in the tree, I can use a multiset TreeMap<Foo, Integer> (and comparing all the fields) that give me how many Foos for a specific x
  • if I do care about the objects (I am making reference equality checks) I can use a different multiset TreeMap<Foo, List<Foo>> (or TreeMap<Integer, List<Foo>> with the key being x). But if there are very few "duplicated" foos, this is a waste of space taken by all the singleton lists.

So while I know there are workarounds with a TreeMap, I still wonder if there is a way to do that with just a TreeSet.

2

There are 2 best solutions below

2
Joachim Sauer On BEST ANSWER

Guava provides Ordering.arbitrary() which you can use to construct your desired Comparator:

Comparator.comparingInt(Foo::getX).thenComparing(Ordering.arbitrary())

Ordering.arbitrary() internally uses System.identityHashCode which almost solves your "order arbitrary, but consider identity" problem, except there's no guarantee that identityHashCode will never collide (trivially, since it's a 32bit value and a Java project can easily create more than 232 objects, given enough memory).

Ordering.arbitrary() uses some neat tricks to still sort objects with colliding identityHashCode values in a consistent (but arbitrary!) way, see its code.

5
Ricola On

With an external library

See @Joachim Sauer's answer and @Stuart Marks's comment : Guava's Ordering.arbitrary().

Without external libraries

Add an id field to Foo that is unique for each instance.

With UUID class

Suggested by @Sweeper in a comment.

class Foo {
  private UUID uuid = UUID.randomUUID();
...
}
...
new TreeSet<>(Comparator.comparingInt(Foo::getX).thenComparing(Foo::getUUID));

The chance of collisions are almost inexistent (https://stackoverflow.com/a/24876263/7424948), but if you are paranoid you can use the custom instance counter below.

Instance counter

class Foo {
  // if only one thread, otherwise use AtomicInteger for thread-safe.
  // assuming we'll get less than Integer.MAX_VALUE instances, otherwise use long.
  private static int counter = 0;
  private int id;

  public Foo(...){
    id = counter++;
  } 
...
}
...
new TreeSet<>(Comparator.comparingInt(Foo::getX).thenComparingInt(Foo::getId));

Advantage over UUID is that it takes less space (32 bits per instance instead of 128 bits) and that there is less computation to get the id.