Making Java identity function composition more efficient

Java has java.util.function.Function.identity(), which returns a function equivalent to the lambda expression t -> t (in fact that is the precise implementation in OpenJDK 17, which I'm looking at at the moment). This means that the returned function is guaranteed to do nothing at all to the input and merely pass it back, as if there were no function to begin with.

So let's say I compose that function with another one using Function.andThen():

Function<String, String> fn = Function.identity();
Function<String, String> composedFn = fn.andThen(String::toUpperCase);
String foo = composedFn.apply("foo");  //yields "FOO"

This is equivalent to the following, and I would assume (which gets me in trouble sometimes, which is why I'm asking this question) that all of the following would occur in the bytecode, i.e. the compiler would not optimize any of the lambda invocations away. (I don't know what the JRE would do at runtime after compiling, though. Could it eliminate the Function.identity() altogether?)

String foo = Function.<String>identity().andThen(String::toUpperCase).apply("foo");

Or the equivalent procedural code:

String foo = Function.<String>identity().apply("foo").toUpperCase();
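
For reference, here is a minimal sketch of what the stock JDK does with this composition. It assumes the OpenJDK default implementation of andThen(), which returns a new lambda that calls both functions in turn. The class name AndThenDemo and the helper compose() are made up for illustration.

```java
import java.util.function.Function;

class AndThenDemo {

  // Builds the composition from the question using the stock JDK identity function.
  static Function<String, String> compose(Function<String, String> after) {
    Function<String, String> fn = Function.identity();
    return fn.andThen(after);
  }

  public static void main(String[] args) {
    Function<String, String> upper = String::toUpperCase;
    Function<String, String> composed = compose(upper);

    System.out.println(composed.apply("foo")); // prints FOO
    // The composed function is a wrapper around both functions, not `upper` itself.
    System.out.println(composed == upper);     // prints false
  }
}
```

So at the source and bytecode level the identity function stays in the chain. Whether the JIT later inlines it away is a separate matter that this sketch does not measure.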

What I don't understand is why we need Function.identity() at all after composition. In other words, what if Function.identity() were implemented like this (somewhat pseudocode, ignoring irrelevant syntax details):

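import java.util.Objects;
import java.util.function.Function;

static <T> Function<T, T> identity() {
  return new Function<T, T>() {

    @Override
    public T apply(T t) {
      return t;
    }

    @Override
    @SuppressWarnings("unchecked")
    public <V> Function<T, V> andThen(Function<? super T, ? extends V> after) {
      // Keep the null check that the default andThen() performs.
      Objects.requireNonNull(after);
      // Unchecked but safe: `after` accepts any T and only ever returns a V.
      return (Function<T, V>) after;
    }

  };
}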

The point is that if the identity function is guaranteed to act as if it did not exist in the chain, can't the composition method andThen() simply return the after function itself, taking the identity function out of the composition chain altogether?

The original code would then be equivalent to the following (pseudocode):

String foo = ((Function<String, String>) String::toUpperCase).apply("foo");

Or the equivalent procedural code:

String foo = "foo".toUpperCase();

Wouldn't this be more efficient? Perhaps the gained efficiency would be minuscule, but if it would provide an efficiency gain with no downsides, couldn't we improve Function.identity() in this way?

Please let me know if I'm missing some reason why this won't work, or if the Function.identity() would be 100% optimized away somehow. Otherwise it seems like something I should submit a ticket for to improve the JDK.

Here is why this is useful: there are many situations in which I may want an optional transformation to be configurable. By default I could simply set the transformation to null, and then do null checking to see if I wanted to add a transformation. But I'd prefer to avoid nulls altogether. It would be better to default to Function.identity() and allow transformations to be added using andThen() without needing to check for null. If the identity function were improved as I suggest, it seems that I would lose zero efficiency by defaulting to Function.identity() rather than defaulting to null and checking for null every time I add a function composition. Without this improvement, it seems I would be stuck with t -> t in the chain. It's not clear to me whether this is optimized away 100%.
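
To make the use case concrete, here is a rough sketch of how this could be done in user code today, without changing the JDK. Both SelfErasingIdentity and TextProcessor are hypothetical names invented for this example. The idea is only that an identity function can override andThen() and compose() to hand back the other function unchanged.

```java
import java.util.Objects;
import java.util.function.Function;

// Hypothetical helper, not part of the JDK: an identity function whose
// composition methods return the other function instead of wrapping it.
final class SelfErasingIdentity<T> implements Function<T, T> {

  private static final Function<Object, Object> INSTANCE = new SelfErasingIdentity<>();

  private SelfErasingIdentity() {}

  @SuppressWarnings("unchecked")
  static <T> Function<T, T> get() {
    return (Function<T, T>) INSTANCE; // safe: the instance holds no state
  }

  @Override
  public T apply(T t) {
    return t;
  }

  @Override
  @SuppressWarnings("unchecked")
  public <V> Function<T, V> andThen(Function<? super T, ? extends V> after) {
    Objects.requireNonNull(after);
    return (Function<T, V>) after; // identity drops out of the chain
  }

  @Override
  @SuppressWarnings("unchecked")
  public <V> Function<V, T> compose(Function<? super V, ? extends T> before) {
    Objects.requireNonNull(before);
    return (Function<V, T>) before; // same idea for compose()
  }
}

// Hypothetical configurable component: the transformation defaults to the
// identity above, so adding steps needs no null checks.
final class TextProcessor {
  private Function<String, String> transform = SelfErasingIdentity.get();

  void addTransformation(Function<String, String> next) {
    transform = transform.andThen(next);
  }

  Function<String, String> transform() {
    return transform;
  }

  String process(String input) {
    return transform.apply(input);
  }

  public static void main(String[] args) {
    TextProcessor processor = new TextProcessor();
    processor.addTransformation(String::toUpperCase);
    System.out.println(processor.process("foo")); // prints FOO
  }
}
```

After the first addTransformation() call the stored function is the added function itself. Later calls go through the regular default andThen() of whatever function is stored at that point.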

Answer by Irremediable:

Function.identity() is useful when a method expects to receive a mapper Function object but no transformation is needed in your use case.

public class Person {
  private String name;

  public Person(String name) {
    this.name = name;
  }

  public String getName() {
    return name;
  }
}

....

public Map<String, Person> getPersonsMap() {
  List<Person> persons = repo.findAll();

  return persons.stream()
      .collect(Collectors.toMap(Person::getName, Function.identity()));
}

Due to the nature of lambda expressions in Java, each occurrence of item -> item leads to the creation of its own implementation class, whereas Function.identity() does not.
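
A small sketch to observe this difference yourself. The class and method names are made up for illustration, and the results reflect how current OpenJDK builds happen to implement lambdas. The language specification does not guarantee either outcome, so no expected output is given.

```java
import java.util.function.Function;

class LambdaIdentityDemo {

  // Compares the runtime classes of two separately written identity lambdas.
  static boolean lambdasShareClass() {
    Function<String, String> first = item -> item;
    Function<String, String> second = item -> item;
    return first.getClass() == second.getClass();
  }

  // Compares the instances returned by two calls to Function.identity().
  static boolean identityCallsShareInstance() {
    Function<String, String> a = Function.identity();
    Function<String, String> b = Function.identity();
    return a == b;
  }

  public static void main(String[] args) {
    // Implementation-dependent results; run it on your own JDK to check.
    System.out.println("separate lambdas share a class: " + lambdasShareClass());
    System.out.println("identity() calls share an instance: " + identityCallsShareInstance());
  }
}
```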

Transformations on optional objects are available through the Optional class, which has methods quite close to the Stream API:

Optional.ofNullable(person.getName())
         .map(name -> name.toUpperCase())
         .orElseThrow(() -> new RuntimeException("Name is null!"));

EDIT: The JLS doesn't guarantee that any kind of optimization will be performed. Interesting mentions:

8.4.3.2 - An instance method is always invoked with respect to an object, which becomes the current object to which the keywords this and super refer during execution of the method body. (andThen is non-static)

8.4.3.3 - exception checks must still be performed when code is optimized, because inlining has to preserve the semantics of the invocation. (andThen can throw NullPointerException)

13.4.22 - the final modifier on a method does not prove that it can be optimized at runtime

So it seems that the JIT will try to optimize Function.identity().andThen(String::toUpperCase) so that it executes like String::toUpperCase, but it is not "optimized away" in terms of CPU ticks, since the lambda's method lookup still has to be performed, together with the additional checks in its body.