Which JVM dereferenced object cleanup implementation works best?


Assume an object K is associated with a scarce system resource (e.g. bound to an open UDP port on localhost, of which only 65535 are available per machine). A JVM application is required to create the object, perform a task with the resource, and release it when a system GC is requested. (Obviously, defining it as an AutoCloseable and using it in a try-with-resources block would be a more efficient option, but that is not the topic of this question.)

As of 2023, there are several implementations available for any JVM language (using Scala 2.13 as an example):


import org.scalatest.funspec.AnyFunSpec

import java.lang.ref.Cleaner
import scala.concurrent.{ExecutionContext, ExecutionContextExecutor, Future}
import scala.ref.{PhantomReference, ReferenceQueue, WeakReference}

class GCCleaningSpike extends AnyFunSpec {

  import GCCleaningSpike._

  describe("System.gc() can dispose unreachable object") {

    it("with finalizer") {

      var v = Dummies._1()

      assertInc {
        v = null
      }
    }

    it("<: class with finalizer") {

      var v = Dummies._2()

      assertInc {
        v = null
      }
    }

    it("registered to a cleaner") {

      @volatile var v = Dummies._3()

      assertInc {
        v = null
      }
    }

    it("registered to a phantom reference cleanup thread") {

      @volatile var v = Dummies._4()

      assertInc {
        v = null
      }
    }

    it("registered to a weak reference cleanup thread") {

      @volatile var v = Dummies._5()

      assertInc {
        v = null
      }
    }
  }
}

object GCCleaningSpike {

  implicit lazy val ec: ExecutionContextExecutor = ExecutionContext.global

  case class WithFinalizer(fn: () => Unit) {

    case class _1() {
      override def finalize(): Unit = fn()
    }

    trait _2Base {
      override def finalize(): Unit = fn()
    }
    case class _2() extends _2Base

    final val _cleaner = Cleaner.create()
    case class _3() extends AutoCloseable {

      final private val cleanable = _cleaner.register(
        this,
        { () =>
          println("\ncleaned\n")
          fn()
        }
      )

      override def close(): Unit = cleanable.clean()
    }

    lazy val phantomQueue = new ReferenceQueue[_4]()
    case class _4() {
      val ref = new PhantomReference(this, phantomQueue)
    }

    val cleaningPhantom: Future[Unit] = Future {
      while (true) {
        val ref = phantomQueue.remove
        fn()
      }
    }

    lazy val weakQueue = new ReferenceQueue[_5]()
    case class _5() {
      val ref = new WeakReference(this, weakQueue)
    }

    val cleaningWeak: Future[Unit] = Future {
      while (true) {
        val ref = weakQueue.remove
        fn()
      }
    }
  }

  // Updated from finalizer/cleaner threads, so it must be visible across threads
  @volatile var count = 0

  val doInc: () => Unit = () => count += 1

  def assertInc(fn: => Unit): Unit = {
    val c1 = count

    fn

    System.gc()

    Thread.sleep(1000)
    val c2 = count

    assert(c2 - c1 == 1)
  }

  object Dummies extends WithFinalizer(doInc)
}
  • Finalizer method (illustrated by class _1()). It works in the above test but was never reliable (e.g. it may not run if the JVM process is terminated), and it has been deprecated since Java 9.

  • Cleanable member registered with a Cleaner (illustrated by class _3()). In the above test, it is not triggered by dereferencing followed by a system GC.

  • PhantomReference/WeakReference member associated with an active ReferenceQueue monitoring thread (illustrated by class _4() and _5() respectively). Likewise, in the above test they are not triggered by dereferencing followed by a system GC. In addition, they both demand an expensive monitoring thread that is blocked/dormant for most of its lifespan. It is unclear whether the thread can be replaced with a low-cost green thread or coroutine.

  • Third-party implementations. Due to the sheer number of options, no test code is given. One notable example is com.google.common.base.internal.Finalizer.

I can't see any of these implementations as perfect, or even functional. Is there a canonical, officially recommended and tested way of doing this?

UPDATE 1: I totally agree that the GC mechanism should not be relied on or abused for managing system resources (e.g. ports or off-heap memory). That is why the cleaning mechanism shown in the test suite is only used in cases where the resource is already bound to a scope/lifespan and will be released regardless of GC (e.g. on JVM termination it is released by a shutdown hook). Still, the resource may be dereferenced and GC'ed long before the end of its lifespan, in which case a slight improvement in memory footprint can be achieved. Unless stated otherwise, let's assume the question is about a valid use case of these features/capabilities rather than about resource management.

Also, the original test suite was written in Scala only for brevity; I'll add a Java version later.

There are 1 best solutions below

Lae On

Assuming you have a valid reason to do so, Cleaner has been the recommended approach since Java 9, as indicated by the Javadoc on Object.finalize():

Deprecated. The finalization mechanism is inherently problematic. Finalization can lead to performance issues, deadlocks, and hangs. Errors in finalizers can lead to resource leaks; there is no way to cancel finalization if it is no longer necessary; and no ordering is specified among calls to finalize methods of different objects. Furthermore, there are no guarantees regarding the timing of finalization. The finalize method might be called on a finalizable object only after an indefinite delay, if at all. Classes whose instances hold non-heap resources should provide a method to enable explicit release of those resources, and they should also implement AutoCloseable if appropriate. The java.lang.ref.Cleaner and java.lang.ref.PhantomReference provide more flexible and efficient ways to release resources when an object becomes unreachable

For example, the JDK's internal NativeBuffer uses a Cleaner as a last resort to release native memory.
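
The last-resort pattern can be sketched roughly as follows: an explicit close() is the primary release path, and the Cleaner only acts as a fallback if close() is never called. This is a minimal sketch and not the JDK's actual NativeBuffer code; ManagedResource, State, RELEASED and the integer handle are hypothetical names used for illustration. The cleanup state lives in a static nested class because the Cleaner Javadoc warns that the cleaning action must not refer to the registered object, otherwise that object never becomes phantom reachable.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical resource holder; names are illustrative, not from the JDK.
class ManagedResource implements AutoCloseable {

    private static final Cleaner CLEANER = Cleaner.create();

    // Counts releases so the demo can observe the cleanup action running.
    static final AtomicInteger RELEASED = new AtomicInteger();

    // Static nested class: it cannot capture a reference to the
    // ManagedResource instance that is being tracked.
    private static final class State implements Runnable {
        private final int handle; // stand-in for a port, file descriptor, etc.

        State(int handle) {
            this.handle = handle;
        }

        @Override
        public void run() {
            // Release the underlying resource identified by `handle` here.
            RELEASED.incrementAndGet();
        }
    }

    private final Cleaner.Cleanable cleanable;

    ManagedResource(int handle) {
        this.cleanable = CLEANER.register(this, new State(handle));
    }

    @Override
    public void close() {
        // Cleanable.clean() runs the action at most once, so a later
        // GC-triggered cleanup will not release the resource a second time.
        cleanable.clean();
    }

    public static void main(String[] args) {
        try (ManagedResource r = new ManagedResource(42)) {
            System.out.println("using resource");
        }
        System.out.println("released " + RELEASED.get() + " time(s)");
    }
}
```

If a cleaning action is written as a lambda that touches a member of the enclosing instance, it may capture that instance and keep it strongly reachable, in which case the Cleaner would never fire for it.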

I'm not a Scala person, so I can't comment on what is going on in your Scala code. Here's an example in Java. Note that System.gc() is only a hint to the JVM and does not guarantee that a GC will be triggered by the call, hence the while loop and the byte array allocation to increase memory pressure.

import java.lang.ref.Cleaner;

class CleanerExample {

    private static final Cleaner cleaner = Cleaner.create();

    private static volatile boolean cleaned;

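    CleanerExample() {
        // The action only writes a static field, so it does not capture `this`;
        // capturing `this` would keep the object reachable and prevent cleaning.
        cleaner.register(this, () -> cleaned = true);
    }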

    public static void main(String[] args) throws Exception {
        new CleanerExample();
        while (!cleaned) {
            var waste = new byte[512 * 1024 * 1024];
            System.gc();
            Thread.sleep(1000);
            System.out.println("Waiting to be cleaned...");
        }
        System.out.println("Cleaned");
    }
}
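
The PhantomReference route mentioned in the Javadoc can be sketched in a similar way. This is a rough sketch under my own assumptions, not an official recipe: PhantomTracker, track and drain are hypothetical names. It keeps the reference objects in a map outside the referents, since a reference that becomes unreachable itself is never enqueued, and it drains the queue with the non-blocking poll() instead of parking a dedicated thread on remove().

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper; not a JDK or library class.
class PhantomTracker {

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();

    // Holds the references strongly, outside the referents, so they stay
    // reachable until they have been enqueued and processed.
    private final Map<Reference<?>, Runnable> pending = new ConcurrentHashMap<>();

    Reference<Object> track(Object referent, Runnable action) {
        // The action must not reference the referent, or the referent
        // never becomes unreachable.
        PhantomReference<Object> ref = new PhantomReference<>(referent, queue);
        pending.put(ref, action);
        // Keep the referent alive until the action has been recorded.
        Reference.reachabilityFence(referent);
        return ref;
    }

    // Non-blocking: runs the actions for every reference enqueued so far
    // and returns how many actions were run.
    int drain() {
        int ran = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            Runnable action = pending.remove(ref);
            if (action != null) {
                action.run();
                ran++;
            }
        }
        return ran;
    }

    public static void main(String[] args) {
        PhantomTracker tracker = new PhantomTracker();
        Object resource = new Object();
        Reference<Object> ref = tracker.track(resource, () -> System.out.println("cleaned"));
        // Simulate what the GC does once the referent is unreachable.
        ref.enqueue();
        System.out.println("actions run: " + tracker.drain());
    }
}
```

drain() could be called from an existing scheduler or at the start of each allocation, trading cleanup latency for not having a blocked thread. If a reference is stored only in a field of its own referent, the two become unreachable together and the reference is never enqueued, which may be relevant to the _4 and _5 classes in the question.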