I am using [org.clojure/clojure "1.10.1"], [org.clojure/core.async "1.2.603"] and the latest Amazon Corretto 11 JVM, in case any of them matters.
The following code is a simplified version of the code used in production, and it leaks memory. I have no idea why, but I suspect it might be due to the sub/unsub of channels. Can anyone point out where my code goes wrong or how I can fix the memory leak?
(ns test-gc.core
  (:require [clojure.core.async :as a :refer [chan put! close! <! go >! go-loop timeout]])
  (:import [java.util UUID]))

(def global-msg-ch (chan (a/sliding-buffer 200)))

(def global-msg-pub (a/pub global-msg-ch :id))

(defn io-promise []
  (let [id (UUID/randomUUID)
        ch (chan)]
    (a/sub global-msg-pub id ch)
    [id (go
          (let [x (<! ch)]
            (a/unsub global-msg-pub id ch)
            (:data x)))]))

(defn -main []
  (go-loop []
    (<! (timeout 1))
    (let [[pid pch] (io-promise)
          cmd {:id pid
               :data (rand-int 1E5)}]
      (>! global-msg-ch cmd)
      (println (<! pch)))
    (recur))
  (while true
    (Thread/yield)))
A quick heap dump gives the following statistics, for example:

Class by number of instances
- java.util.LinkedList: 5,157,128 (14.4%)
- java.util.concurrent.atomic.AtomicReference: 3,698,382 (10.3%)
- clojure.lang.Atom: 3,094,279 (8.6%)
- ...

Class by size of instances
- java.lang.Object[]: 210,061,752 B (13.8%)
- java.util.LinkedList: 206,285,120 B (13.6%)
- clojure.lang.Atom: 148,525,392 B (9.8%)
- clojure.core.async.impl.channels.ManyToManyChannel: 132,022,336 B (8.7%)
- ...
I finally figured out why. Looking at the source code of a/pub, we can see that mults stores an entry for every topic, hence it grows monotonically if we do not clear it. a/unsub only detaches the channel from its topic and leaves the topic entry itself in place. We may add something like (a/unsub-all* global-msg-pub pid) to fix that.
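A minimal sketch of that fix, assuming the same global-msg-ch and global-msg-pub setup as in the question. It uses a/unsub-all with a topic argument, which is the public wrapper around unsub-all*; the namespace name test-gc.fixed is made up for illustration.

```clojure
(ns test-gc.fixed
  (:require [clojure.core.async :as a :refer [chan go <!]])
  (:import [java.util UUID]))

(def global-msg-ch (chan (a/sliding-buffer 200)))
(def global-msg-pub (a/pub global-msg-ch :id))

(defn io-promise
  "Subscribes a fresh channel under a unique topic id and returns
  [id result-ch]. result-ch yields the :data of the first message
  published under that id."
  []
  (let [id (UUID/randomUUID)
        ch (chan)]
    (a/sub global-msg-pub id ch)
    [id (go
          (let [x (<! ch)]
            ;; a/unsub would only detach ch from the topic and keep the
            ;; topic entry inside the pub. Since every id is used exactly
            ;; once, drop the whole topic instead.
            (a/unsub-all global-msg-pub id)
            (:data x)))]))
```

Since each UUID topic has exactly one subscriber here, removing the whole topic is safe. With several subscribers per topic, one would probably want to call a/unsub-all only after the last of them is done.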