I've got a situation where I watch a specific directory for filesystem changes. If a certain file in that directory is changed, I re-read it, attach some existing cached information, and store it in an atom.
The relevant code looks like
(def posts (atom []))

(defn load-posts! []
  (swap!
   posts
   (fn [old]
     (vec
      (map #(let [raw (json/parse-string % (fn [k] (keyword (.toLowerCase k))))]
              (<snip some processing of raw, including getting some pieces from old>))
           (line-seq (io/reader "watched.json")))))))
;; elsewhere, inside of -main
(watch/start-watch
 [{:path "resources/"
   :event-types [:modify]
   :callback (fn [event filename]
               (when (and (= :modify event) (= "watched.json" filename))
                 (println "Reloading posts.json ...")
                 (posts/load-posts!)))}
  ...])
This ends up working fine locally, but when I deploy it to my server, the swap! call hangs about half-way through.
I've tried debugging it via println, which told me:
- The filesystem trigger is being fired.
- swap! is not running the function more than once.
- The watched file is being opened and parsed.
- Some entries from the file are being processed, but that processing stops at entry 111 (which doesn't seem to be significantly different from any preceding entries).
- The update does not complete, and the old value of that atom is therefore preserved.
- No filesystem events are fired after this one hangs.
I suspect that this is either a memory issue somewhere, or possibly a bug in Clojure-Watch (or the underlying FS-watching library).
Any ideas how I might go about fixing it or diagnosing it further?
The hang is caused by an error being thrown inside of the function passed as :callback to watch/start-watch. The root cause in this case is that the modified file is being copied to the server by scp, which is not atomic, so the first event triggers before the copy is complete, and that is what causes the JSON parse error to be thrown. This is exacerbated by the fact that watch/start-watch fails silently if its :callback throws any kind of error.

The solutions here are:

1. Use rsync to copy files. It does copy atomically, but it will not generate any :modify events on the target file, only on related temp files. Because of the way its atomic copy works, it will only signal :create events.
2. Wrap the :callback in a try/catch, and have the catch clause return the old value of the atom. This will cause load-posts! to run multiple times, but the last run will happen once the file copy has completed, which should finally do the right thing. (A sketch combining both fixes follows this list.)

(I've done both, but either would have realistically solved the problem.)
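For concreteness, here's a minimal sketch of what those two fixes might look like applied to the start-watch config from the question; the exact logging is illustrative, not something the library requires:

(watch/start-watch
 [{:path "resources/"
   :event-types [:modify :create]   ;; rsync's atomic rename shows up as :create
   :callback (fn [event filename]
               (when (and (#{:modify :create} event)
                          (= "watched.json" filename))
                 (try
                   (println "Reloading posts.json ...")
                   (posts/load-posts!)
                   (catch Exception e
                     ;; a half-copied file fails to parse; log it and wait for
                     ;; the next event instead of silently killing the watcher
                     (println "Reload failed:" (.getMessage e))))))}])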
A third option would be using an FS-watching library that reports errors, such as Hawk or dirwatch (or possibly hara.io.watch? I haven't used any of these, so I can't comment).
Diagnosing this involved wrapping the :callback body in a try/catch that printed the caught exception, to see what was actually being thrown. Once that printed a JSON parsing error, it was pretty easy to form a theory of what was going wrong.
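Something along these lines (a minimal sketch; the helper name log-errors is just for illustration, and wrapping the callback body inline works just as well):

(defn log-errors
  "Wraps a watch callback so anything it throws gets printed
  instead of being silently swallowed."
  [callback]
  (fn [event filename]
    (try
      (callback event filename)
      (catch Exception e
        (println "Watch callback threw:" e)))))

;; then pass (log-errors (fn [event filename] ...)) as the :callback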