I Just want to understand how we should plan for the capacity of a NiFi instance.
We have a NiFi instance which is having around 500 flows. So, the total number of processors enabled on NiFi canvas is around 4000. We do run 2-5 flows simultaneously which does not take more than half an hour i.e. we do process data in MBs.
It was working fine till now but we are seeing outofMemory error very often. So we increased xms and xmx parameters from 4g to 8g which has resolved the problem for now. But going forward we will have more flows and we may face outofmemory issue again.
So, can anyone help with matrix of capacity planning or any suggestion to avoid such issues before happening? eg:- If we have 3000 processors enabled with/without any processing then Xg amount memory required.
Any input on NiFi capacity planning would be appreciated.
Thanks in Advance.
OOM errors can occur due to specific memory consuming processors. For example: SplitXML is loading your whole record to memory, so it could load a 1GiB file for instance.
Each processors can document what resource considerations should be taken. All of the Apache processors(as far as I can tell) are documented in that matter so you can rely on them.
In our example, by the way,
SplitXMLcan be replaced with SplitRecord which doesn't load all of the record to memory.So even if you use 1000 processors simultaneously, they might not consume as much memory as one processor that loads your whole FlowFile's content to memory.
Check which processors you are using and make sure you don't use one like that(there are more like this one that load the whole document to memory).