Error calling GET /3/Jobs: H2O model training error on large data


I am trying to build a model on a large dataset (2 million transaction records) and am getting the error below. There is no progress shown in the model-building progress bar, and after some time the job stops with the error below. We are running this on a single node; H2O is not distributed. Please suggest whether this is related to a memory issue. For example, if we have 20 GB of training data, how much memory (heap size) should be given to H2O? Is the complete training frame stored in heap memory?

Error fetching job '$03010a010d6832d4ffffffff$_9bf0e32df1dba1c2d24eb8a513f47a4'
Error calling GET /3/Jobs/%2403010a010d6832d4ffffffff%24_9bf0e32df1dba1c2d24eb8a513f47a4
HTTP connection failure: status=error, code=503, error=Service Temporarily Unavailable

Thanks Deepti

1 Answer

It is possible that the H2O cluster went down due to an out-of-memory condition and your client lost communication with it. You would need to review the H2O logs to determine the actual error and its cause.
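As a starting point, the node logs can be scanned for the usual JVM memory-exhaustion messages. The sketch below is only illustrative: the log path and the exact message wording depend on how H2O was launched (see the `-log_dir` option), and `find_oom_lines` is a hypothetical helper, not part of the H2O API.

```python
import os
import re
import tempfile

# Common JVM messages that suggest the node died from memory pressure.
OOM_PATTERNS = re.compile(
    r"OutOfMemoryError|Java heap space|GC overhead limit exceeded",
    re.IGNORECASE,
)

def find_oom_lines(log_path):
    """Return log lines that suggest the H2O node ran out of memory."""
    with open(log_path) as f:
        return [line.rstrip("\n") for line in f if OOM_PATTERNS.search(line)]

# Demonstration against a sample log file (real logs live wherever
# your H2O node writes them):
sample = tempfile.NamedTemporaryFile("w", suffix=".log", delete=False)
sample.write("INFO: building model\n")
sample.write("WARN: java.lang.OutOfMemoryError: Java heap space\n")
sample.close()

print(find_oom_lines(sample.name))
os.unlink(sample.name)
```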

A general rule of thumb is to give the cluster about 4x the memory of your dataset size; see the H2O docs. In your case, a 20 GB training frame would need roughly 80 GB of heap to handle data manipulation and modeling.
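A minimal sketch of that sizing rule, assuming the ~4x heuristic from the docs (the multiplier is a rule of thumb, not a guarantee, and `recommended_heap_gb` is a hypothetical helper):

```python
def recommended_heap_gb(dataset_gb, multiplier=4):
    """Suggest a Java heap size (in GB) for a single-node H2O cluster,
    using the ~4x-of-dataset rule of thumb."""
    return dataset_gb * multiplier

print(recommended_heap_gb(20))  # 20 GB of training data -> 80
```

The heap is then set when the cluster starts, for example from Python with `h2o.init(max_mem_size="80G")`, or when launching the JAR directly with `java -Xmx80g -jar h2o.jar`. If the single node cannot be given that much memory, the usual alternatives are a multi-node cluster or training on a sample of the data.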