Vertex pipeline model training component stuck running forever because of metadata issue

470 Views Asked by Jan Zajac At 09 September 2022 at 09:23

I'm attempting to run a Vertex pipeline (custom model training) which I was able to run successfully in a different project. As far as I'm aware, all the pieces of infrastructure (service accounts, buckets, etc.) are identical.

The error appears in a gray box in the pipeline UI when I click on the model training component. It reads as follows:

Retryable error reported. System is retrying.
com.google.cloud.ai.platform.common.errors.AiPlatformException: code=ABORTED, message=Specified Execution `etag`: `1662555654045` does not match server `etag`: `1662555533339`, cause=null System is retrying.

I've looked in the Logs Explorer and found that the error logs are audit logs with the following fields attached:

protoPayload.methodName="google.cloud.aiplatform.internal.MetadataService.RefreshLineageSubgraph"

protoPayload.resourceName="projects/724306335858/locations/europe-west4/metadataStores/default"

This leads me to think that there's an issue with the Vertex metadata store or with the way my pipeline is using it. The audit logs are generated automatically though, so I'm not sure.
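For anyone who wants to pull up the same entries, the two audit-log fields can be combined into a single filter string. This is only a minimal sketch: `build_log_filter` is a hypothetical helper name, and it assumes the standard Cloud Logging query syntax in which clauses are joined with `AND`.

```python
def build_log_filter(method_name: str, resource_name: str) -> str:
    """Join two audit-log field matches into one Cloud Logging filter string.

    Hypothetical helper for illustration; the field names come from the
    audit log entries quoted in the question.
    """
    clauses = [
        f'protoPayload.methodName="{method_name}"',
        f'protoPayload.resourceName="{resource_name}"',
    ]
    return " AND ".join(clauses)


log_filter = build_log_filter(
    "google.cloud.aiplatform.internal.MetadataService.RefreshLineageSubgraph",
    "projects/724306335858/locations/europe-west4/metadataStores/default",
)
print(log_filter)
```

The resulting string can be pasted into the Logs Explorer query box, or passed as the filter argument to `gcloud logging read`.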

I've tried purging the metadata store as well as deleting it completely. I've also tried running a different model training pipeline that previously worked in a different project, but with no luck.



Prajna Rai T On 23 September 2022 at 14:46

The retryable error you were getting was a temporary issue, and it has now been resolved.

You should now be able to rerun the pipeline, and it is not expected to enter the infinite retry loop.
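For context, an `etag` mismatch reported as `ABORTED` is typically what an optimistic concurrency check looks like: a write carries the version the caller last read, and the server rejects it if the stored version has since changed. The toy model below only illustrates that general pattern. It is not Vertex AI's implementation, and `Store`, `EtagMismatch`, and the integer etags are all hypothetical.

```python
class EtagMismatch(Exception):
    """Raised when a write carries an etag that is no longer current."""


class Store:
    """Toy in-memory record guarded by an etag (illustration only)."""

    def __init__(self):
        self.value = None
        self.etag = 0

    def read(self):
        return self.value, self.etag

    def update(self, value, etag):
        # Reject the write if the caller's etag is stale.
        if etag != self.etag:
            raise EtagMismatch(
                f"Specified etag {etag} does not match server etag {self.etag}"
            )
        self.value = value
        self.etag += 1


store = Store()
_, stale_etag = store.read()          # both writers read etag 0
store.update("writer A", stale_etag)  # writer A succeeds, etag becomes 1

try:
    store.update("writer B", stale_etag)  # writer B still holds etag 0
    rejected = False
except EtagMismatch:
    rejected = True

# A retry only succeeds if it re-reads the current etag first.
_, fresh_etag = store.read()
store.update("writer B", fresh_etag)
```

In this model, a retry that keeps resending the same stale etag would fail every time, which is one way a retry loop can fail to terminate. Whether that is what happened in the service here is not something the answer above confirms.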
