YARN CLI application -status: cumulative resources drop to zero for a running application


I wrote a Python script to gather the cumulative memory-seconds and vcore-seconds allocated to a (Spark) application.

The script polls yarn application -status <application_id> every n seconds and parses the output, which is then graphed.
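For reference, the polling loop is roughly equivalent to the sketch below. It is a minimal approximation, not the exact script. It assumes the status output contains a line of the form "Aggregate Resource Allocation : N MB-seconds, M vcore-seconds", which may differ between Hadoop versions. The names parse_aggregate and poll are hypothetical.

```python
import re
import subprocess
import time

# Assumed line format in the CLI output (may vary by Hadoop version), e.g.:
#   Aggregate Resource Allocation : 123456 MB-seconds, 789 vcore-seconds
AGGREGATE_RE = re.compile(r"(\d+)\s+MB-seconds,\s*(\d+)\s+vcore-seconds")


def parse_aggregate(status_output):
    """Return (mb_seconds, vcore_seconds), or None if the line is not found."""
    match = AGGREGATE_RE.search(status_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def poll(application_id, interval_seconds, samples):
    """Collect (timestamp, mb_seconds, vcore_seconds) tuples from the YARN CLI."""
    readings = []
    for _ in range(samples):
        result = subprocess.run(
            ["yarn", "application", "-status", application_id],
            capture_output=True,
            text=True,
        )
        parsed = parse_aggregate(result.stdout)
        if result.returncode != 0 or parsed is None:
            # Record a failed or unparseable poll as None rather than 0,
            # so it cannot be mistaken for a reported value of zero.
            readings.append((time.time(), None, None))
        else:
            readings.append((time.time(), *parsed))
        time.sleep(interval_seconds)
    return readings
```

Recording failed polls as None rather than 0 at least makes it possible to tell a CLI failure apart from a zero that YARN actually reported.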

In a lower environment with few other jobs running, I see memory-seconds and vcore-seconds increase for the duration of the application.

In a higher environment with many more jobs running, I see the cumulative memory-seconds and vcore-seconds regularly drop to zero. When graphed, both values hover around zero with a few large spikes.

The job takes about the same time to run in both environments.

I can only assume that either:

  1. Polling the YARN CLI is not a reliable way to gather application stats.
  2. Something else is going on, such as the cumulative stats being reset when YARN withdraws resources or pauses execution. If YARN were withdrawing resources or pausing execution, though, I would expect the job to take longer, which was not the case.
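Since a cumulative counter should never decrease, the drops can be located programmatically and lined up against scheduler or application logs. The helper below is a small sketch. The name find_resets is hypothetical, and the sample values in the usage line are made up for illustration, not measured data.

```python
def find_resets(values):
    """Return the indices where a cumulative series is lower than the previous sample."""
    return [i for i in range(1, len(values)) if values[i] < values[i - 1]]


# Illustrative, made-up series: the value falls back to zero twice.
print(find_resets([0, 100, 250, 0, 0, 900, 0]))
```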

Has anyone seen this before? Any help would be appreciated.
