I am seeing the below error when I submit an Oozie job in an EMR Hadoop cluster. I could see that a certain container is not finding the temporary output file that is produced by another job (?).
I verified the name node address, and also verified there is enough memory on it (attached additional EBS volume). I am using Master instance type - m5.2xlarge and core instance type - r5a.2xlarge.
This is the error I see in the Application Master logs.
Application application_1632399471753_0051 failed 2 times due to AM Container for appattempt_1632399471753_0051_000002 exited with exitCode: -1000
Failing this attempt.Diagnostics: java.io.FileNotFoundException: File does not exist: hdfs://ip-10-23-31-119.us-west-2.compute.internal:8020/tmp/crunch-1324412807/p2/MAP
For more detailed output, check the application tracking page: http://ip-10-23-31-119.us-west-2.compute.internal:8088/cluster/app/application_1632399471753_0051 Then click on links to logs of each attempt.
. Failing the application.
This is the error I see when I go into one of the task logs.
2021-09-23 13:19:43,280 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1632399471753_0051Job Transitioned from INITED to SETUP
2021-09-23 13:19:43,282 INFO [CommitterEvent Processor #0] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing the event EventType: JOB_SETUP
2021-09-23 13:19:43,294 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1632399471753_0051Job Transitioned from SETUP to RUNNING
2021-09-23 13:19:43,309 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1632399471753_0051_m_000000 Task Transitioned from NEW to SCHEDULED
2021-09-23 13:19:43,311 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1632399471753_0051_m_000000_0 TaskAttempt Transitioned from NEW to UNASSIGNED
2021-09-23 13:19:43,311 INFO [Thread-85] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: mapResourceRequest:<memory:7168, vCores:1>
2021-09-23 13:19:43,380 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Event Writer setup for JobId: job_1632399471753_0051, File: hdfs://ip-10-23-31-119.us-west-2.compute.internal:8020/tmp/hadoop-yarn/staging/hadoop/.staging/job_1632399471753_0051/job_1632399471753_0051_1.jhist
2021-09-23 13:19:44,270 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:0 ScheduledMaps:1 ScheduledReds:0 AssignedMaps:0 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:0 ContRel:0 HostLocal:0 RackLocal:0
2021-09-23 13:19:44,298 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for application_1632399471753_0051: ask=1 release= 0 newContainers=0 finishedContainers=0 resourcelimit=<memory:14336, vCores:10> knownNMs=2
2021-09-23 13:19:45,306 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Got allocated containers 1
2021-09-23 13:19:45,308 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Assigned container container_1632399471753_0051_01_000002 to attempt_1632399471753_0051_m_000000_0
2021-09-23 13:19:45,309 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:1 AssignedReds:0 CompletedMaps:0 CompletedReds:0 ContAlloc:1 ContRel:0 HostLocal:0 RackLocal:0
2021-09-23 13:19:45,362 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-jar file on the remote FS is hdfs://ip-10-23-31-119.us-west-2.compute.internal:8020/tmp/hadoop-yarn/staging/hadoop/.staging/job_1632399471753_0051/job.jar
2021-09-23 13:19:45,364 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: The job-conf file on the remote FS is /tmp/hadoop-yarn/staging/hadoop/.staging/job_1632399471753_0051/job.xml
2021-09-23 13:19:45,367 FATAL [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.FileNotFoundException: File does not exist: hdfs://ip-10-23-31-119.us-west-2.compute.internal:8020/tmp/crunch-1324412807/p2/MAP
at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createCommonContainerLaunchContext(TaskAttemptImpl.java:902)
at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createContainerLaunchContext(TaskAttemptImpl.java:947)
at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1714)
at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl$ContainerAssignedTransition.transition(TaskAttemptImpl.java:1691)
at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1210)
at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:147)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1459)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:1451)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.FileNotFoundException: File does not exist: hdfs://ip-10-23-31-119.us-west-2.compute.internal:8020/tmp/crunch-1324412807/p2/MAP
at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1444)
at org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1437)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1452)
at org.apache.hadoop.fs.FileSystem.resolvePath(FileSystem.java:774)
at org.apache.hadoop.mapreduce.v2.util.MRApps.parseDistributedCacheArtifacts(MRApps.java:601)
at org.apache.hadoop.mapreduce.v2.util.MRApps.setupDistributedCache(MRApps.java:491)
at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.createCommonContainerLaunchContext(TaskAttemptImpl.java:821)
... 14 more
2021-09-23 13:19:45,369 INFO [AsyncDispatcher ShutDown handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye..
End of LogType:syslog
I am not sure what other things I need to check and troubleshoot this issue. I don not see any logs being printed from my application code.