How to resolve random crashes with Cloudera libhdfs.so?

I am getting SIGSEGV crashes in libjvm.so when using Cloudera's libhdfs.so. Judging by the stack traces, they occur at random points, but most commonly inside the JVM-internal function Monitor::wait().

Any suggestions would be greatly appreciated.

Environment:

  • CentOS 6.7
  • Linux 2.6.32-573.7.1.el6.x86_64 #1 SMP Tue Sep 22 22:00:00 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
  • GNU libc 2.12
  • java version "1.7.0_79"
  • Java(TM) SE Runtime Environment (build 1.7.0_79-b15)
  • Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)
  • Hadoop 2.6.0-cdh5.11.0

I have tried Java 1.8, but it didn't help. Moving to anything newer than CentOS 6.7 is not an option.

GDB:

Program terminated with signal 11, Segmentation fault.
#0  0x00007f862f63b0f3 in Monitor::wait(bool, long, bool) () from /usr/java/jdk1.7.0_79/jre/lib/amd64/server/libjvm.so
(gdb) bt
#0  0x00007f862f63b0f3 in Monitor::wait(bool, long, bool) () from /usr/java/jdk1.7.0_79/jre/lib/amd64/server/libjvm.so
#1  0x00007f862f8022f4 in VMThread::execute(VM_Operation*) () from /usr/java/jdk1.7.0_79/jre/lib/amd64/server/libjvm.so
#2  0x00007f862f1600d8 in BiasedLocking::revoke_and_rebias(Handle, bool, Thread*) () from /usr/java/jdk1.7.0_79/jre/lib/amd64/server/libjvm.so
#3  0x00007f862f7783d2 in ObjectSynchronizer::fast_enter(Handle, BasicLock*, bool, Thread*) () from /usr/java/jdk1.7.0_79/jre/lib/amd64/server/libjvm.so
#4  0x00007f862f4520a1 in InterpreterRuntime::monitorenter(JavaThread*, BasicObjectLock*) () from /usr/java/jdk1.7.0_79/jre/lib/amd64/server/libjvm.so
#5  0x00007f8628dbaba0 in ?? ()
#6  0x00007ffe672c1290 in ?? ()
#7  0x00007f8628dbab6c in ?? ()
#8  0x00000007aeae5610 in ?? ()
#9  0x00000007aeae5610 in ?? ()
#10 0x00007ffe672c1240 in ?? ()
#11 0x0000000702434dba in ?? ()
#12 0x00007ffe672c12b8 in ?? ()
#13 0x00000007024b38b0 in ?? ()
#14 0x0000000000000000 in ?? ()
(gdb) info threads
20 Thread 0x7f8624a6a700 (LWP 26482)  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:183
19 Thread 0x7f8624e6e700 (LWP 26417)  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:239
18 Thread 0x7f8626485700 (LWP 26409)  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:183
17 Thread 0x7f8627d93700 (LWP 26403)  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:183
16 Thread 0x7f8627e94700 (LWP 26402)  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:183
15 Thread 0x7f8627c92700 (LWP 26404)  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:183
14 Thread 0x7f8626788700 (LWP 26406)  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:183
13 Thread 0x7f8624d6d700 (LWP 26418)  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:239
12 Thread 0x7f8626687700 (LWP 26407)  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:183
11 Thread 0x7f86251b5700 (LWP 26414)  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:239
10 Thread 0x7f8626384700 (LWP 26410)  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:183
 9 Thread 0x7f862517c700 (LWP 26416)  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:183
 8 Thread 0x7f8626586700 (LWP 26408)  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:86
 7 Thread 0x7f8626283700 (LWP 26411)  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:183
 6 Thread 0x7f8624b6b700 (LWP 26420)  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:239
 5 Thread 0x7f8626182700 (LWP 26412)  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:239
 4 Thread 0x7f8624c6c700 (LWP 26429)  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:239
 3 Thread 0x7f8627f95700 (LWP 26401)  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:183
 2 Thread 0x7f8626889700 (LWP 26405)  0x00007f862f707f6e in SafepointSynchronize::begin() () from /usr/java/jdk1.7.0_79/jre/lib/amd64/server/libjvm.so
* 1 Thread 0x7f863057e720 (LWP 26396)  0x00007f862f63b0f3 in Monitor::wait(bool, long, bool) () from /usr/java/jdk1.7.0_79/jre/lib/amd64/server/libjvm.so

JStack:

----------------- 26405 -----------------
0x00007f862f707f6e  SafepointSynchronize::begin() + 0x25e
0x00007f862f80273f  VMThread::loop() + 0x1bf
0x00007f862f802bc0  VMThread::run() + 0x70
0x00007f862f679ca8  java_start(Thread*) + 0x108
----------------- 26396 -----------------
0x00007f862f63b0f3  Monitor::wait(bool, long, bool) + 0x313
0x00007f862f8022f4  VMThread::execute(VM_Operation*) + 0x324
0x00007f862f1600d8  BiasedLocking::revoke_and_rebias(Handle, bool, Thread*) + 0x178
0x00007f862f7783d2  ObjectSynchronizer::fast_enter(Handle, BasicLock*, bool, Thread*) + 0x42
0x00007f862f4520a1  InterpreterRuntime::monitorenter(JavaThread*, BasicObjectLock*) + 0xa1
0x00007f8628dbaba0  * java.lang.Thread.interrupt() bci:18 line:955 (Interpreted frame)
0x00007f8628da2058  * org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.close(boolean) bci:40 line:943 (Interpreted frame)
0x00007f8628da2058  * org.apache.hadoop.hdfs.DFSOutputStream.closeThreads(boolean) bci:5 line:2596 (Interpreted frame)
0x00007f8628da2058  * org.apache.hadoop.hdfs.DFSOutputStream.closeImpl() bci:142 line:2678 (Interpreted frame)
0x00007f8628da2058  * org.apache.hadoop.hdfs.DFSOutputStream.close() bci:28 line:2621 (Interpreted frame)
0x00007f8628da2058  * org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close() bci:11 line:72 (Interpreted frame)
0x00007f8628da2058  * org.apache.hadoop.fs.FSDataOutputStream.close() bci:4 line:106 (Interpreted frame)
0x00007f8628d9c4e7  <StubRoutines>
0x00007f862f458e95  JavaCalls::call_helper(JavaValue*, methodHandle*, JavaCallArguments*, Thread*) + 0x365
0x00007f862f4578f8  JavaCalls::call(JavaValue*, methodHandle, JavaCallArguments*, Thread*) + 0x28
0x00007f862f492384  jni_invoke_nonstatic(JNIEnv_*, JavaValue*, _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) + 0x2b4
0x00007f862f4a0ffc  jni_CallVoidMethodV + 0xec
Locked ownable synchronizers:
    - None