At company T, WebLogic instances were dying abruptly because of JVM crashes.
The environment is as follows.
===============================
SunOS 5.11 11.3 sun4v sparc sun4v
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
WebLogic Version 12.2.1.2.0
===============================
* Analysis of the hs_err_pid.log file generated when the server died
1. hs_err_pid.log
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0xffffffff7ec82444, pid=31071, tid=0x0000000000000078
#
# JRE version: Java(TM) SE Runtime Environment (8.0_121-b13) (build 1.8.0_121-b13)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.121-b13 mixed mode solaris-sparc compressed oops)
# Problematic frame:
# C [libc.so.1+0x182444] memcpy%sun4v-hwcap3+0x990
#
# Core dump written. Default location: /tp/wl_domains/tp_domain_12c/core or core.31071
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
> SIGSEGV: a segmentation fault. The OS appears to have forcibly terminated the process because of an invalid memory access.
The hs_err file also shows that the crash occurred in the following stack.
Stack: [0xffffffff26a00000,0xffffffff26b00000], sp=0xffffffff26afb4c0, free space=1005k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C [libc.so.1+0x182444] memcpy%sun4v-hwcap3+0x990
C [libzip.so+0x15eac] ZIP_GetEntry2+0x100
C [libzip.so+0x358c] Java_java_util_zip_ZipFile_getEntry+0x94
J 105 java.util.zip.ZipFile.getEntry(J[BZ)J (0 bytes) @ 0xffffffff69fe0724 [0xffffffff69fe05e0+0x144]
J 8970 C2 weblogic.utils.classloaders.ZipClassFinder.getSource(Ljava/lang/String;)Lweblogic/utils/classloaders/Source; (72 bytes) @ 0xffffffff6bcb995c [0xffffffff6bcb9880+0xdc]
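When several instances crash, it helps to pull the same few fields out of each hs_err file instead of reading them by eye. The following is only a minimal sketch: the function name `summarize_hs_err` and the regular expressions are my own, written against the log lines shown above, and other JVM versions or platforms may format these lines differently.

```python
import re

# Sample lines copied from the hs_err log in this post.
HS_ERR_SAMPLE = """\
# SIGSEGV (0xb) at pc=0xffffffff7ec82444, pid=31071, tid=0x0000000000000078
# Problematic frame:
# C [libc.so.1+0x182444] memcpy%sun4v-hwcap3+0x990
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C [libc.so.1+0x182444] memcpy%sun4v-hwcap3+0x990
C [libzip.so+0x15eac] ZIP_GetEntry2+0x100
C [libzip.so+0x358c] Java_java_util_zip_ZipFile_getEntry+0x94
"""

# Header line such as: "# SIGSEGV (0xb) at pc=0x..., pid=31071, ..."
SIGNAL_RE = re.compile(
    r"^#\s+(SIG[A-Z]+)\s+\((0x[0-9a-f]+)\)\s+at pc=(0x[0-9a-f]+), pid=(\d+)",
    re.M,
)
# Native frame line such as: "C [libzip.so+0x15eac] ZIP_GetEntry2+0x100"
FRAME_RE = re.compile(r"^C\s+\[(\S+?)\+(0x[0-9a-f]+)\]\s+(\S+)", re.M)


def summarize_hs_err(text):
    """Return the signal, pid, and native (C) frames found in hs_err text."""
    sig = SIGNAL_RE.search(text)
    frames = [
        {"library": lib, "offset": off, "symbol": sym}
        for lib, off, sym in FRAME_RE.findall(text)
    ]
    return {
        "signal": sig.group(1) if sig else None,
        "pid": int(sig.group(4)) if sig else None,
        "native_frames": frames,
        # True when any native frame belongs to libzip, as in this crash.
        "zip_related": any(f["library"].startswith("libzip") for f in frames),
    }


summary = summarize_hs_err(HS_ERR_SAMPLE)

if __name__ == "__main__":
    print(summary["signal"], summary["pid"], summary["zip_related"])
    for frame in summary["native_frames"]:
        print(frame["library"], frame["symbol"])
```

A `zip_related` result of true only says that a libzip frame is on the stack. It is a hint for where to look next, not a diagnosis.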
> Searching the Oracle documentation for this stack turned up the following document.
- Java Errors at ZIP_GetEntry (Doc ID 2394115.1)
Cause
This is a known issue, caused by application code that fails to synchronize access to a jar file. Most crashes in ZIP_GetEntry occur when a jar file being accessed is modified or overwritten while the JVM instance is running. Because the HotSpot JVM maps each jar file's central directory structure into memory using mmap, when a jar file is modified or overwritten on the disk, the JVM's copy of the data becomes inconsistent with the jar file on the disk. Any subsequent attempt to read and load entries from the modified jar file can result in this type of crash. See Oracle blog post "Crashes in ZIP_GetEntry" for details.
Solution
A few solution options:
1. The most effective solution is to improve the application code to avoid modifying jar files while they are in use.
2. Upgrade to Java SE 9 or later versions.
An enhancement is included in Java SE 9 and later versions, which uses a ZipFile implementation that does not store the jar file in memory, thereby avoiding the problem that causes most of these types of crashes. See JDK-8145260 - To bring j.u.z.ZipFile's native implementation to Java to remove the expensive jni cost and mmap crash risk.
3. Alternatively, use the following workaround:
Add the -Dsun.zip.disableMemoryMapping=true system property to your 'java' command line arguments.
This flag disables the mmap of the jar file, allowing you to avoid most crashes caused by buggy application code that erroneously modifies jar files while they are in use.
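For option 1, one common way to avoid modifying a jar that is in use is to never write into the existing file at all, but to stage the new jar next to it and rename it over the old name. The sketch below assumes a POSIX filesystem and assumes that a deployment script (not shown in the Oracle document) is what replaces the jar; `replace_jar_atomically` is a hypothetical helper name, not part of WebLogic or the JDK.

```python
import os
import shutil
import tempfile


def replace_jar_atomically(new_jar, target_jar):
    """Swap target_jar for new_jar without rewriting the existing file in place."""
    target_dir = os.path.dirname(os.path.abspath(target_jar))
    # Stage the copy in the target directory so the rename stays on one filesystem.
    fd, staged = tempfile.mkstemp(dir=target_dir, suffix=".jar.tmp")
    os.close(fd)
    try:
        # shutil.copy copies the contents and the permission bits of new_jar.
        shutil.copy(new_jar, staged)
        # os.replace renames over the target name; the old file's data blocks
        # are not overwritten, so existing mappings of the old file stay intact.
        os.replace(staged, target_jar)
    except BaseException:
        # Clean up the staged file if the copy or the rename failed.
        if os.path.exists(staged):
            os.remove(staged)
        raise


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    current = os.path.join(workdir, "app.jar")       # hypothetical file names
    incoming = os.path.join(workdir, "app-new.jar")
    with open(current, "wb") as f:
        f.write(b"old contents")
    with open(incoming, "wb") as f:
        f.write(b"new contents")
    replace_jar_atomically(incoming, current)
    with open(current, "rb") as f:
        print(f.read())
```

With this approach a running JVM keeps reading the old file until it is restarted or the application is redeployed, so it only removes the in-place overwrite. It does not make the new jar visible to a running server.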
==========================================================================
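2. Cause and solution
According to the document, the crash occurs mainly when a jar file is modified while the server is running.
The fix is included in JDK 9 and later but not in JDK 8, so on JDK 8 the workaround has to be applied.
As the document describes, add the following JVM option: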
-Dsun.zip.disableMemoryMapping=true
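After restarting the instance it is worth confirming that the option actually reached the JVM. The following is a small sketch that checks a java command line string for the exact flag. How that string is obtained (process listing, server start log, and so on) is left open here, and `has_zip_mmap_workaround` is a hypothetical helper name.

```python
import shlex

# The workaround flag exactly as given in the Oracle document.
WORKAROUND_FLAG = "-Dsun.zip.disableMemoryMapping=true"


def has_zip_mmap_workaround(java_command_line):
    """Return True if the exact workaround flag appears among the arguments."""
    # shlex.split handles quoted arguments the way a POSIX shell would.
    return WORKAROUND_FLAG in shlex.split(java_command_line)


if __name__ == "__main__":
    # Illustrative command line, not taken from the actual server.
    sample = "java -server -Dsun.zip.disableMemoryMapping=true weblogic.Server"
    print(has_zip_mmap_workaround(sample))
```

The check is deliberately strict: it only matches the flag spelled exactly as above, so a typo in the property name is reported as missing.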