RSS/VSS keeps growing until all memory and swap on the machine are exhausted

Posted on 2024-09-15 04:50:00

We have a WebLogic 9.2 server with Java 1.5.0_16 on RHEL 5.3, on which we deploy a web service and an Alfresco content management system.

We ran it without problems for about 3 years on HP-UX 11i (11.23). A month ago we moved to Linux (RHEL 5.3), and from time to time (it has happened 3 times) we notice that the process starts to use more and more memory until all the memory and swap on the machine are exhausted.

The process still works fine and all the log files, including the GC log, look normal (as if nothing had happened).

Glance for process ID 25450:

B0000A Glance C.04.70.000 06:54:05 supra2 x86_64 Current Avg High
CPU Util SU | 2% 2% 2%
Disk Util D D | 97% 97% 97%
Mem Util U U | 98% 98% 98%
Swap Util U U | 60% 60% 60%
Resources PID: 25450, java PPID: 25394 euid: 664 User:afspr04
CPU Usage (util): 5.40 Total RSS : 40.6gb
User CPU : 3.60 Text VSS : 56kb
System CPU : 1.80 Data VSS : 66.1gb
Priority : 15 Stack VSS : 2.0mb
Nice Value : 0 Total VSS : 66.5gb
Blocked On : SLEEP
Major Faults : 235
Minor Faults : 164
Processor : 1
Argv1: weblogic.Server
Cmd : /opt/java1.5.0_16/bin/java -Dweblogic.Name=dmcmsserver -Doracle.net.tns_admin=/etc -server -javaagent:/opt/MercuryDiagn
ostics/JavaAgent/DiagnosticsAgent/lib/probeagent.jar -Dprobe.id=supra2_afspr04_dmcms_ear_p4 -Dprobe.group=CMS_SERVER -D
points.file.name=/opt/MercuryDiagnostics/JavaAgent/DiagnosticsAgent/etc/supra2_afspr04_dmcms_ear_p4 -Dcom.wily.introsco
pe.agent.agentName=DMCMS -Xms7g -Xmx7g -XX:PermSize=256m -XX:MaxPermSize=256m -XX:NewSize=1792m -XX:MaxNewSize=1792m -X
X:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:+DisableExplicitGC -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -Xnoclassg
c -Xloggc:logs/gc.log -Doracle.net.tns_admin=/etc -Dweblogic.Stderr=/app/afspr04/dmcms_ear_p4/dmcmsdomain/logs/online.l
og -Dweblogic.Stdout=/app/afspr04/dmcms_ear_p4/dmcmsdomain/logs/online.log -Damdocs.system.home=/app/afspr04/dmcms_ear_
p4/properties/jesi -Damdocs.messageHandling.home=/app/afspr04/dmcms_ear_p4/properties/jesi -Djesi.config.loader=amdocs.
ecommerce.esi.utils.config.InterfaceConfigXPathLoader -Damdocs.uams.config.resource=config/mvc/ldap ...

pmap shows the large allocations as anonymous mappings.
pmap output (largest mappings first):

25450: /opt/java1.5.0_16/bin/java -Dweblogic.Name=dmcmsserver -Doracle.net.tns_admin=/etc -server -javaagent:/opt/MercuryDiagnostics/JavaAgent/DiagnosticsAgent/lib/probeagent.jar -Dprobe.id=supra2_afspr04_dmcms_ear_p4 -Dprobe.group=CMS_SERVER -Dpoints.file.name=/opt/MercuryDiagnostics/JavaAgent/DiagnosticsAgent/etc/supra2_afspr04_dmcms_ear_p4 -Dcom.wily.introscope.agent.agentName=DMCMS -Xms7g -Xmx7g -XX:PermSize=256m -XX:MaxPermSize=256m -XX:NewSize=1792m -XX:MaxNewSize=1792m -XX:SurvivorRatio=4 -XX:TargetSurvivo
00002ab0f8000000    10518548    rwx--   [anon]
00002ab798009000    8388612 rwx--   [anon]
000000005fcce000    8038976 rwx--   [anon]
00002aac7aab0000    7602176 rwx--   [anon]
00002aaf74000000    5259284 rwx--   [anon]
00002ab688000000    4194308 rwx--   [anon]
00002aae4b930000    1684124 rwx--   [anon]
00002aab80000000    1314836 rwx--   [anon]
00002aab20000000    655376  rwx--   [anon]
00002aac28000000    532488  rwx--   [anon]
00002aac50000000    524292  rwx--   [anon]
00002aaaec000000    327696  rwx--   [anon]
00002aaad8000000    131088  rwx--   [anon]
00002ab658000000    131060  rwx--   [anon]
00002ab0dc000000    131044  rwx--   [anon]
00002aaacc2f5000    114708  rwx--   [anon]
...
total 69733292K 
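
To make a listing like the one above easier to compare between runs, a small script can total the anonymous mappings and list the largest ones. This is a minimal sketch, assuming the four-column layout shown above (address, size in KB, permissions, mapping name); other pmap versions may print different columns. The function names `parse_pmap` and `anon_summary` are made up for this example, and the sample is only the three largest lines from the listing.

```python
import re

# Matches lines such as "00002ab0f8000000    10518548    rwx--   [anon]".
# Assumed layout: hex address, size in KB, permissions, mapping name.
PMAP_LINE = re.compile(r"^([0-9a-fA-F]+)\s+(\d+)K?\s+(\S+)\s+(.*)$")


def parse_pmap(text):
    """Return a list of (address, size_kb, perms, name) tuples."""
    rows = []
    for line in text.splitlines():
        match = PMAP_LINE.match(line.strip())
        if match:
            addr, size_kb, perms, name = match.groups()
            rows.append((addr, int(size_kb), perms, name.strip()))
    return rows


def anon_summary(rows, top=5):
    """Return total anonymous KB and the `top` largest anonymous mappings."""
    anon = [row for row in rows if row[3] == "[anon]"]
    anon.sort(key=lambda row: row[1], reverse=True)
    return sum(row[1] for row in anon), anon[:top]


# Three lines copied from the pmap listing in the question.
sample = """\
00002ab798009000    8388612 rwx--   [anon]
00002ab0f8000000    10518548    rwx--   [anon]
000000005fcce000    8038976 rwx--   [anon]
"""

total_kb, largest = anon_summary(parse_pmap(sample))
print(f"anonymous total: {total_kb} KB")
for addr, size_kb, perms, _name in largest:
    print(f"{addr}  {size_kb:>10} KB  {perms}")
```

Saving the output of `pmap <pid>` at intervals and running it through a script like this would show which anonymous regions are the ones that keep growing.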

Has anyone encountered something similar?

Thanks,
Oz


Comments (2)

∞梦里开花 2024-09-22 04:50:00

What are the CPU and RAM of the server you are using? You should consult the RHEL compatibility matrix for WLS 9.2 and make sure your JDK/CPU configuration is a supported combination. Also, you might want to consider JRockit as your JVM, if that's an option for you. Finally, try lowering the maximum heap size (-Xmx and -Xms) and see whether the server is more stable.

顾挽 2024-09-22 04:50:00

We have the same kind of problem here on a different OS (Sun Solaris 10, 32-bit), but I see a common point: Introscope.

We suspected it of allocating too much memory (a memory leak?), as it uses a native library (a *.so accessed through JNI).

To make my point, I need to clarify something about the JVM process's memory: the whole memory of the Java process is split into two different parts, the native part and the Java part.

The memory for the Java part (the one managed by the garbage collector) can be monitored through the standard JVM API. Just remember that, from within Java, you can only monitor this part of the JVM process's memory. It contains the young generation (eden and 2 survivor spaces), the old generation, and the permanent generation. This part of the memory is usually the biggest one, which is why there are ways to monitor it, while there are none for the rest.
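
As a rough illustration of watching the GC-managed part from outside the process, the sketch below parses one sample of `jstat -gc <pid>` output (jstat ships with the JDK) and sums the used and current capacity of the survivor, eden, old, and permanent spaces. The column names (S0C, S0U, EC, EU, OC, OU, PC, PU, all in KB) are the ones printed by JDK 5 to 7. The sample values are invented for illustration, sized to match -Xmx7g, -XX:NewSize=1792m and -XX:MaxPermSize=256m from the question, and the function names are hypothetical.

```python
def parse_jstat_gc(text):
    """Map each jstat column name to the value on the last data line."""
    lines = [line for line in text.splitlines() if line.strip()]
    header, values = lines[0].split(), lines[-1].split()
    return dict(zip(header, (float(v) for v in values)))


def java_managed_kb(stats):
    """Return (used_kb, capacity_kb) summed over the GC-managed spaces."""
    used_cols = ("S0U", "S1U", "EU", "OU", "PU")
    capacity_cols = ("S0C", "S1C", "EC", "OC", "PC")
    used = sum(stats.get(col, 0.0) for col in used_cols)
    capacity = sum(stats.get(col, 0.0) for col in capacity_cols)
    return used, capacity


# Invented sample in the JDK 5-7 `jstat -gc` layout (values in KB).
sample = """\
 S0C    S1C    S0U    S1U      EC       EU        OC         OU       PC     PU    YGC     YGCT    FGC    FGCT     GCT
305792.0 305792.0  0.0   1024.0 1223424.0 500000.0 5505024.0 3000000.0 262144.0 100000.0    120    4.500     3    1.200    5.700
"""

used_kb, capacity_kb = java_managed_kb(parse_jstat_gc(sample))
print(f"GC-managed used: {used_kb:.0f} KB, capacity: {capacity_kb:.0f} KB")
```

If these figures stay flat while the process RSS keeps climbing, the growth is happening outside the part the garbage collector manages.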

The rest of the process's memory, the native part, is different. It is made up of network sockets and buffers, file descriptors and buffers, the GC's own data structures and buffers, the buffers of native libraries, the native code produced by the JIT compiler, and some other JVM-internal structures. It also holds the executable code of the JVM and of the native libraries. There is usually no standard way (often no way at all) to look into this part, except with a debugger.
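
The split can be put into numbers with the figures from the question. The sketch below takes -Xmx plus -XX:MaxPermSize as an upper bound on the GC-managed part and subtracts it from the pmap total; everything left over (thread stacks, JIT code, native library allocations, and so on) falls on the native side. It is only an estimate, since pmap reports mapped address space rather than resident memory, and the helper names are made up for this example.

```python
def size_to_kb(value):
    """Convert a JVM size string such as '7g' or '256m' to KB."""
    factor = {"k": 1, "m": 1024, "g": 1024 * 1024}[value[-1].lower()]
    return int(value[:-1]) * factor


def java_managed_kb(flags):
    """Upper bound on GC-managed memory: max heap plus max PermGen."""
    heap = perm = 0
    for flag in flags:
        if flag.startswith("-Xmx"):
            heap = size_to_kb(flag[len("-Xmx"):])
        elif flag.startswith("-XX:MaxPermSize="):
            perm = size_to_kb(flag.split("=", 1)[1])
    return heap + perm


# Flags and pmap total copied from the question.
flags = ["-Xms7g", "-Xmx7g", "-XX:PermSize=256m", "-XX:MaxPermSize=256m"]
pmap_total_kb = 69733292

managed_kb = java_managed_kb(flags)
other_kb = pmap_total_kb - managed_kb
print(f"GC-managed upper bound: {managed_kb} KB")
print(f"everything else:        {other_kb} KB")
```

The upper bound works out to 7602176 KB, which happens to equal the size of one of the anonymous mappings in the pmap listing, so that mapping is plausibly the reserved heap plus PermGen. The remaining mappings, roughly 59 GB, would then be native.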

After we asked C&A about the native library used by Wily / Introscope, they explained to us that:

  • it allocates memory dynamically;
  • there is no way to limit its memory consumption;
  • there is no way to predict its memory consumption;
  • Wily uses it only to collect measurements specific to the underlying system (e.g. OS flags, CPU load, total free memory, number of processes, ...), as Introscope uses the Java Agent API for everything else.

For 99% of the applications, the "native" part of the memory (the non-Java part) is negligible compared with the Java part.

But here, with Introscope in play, things are different: the native part may grow arbitrarily large and eat up the process's memory space until it hits the limit.

We concluded that those system-specific values are not very interesting for us (and I think this is the case for many of you, as there are other ways to get them: mem, free, top, Task Manager, ...), so we simply decided to remove it.

I believe this is the best option.

Try it and tell us whether it solves your memory problem.
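
One way to check whether removing the agent makes a difference is to record the resident set size of the Java process at intervals, before and after the change, and compare the growth rate. The sketch below is Linux-specific: it assumes the `VmRSS:` line of `/proc/<pid>/status`, so it would not apply as-is to Solaris or HP-UX. The function names are hypothetical.

```python
import re


def vmrss_kb(status_text):
    """Extract the VmRSS value (in KB) from /proc/<pid>/status text."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_vmrss_kb(pid):
    """Read the current resident set size of a process on Linux."""
    with open(f"/proc/{pid}/status") as handle:
        return vmrss_kb(handle.read())


def growth_kb_per_hour(samples):
    """Average RSS growth between the first and last sample.

    `samples` is a list of (unix_seconds, rss_kb) pairs, oldest first.
    """
    (t0, rss0), (t1, rss1) = samples[0], samples[-1]
    if t1 <= t0:
        return 0.0
    return (rss1 - rss0) * 3600.0 / (t1 - t0)
```

Calling `read_vmrss_kb(pid)` from a cron job or a loop, together with `time.time()`, yields the samples. A growth rate that stays clearly positive over many hours while the GC log looks normal points at native memory rather than the Java heap.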
