kafak主节点cpu 内存持续飙高,不回收,最后服务挂掉问题?
kafak主节点cpu 内存持续飙高,不回收,最后服务挂掉问题?
[2018-08-23 07:56:11,788] INFO [GroupCoordinator 0]: Stabilized group consumer.web.log generation 268 (__consumer_offsets-24) (kafka.coordinator.group.GroupCoordinator)
[2018-08-23 07:56:12,054] ERROR Processor got uncaught exception. (kafka.network.Processor)
java.lang.OutOfMemoryError: Java heap space
[2018-08-23 07:56:13,846] ERROR Processor got uncaught exception. (kafka.network.Processor)
java.lang.OutOfMemoryError: Java heap space
[2018-08-23 07:56:15,673] ERROR Processor got uncaught exception. (kafka.network.Processor)
java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:694)
at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)
at sun.nio.ch.Util.getTemporaryDirectBuffer(Util.java:241)
at sun.nio.ch.IOUtil.read(IOUtil.java:195)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at org.apache.kafka.common.network.PlaintextTransportLayer.read(PlaintextTransportLayer.java:104)
at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:145)
at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:93)
at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:235)
at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:196)
at org.apache.kafka.common.network.Selector.attemptRead(Selector.java:557)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:495)
at org.apache.kafka.common.network.Selector.poll(Selector.java:424)
at kafka.network.Processor.poll(SocketServer.scala:628)
at kafka.network.Processor.run(SocketServer.scala:545)
at java.lang.Thread.run(Thread.java:748)
[2018-08-23 07:56:16,379] ERROR Processor got uncaught exception. (kafka.network.Processor)
java.lang.OutOfMemoryError: Java heap space
java 1772 liandong 83u IPv4 7990697 0t0 TCP *:36145 (LISTEN)
java 1772 liandong 84u IPv4 7990698 0t0 TCP *:9099 (LISTEN)
java 1772 liandong 85u IPv4 7990701 0t0 TCP *:40745 (LISTEN)
java 1772 liandong 100u IPv4 7990709 0t0 TCP prod_data_kafka_2:44688->prod_data_zk:eforward (ESTABLISHED)
java 1772 liandong 193u IPv4 7989816 0t0 TCP prod_data_kafka_2:XmlIpcRegSvc (LISTEN)
java 1772 liandong 224u IPv4 8011001 0t0 TCP prod_data_kafka_2:9099->172.18.58.184:45220 (ESTABLISHED)
java 1772 liandong 228u IPv4 7990858 0t0 TCP prod_data_kafka_2:XmlIpcRegSvc->172.18.58.184:51326 (ESTABLISHED)
java 1772 liandong 229u IPv4 7990859 0t0 TCP prod_data_kafka_2:XmlIpcRegSvc->172.18.58.184:51334 (ESTABLISHED)
java 1772 liandong 230u IPv4 8013244 0t0 TCP prod_data_kafka_2:36145->172.18.58.184:44414 (ESTABLISHED)
java 1772 liandong 235u IPv4 7989829 0t0 TCP prod_data_kafka_2:32976->prod_data_kafka_1:XmlIpcRegSvc (ESTABLISHED)
java 1772 liandong 236u IPv4 8014128 0t0 TCP prod_data_kafka_2:36145->172.18.58.184:44484 (ESTABLISHED)
java 1772 liandong 237u IPv4 8013361 0t0 TCP prod_data_kafka_2:XmlIpcRegSvc->172.18.58.184:60202 (ESTABLISHED)
java 1772 liandong 243u IPv4 7998548 0t0 TCP prod_data_kafka_2:XmlIpcRegSvc->prod_data_kafka_3:39816 (ESTABLISHED)
java 1772 liandong 247u IPv4 7998555 0t0 TCP prod_data_kafka_2:33206->prod_data_kafka_3:XmlIpcRegSvc (ESTABLISHED)
java 1772 liandong 251u IPv4 7999481 0t0 TCP prod_data_kafka_2:XmlIpcRegSvc->prod_data_kafka_1:48914 (ESTABLISHED)
java 1772 liandong 252u IPv4 8007519 0t0 TCP prod_data_kafka_2:XmlIpcRegSvc->172.18.58.184:58936 (CLOSE_WAIT)
java 1772 liandong 253u IPv4 8012365 0t0 TCP prod_data_kafka_2:XmlIpcRegSvc->prod_data_kafka_2:33230 (CLOSE_WAIT)
java 1772 liandong 255u IPv4 8009660 0t0 TCP prod_data_kafka_2:XmlIpcRegSvc->172.18.58.184:59356 (ESTABLISHED)
java 1772 liandong 258u IPv4 8010614 0t0 TCP prod_data_kafka_2:XmlIpcRegSvc->172.18.58.184:59936 (CLOSE_WAIT)
1772 liandong 20 0 6398984 2.127g 16112 S 101.0 57.5 69:43.72 java
2018-08-22T19:01:45.014+0800: 31537.291: [GC pause (G1 Evacuation Pause) (young) (initial-mark), 0.0147456 secs]
[Parallel Time: 12.9 ms, GC Workers: 2]
[GC Worker Start (ms): Min: 31537291.3, Avg: 31537291.3, Max: 31537291.3, Diff: 0.0]
[Ext Root Scanning (ms): Min: 1.8, Avg: 1.9, Max: 1.9, Diff: 0.1, Sum: 3.8]
[Update RS (ms): Min: 1.9, Avg: 1.9, Max: 1.9, Diff: 0.0, Sum: 3.9]
[Processed Buffers: Min: 14, Avg: 15.5, Max: 17, Diff: 3, Sum: 31]
[Scan RS (ms): Min: 4.5, Avg: 4.5, Max: 4.5, Diff: 0.0, Sum: 9.0]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
[Object Copy (ms): Min: 4.1, Avg: 4.2, Max: 4.2, Diff: 0.1, Sum: 8.3]
[Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[Termination Attempts: Min: 4, Avg: 4.0, Max: 4, Diff: 0, Sum: 8]
[GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
[GC Worker Total (ms): Min: 12.5, Avg: 12.5, Max: 12.5, Diff: 0.0, Sum: 25.1]
[GC Worker End (ms): Min: 31537303.8, Avg: 31537303.8, Max: 31537303.8, Diff: 0.0]
[Code Root Fixup: 0.1 ms]
[Code Root Purge: 0.0 ms]
[Clear CT: 0.3 ms]
[Other: 1.4 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 0.2 ms]
[Ref Enq: 0.0 ms]
[Redirty Cards: 0.1 ms]
[Humongous Register: 0.0 ms]
[Humongous Reclaim: 0.1 ms]
[Free CSet: 0.6 ms]
[Eden: 781.0M(781.0M)->0.0B(781.0M) Survivors: 3072.0K->3072.0K Heap: 2106.0M(2347.0M)->1325.2M(2347.0M)]
[Times: user=0.03 sys=0.00, real=0.02 secs]
2018-08-22T19:01:45.029+0800: 31537.306: [GC concurrent-root-region-scan-start]
2018-08-22T19:01:45.039+0800: 31537.315: [GC concurrent-root-region-scan-end, 0.0098860 secs]
2018-08-22T19:01:45.039+0800: 31537.315: [GC concurrent-mark-start]
2018-08-22T19:01:45.111+0800: 31537.388: [GC concurrent-mark-end, 0.0721221 secs]
2018-08-22T19:01:45.111+0800: 31537.388: [GC remark 2018-08-22T19:01:45.111+0800: 31537.388: [Finalize Marking, 0.0002506 secs] 2018-08-22T19:01:45.111+0800: 31537.388: [GC ref-proc, 0.0008536 secs] 2018-08-22T19:01:45.112+0800: 31537.389: [Unloading, 0.0159521 secs], 0.0264459 secs]
[Times: user=0.05 sys=0.00, real=0.03 secs]
2018-08-22T19:01:45.139+0800: 31537.415: [GC cleanup 1339M->1339M(2347M), 0.0026152 secs]
[Times: user=0.00 sys=0.00, real=0.00 secs]
2018-08-22T19:01:48.222+0800: 31540.499: [GC pause (G1 Evacuation Pause) (young), 0.0141944 secs]
[Parallel Time: 12.6 ms, GC Workers: 2]
[GC Worker Start (ms): Min: 31540499.4, Avg: 31540499.4, Max: 31540499.4, Diff: 0.0]
[Ext Root Scanning (ms): Min: 1.4, Avg: 1.4, Max: 1.4, Diff: 0.1, Sum: 2.8]
[Update RS (ms): Min: 2.3, Avg: 2.3, Max: 2.3, Diff: 0.1, Sum: 4.6]
[Processed Buffers: Min: 11, Avg: 17.0, Max: 23, Diff: 12, Sum: 34]
[Scan RS (ms): Min: 4.4, Avg: 4.5, Max: 4.5, Diff: 0.1, Sum: 8.9]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
[Object Copy (ms): Min: 4.2, Avg: 4.3, Max: 4.3, Diff: 0.1, Sum: 8.5]
[Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[Termination Attempts: Min: 1, Avg: 2.5, Max: 4, Diff: 3, Sum: 5]
[GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[GC Worker Total (ms): Min: 12.5, Avg: 12.5, Max: 12.5, Diff: 0.0, Sum: 24.9]
[GC Worker End (ms): Min: 31540511.9, Avg: 31540511.9, Max: 31540511.9, Diff: 0.0]
[Code Root Fixup: 0.1 ms]
[Code Root Purge: 0.0 ms]
[Clear CT: 0.3 ms]
[Other: 1.3 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 0.2 ms]
[Ref Enq: 0.0 ms]
[Redirty Cards: 0.1 ms]
[Humongous Register: 0.1 ms]
[Humongous Reclaim: 0.1 ms]
[Free CSet: 0.6 ms]
[Eden: 781.0M(781.0M)->0.0B(780.0M) Survivors: 3072.0K->3072.0K Heap: 2106.2M(2347.0M)->1325.2M(2347.0M)]
[Times: user=0.02 sys=0.00, real=0.01 secs]
2018-08-22T19:01:51.373+0800: 31543.650: [GC pause (G1 Evacuation Pause) (young), 0.0146431 secs]
[Parallel Time: 13.1 ms, GC Workers: 2]
[GC Worker Start (ms): Min: 31543649.9, Avg: 31543649.9, Max: 31543649.9, Diff: 0.0]
[Ext Root Scanning (ms): Min: 1.4, Avg: 1.4, Max: 1.5, Diff: 0.1, Sum: 2.8]
[Update RS (ms): Min: 2.4, Avg: 2.4, Max: 2.5, Diff: 0.1, Sum: 4.8]
[Processed Buffers: Min: 8, Avg: 17.5, Max: 27, Diff: 19, Sum: 35]
[Scan RS (ms): Min: 4.5, Avg: 4.6, Max: 4.7, Diff: 0.2, Sum: 9.2]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
[Object Copy (ms): Min: 4.4, Avg: 4.4, Max: 4.5, Diff: 0.1, Sum: 8.9]
[Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 2]
[GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
[GC Worker Total (ms): Min: 13.0, Avg: 13.0, Max: 13.0, Diff: 0.0, Sum: 25.9]
[GC Worker End (ms): Min: 31543662.8, Avg: 31543662.8, Max: 31543662.9, Diff: 0.0]
[Code Root Fixup: 0.1 ms]
[Code Root Purge: 0.0 ms]
[Clear CT: 0.4 ms]
[Other: 1.2 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 0.1 ms]
[Ref Enq: 0.0 ms]
[Redirty Cards: 0.1 ms]
[Humongous Register: 0.1 ms]
[Humongous Reclaim: 0.1 ms]
[Free CSet: 0.6 ms]
[Eden: 780.0M(780.0M)->0.0B(780.0M) Survivors: 3072.0K->3072.0K Heap: 2105.2M(2347.0M)->1325.3M(2347.0M)]
[Times: user=0.02 sys=0.00, real=0.02 secs]
XmlIpcRegSvc->172.18.58.184:60686 (CLOSE_WAIT) 有很多这个样的端口关闭等待,这是应用连接端。为什么一直等待呢?内存也没有回收 去3台服务是 2核 4G,这前给的是kafka3G,发现内存没了,报堆外内存溢出,然后我就限制kfaka最大为2G,但主节点跑一段时间后,又报堆内存溢出和堆外内存溢出,我free -m看了一下,内存确实还有100MB了,不知内存用在那里,kafka 这个进程暂用完了 80%的内存,cpu 100%多了,这是什么原因呢? 配置参数和官网核对了一下,全用默认的也不行,
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
在我从kafka 1.1.0升级到kafka 2.12_2.0.0之后,堆中的内存被回收了,但可能是堆外的内存仍然在飙升,而不是下降,而是缓慢上升。原因是什么?请告诉我谢谢
更新到最新版。
Kafka本身内存使用一般,早期0.9版本的Kafka有堆外内存泄露的BUG,不知道你的版本是哪个