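What ate the memory on this AIX box? Experts, please take a look
Last edited by javaio on 2010-09-27 22:11
AIX P550, 16 CPUs, 24 GB of memory, running only WebLogic and no other applications. Memory usage is very high, and after stopping WebLogic completely only a little memory was released. What is using all the memory? Why are faults so high? And why was paging space usage as high as 25% before WebLogic was stopped?
This is the state before the machine was rebooted: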
# oslevel
5.3.0.0
# oslevel -r
5300-09
#
Topas Monitor for host: p550a EVENTS/QUEUES FILE/TTY
Mon Sep 27 16:30:30 2010 Interval: 2 Cswitch 239 Readch 2752
Syscall 1501 Writech 583
CPU User% Kern% Wait% Idle% Reads 1 Rawin 0
ALL 0.0 7.5 0.0 92.5 Writes 3 Ttyout 583
Forks 0 Igets 0
Network KBPS I-Pack O-Pack KB-In KB-Out Execs 0 Namei 1
en4 1.4 5.5 6.0 0.3 1.1 Runqueue 0.5 Dirblk 0
lo0 0.7 7.0 7.0 0.3 0.3 Waitqueue 0.0
Disk Busy% KBPS TPS KB-Read KB-Writ PAGING MEMORY
hdisk0 0.0 0.0 0.0 0.0 0.0 Faults 15600 Real,MB 23552
hdisk1 0.0 0.0 0.0 0.0 0.0 Steals 0 % Comp 80.0
cd0 0.0 0.0 0.0 0.0 0.0 PgspIn 0 % Noncomp 2.2
PgspOut 0 % Client 2.2
Name PID CPU% PgSp Owner PageIn 0
xmwlm 90582 8.2 1.0 root PageOut 0 PAGING SPACE
dtgreet 50060 0.0 1.4 root Sios 0 Size,MB 24576
java 94470 0.0 347.9 weblogic % Used 5.5
topas 110948 0.0 1.5 root NFS (calls/sec) % Free 95.5
topas 135850 0.0 1.5 root ServerV2 0
topas 86318 0.0 1.9 weblogic ClientV2 0 Press:
getty 159942 0.0 0.6 root ServerV3 0 "h" for help
xmgc 49176 0.0 0.4 root ClientV3 0 "q" to quit
swapper 4654 0.0 0.4 root
swapper 4386 0.0 0.4 root
swapper 4922 0.0 0.4 root
rpc.lock 49622 0.0 1.2 root
syncd 45720 0.0 0.5 root
lrud 16392 0.0 0.8 root
netm 24900 0.0 0.4 root
psmd 24588 0.0 0.8 root
gil 25158 0.0 0.9 root
X 57988 0.0 3.6 root
init 1 0.0 0.6 root
IBM.CSMA 37810 0.0 3.9 root
# svmon
size inuse free pin virtual
memory 6029312 5006707 1022605 4732781 5141836
pg space 6291456 338240
work pers clnt other
pin 4512979 0 0 219802
in use 4880770 5 125932
PageSize PoolSize inuse pgsp pin virtual
s 4 KB - 4825651 336864 4576957 4959964
m 64 KB - 11316 86 9739 11367
# vmstat -v
6029312 memory pages
5734502 lruable pages
1020951 free pages
4 memory pools
4734381 pinned pages
80.0 maxpin percentage
20.0 minperm percentage
80.0 maxperm percentage
2.1 numperm percentage
123913 file pages
0.0 compressed percentage
0 compressed pages
2.1 numclient percentage
80.0 maxclient percentage
123908 client pages
0 remote pageouts scheduled
25 pending disk I/Os blocked with no pbuf
40499386 paging space I/Os blocked with no psbuf
2228 filesystem I/Os blocked with no fsbuf
0 client filesystem I/Os blocked with no fsbuf
37620 external pager filesystem I/Os blocked with no fsbuf
0 Virtualized Partition Memory Page Faults
0.00 Time resolving virtualized partition memory page faults
#
# sar -r 5
AIX p550a 3 5 00CF30D34C00 09/27/10
System configuration: lcpu=16 mem=23552MB mode=Capped
16:44:12 slots cycle/s fault/s odio/s
16:44:17 5953205 0.00 22224.13 0.60
#
# vmstat 1 10
System configuration: lcpu=16 mem=23552MB
kthr memory page faults cpu
----- ----------- ------------------------ ------------ -----------
r b avm fre re pi po fr sr cy in sy cs us sy id wa
1 0 5142491 1021874 0 0 0 0 0 0 8 1713 236 0 4 96 0
0 0 5142491 1021874 0 0 0 0 0 0 4 1371 245 0 6 94 0
0 0 5143419 1020946 0 0 0 0 0 0 7 1550 235 0 3 97 0
1 0 5142490 1021875 0 0 0 0 0 0 7 1381 237 0 3 97 0
0 0 5142490 1021875 0 0 0 0 0 0 6 1553 236 0 6 94 0
0 0 5143834 1020531 0 0 0 0 0 0 5 1375 233 0 4 96 0
1 0 5142490 1021875 0 0 0 0 0 0 6 1546 232 0 2 98 0
0 0 5142490 1021875 0 0 0 0 0 0 3 1374 237 0 6 94 0
0 0 5143402 1020963 0 0 0 0 0 0 9 1559 241 0 4 96 0
1 0 5142490 1021875 0 0 0 0 0 0 8 1371 233 0 2 98 0
#
# vmo -a
cpu_scale_memp = 8
data_stagger_interval = 161
defps = 1
force_relalias_lite = 0
framesets = 2
htabscale = n/a
kernel_heap_psize = 4096
kernel_psize = 16777216
large_page_heap_size = 0
lgpg_regions = 0
lgpg_size = 0
low_ps_handling = 1
lru_file_repage = 1
lru_poll_interval = 10
lrubucket = 131072
maxclient% = 80
maxfree = 1088
maxperm = 4587600
maxperm% = 80
maxpin = 4867414
maxpin% = 80
mbuf_heap_psize = 65536
memory_affinity = 1
memory_frames = 6029312
memplace_data = 2
memplace_mapped_file = 2
memplace_shm_anonymous = 2
memplace_shm_named = 2
memplace_stack = 2
memplace_text = 2
memplace_unmapped_file = 2
mempools = 4
minfree = 960
minperm = 1146898
minperm% = 20
nokilluid = 0
npskill = 49152
npsrpgmax = 393216
npsrpgmin = 294912
npsscrubmax = 393216
npsscrubmin = 294912
npswarn = 196608
num_spec_dataseg = 0
numpsblks = 6291456
page_steal_method = 0
pagecoloring = n/a
pinnable_frames = 1295373
psm_timeout_interval = 5000
pta_balance_threshold = n/a
relalias_percentage = 0
rpgclean = 0
rpgcontrol = 2
scrub = 0
scrubclean = 0
soft_min_lgpgs_vmpool = 0
spec_dataseg_int = 512
strict_maxclient = 1
strict_maxperm = 0
v_pinshm = 0
vm_modlist_threshold = -1
vmm_fork_policy = 1
vmm_mpsize_support = 1
wlm_memlimit_nonpg = 1
#
# vmstat -s
60125920853 total address trans. faults
236299273 page ins
255456576 page outs
227100146 paging space page ins
224903271 paging space page outs
0 total reclaims
59047865001 zero filled pages faults
47621021 executable filled pages faults
3140190022 pages examined by clock
2096 revolutions of the clock hand
250788432 pages freed by the clock
10091109 backtracks
93664 free frame waits
0 extend XPT waits
230485022 pending I/O waits
488710726 start I/Os
457139720 iodones
5300839206 cpu context switches
619179944 device interrupts
1623206554 software interrupts
6878573570 decrementer interrupts
122106361 mpc-sent interrupts
122106357 mpc-receive interrupts
5386771 phantom interrupts
0 traps
25194445202 syscalls
#
# svmon -Pt15 | perl -e 'while(<>){print if($.==2||$&&&!$s++);$.=0 if(/^-+$/)}'
-------------------------------------------------------------------------------
Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd 16MB
94470 java 174205 65649 42244 164564 N Y N
86318 topas 75653 65539 112 75651 N N N
135852 topas 75541 65539 112 75529 N N N
37810 IBM.CSMAgentR 75307 65552 995 76188 N Y N
131400 ksh 75270 65539 112 75318 N N N
119138 ksh 75270 65539 112 75318 N N N
168664 ksh 75270 65539 112 75318 N N N
159942 getty 75269 65539 229 75293 N N N
127190 getty 75267 65539 230 75291 N N N
53668 sendmail 75240 65539 284 75405 N N N
143812 ksh 75232 65539 112 75279 N N N
152148 ksh 75232 65539 112 75279 N N N
143362 telnetd 75219 65539 113 75309 N N N
69798 telnetd 75219 65539 113 75309 N N N
110786 telnetd 75219 65539 113 75309 N N N
#
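After the reboot, topas showed Comp memory at about 11%, all normal. After starting WebLogic again (4 servers in total), usage rose to about 45%, also normal. Why is the difference before and after the reboot so large? What was using all the memory?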
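To put the pre-reboot `svmon` figures in more familiar units, here is a minimal Python sketch that converts the page counts into MB and percent of RAM. It assumes the counts are 4 KB pages (which agrees with `memory_frames = 6029312` and `Real,MB 23552`); the dictionary and function names are illustrative only and are not part of any AIX tool.

```python
# Page counts copied from the pre-reboot "# svmon" output in this post.
PAGE_KB = 4  # assumption: svmon global counts are in 4 KB pages

svmon = {
    "size": 6029312,      # total memory frames
    "inuse": 5006707,     # frames in use
    "free": 1022605,      # free frames
    "pin": 4732781,       # pinned frames (cannot be paged out)
    "pin_work": 4512979,  # pinned working-storage frames
    "java_inuse": 174205, # Inuse of the only java process (svmon -P)
}


def pages_to_mb(pages):
    """Convert a 4 KB page count to megabytes."""
    return pages * PAGE_KB / 1024


def pct_of_ram(pages):
    """Express a page count as a percentage of total memory frames."""
    return 100.0 * pages / svmon["size"]


for name, pages in svmon.items():
    print(f"{name:11s} {pages_to_mb(pages):9.0f} MB {pct_of_ram(pages):6.1f}% of RAM")
```

On these numbers, pinned pages are about 78.5% of RAM, close to the 80% `maxpin%` setting, and roughly 95% of them are working-storage pins, while the single remaining java process holds well under 1 GB. That narrows down where the memory sits, but it does not by itself show what pinned it.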
Comments (9)
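You have rebooted, so the evidence is gone. Just let it go.
The WebLogic logs do contain errors, but those are all problems in the application itself: stack traces thrown from Java. WebLogic itself really was not using much memory, only about 30% in total. After stopping all WebLogic instances the memory was not released and stayed at 80%; after the reboot everything was completely normal. I just could not find where all that memory had gone.
Another big issue: if you look closely at the vmstat output before and after the reboot, the avm values are far apart. Before the reboot it seemed to be counting physical memory plus paging space in avm; after the reboot, avm only shows the size of the actual physical memory.
This is just a discussion. Some people say it was a memory leak, but I found no evidence of a leak. I am sure this forum is full of experts and someone will have a reasonable explanation. Let's all discuss it; exchanging ideas is how we improve.
Didn't WebLogic throw an OOM error?
I agree with you, but the problem is that the machine was already slow and unresponsive before the reboot. Even with all applications stopped, memory usage was still as high as 80%. We were then forced to reboot, and after restarting the applications everything was normal. Now I just want to work out where that nearly 80% of memory went.
Last edited by javaio on 2010-09-28 22:06
This is the result after rebooting the machine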
--------After the system was restarted, performance is as shown below----
Syscall 1273 Writech 233
CPU User% Kern% Wait% Idle% Reads 18 Rawin 0
ALL 0.0 0.5 0.0 99.5 Writes 0 Ttyout 233
Forks 0 Igets 0
Network KBPS I-Pack O-Pack KB-In KB-Out Execs 0 Namei 1
en4 0.6 3.5 3.5 0.2 0.4 Runqueue 0.0 Dirblk 0
lo0 0.0 0.0 0.0 0.0 0.0 Waitqueue 0.0
Disk Busy% KBPS TPS KB-Read KB-Writ PAGING MEMORY
hdisk0 0.0 0.0 0.0 0.0 0.0 Faults 257 Real,MB 23552
hdisk1 0.0 0.0 0.0 0.0 0.0 Steals 0 % Comp 11.1
cd0 0.0 0.0 0.0 0.0 0.0 PgspIn 0 % Noncomp 0.0
PgspOut 0 % Client 0.0
Name PID CPU% PgSp Owner PageIn 0
xmwlm 123350 5.4 0.9 root PageOut 0 PAGING SPACE
netm 24900 0.0 0.4 root Sios 0 Size,MB 24576
dtgreet 114906 0.0 1.4 root % Used 0.0
topas 90566 0.0 1.4 root NFS (calls/sec) % Free 100.0
getty 118994 0.0 0.6 root ServerV2 0
gil 25158 0.0 0.9 root ClientV2 0 Press:
xmgc 49176 0.0 0.4 root ServerV3 0 "h" for help
init 1 0.0 0.6 root ClientV3 0 "q" to quit
X 41618 0.0 3.6 root
swapper 4386 0.0 0.4 root
swapper 4654 0.0 0.4 root
rpc.lock 33432 0.0 1.2 root
swapper 4922 0.0 0.4 root
rpc.stat 61914 0.0 1.3 daemon
j2pg 29008 0.0 6.5 root
aioserve 90248 0.0 0.4 root
hostmibd 94354 0.0 1.0 root
psgc 40980 0.0 0.4 root
memgrdd 36882 0.0 0.4 root
memp_rbd 32784 0.0 0.4 root
# vmstat 1 20
System configuration: lcpu=16 mem=23552MB
kthr memory page faults cpu
----- ----------- ------------------------ ------------ -----------
r b avm fre re pi po fr sr cy in sy cs us sy id wa
0 0 683098 5327887 0 0 0 0 0 0 9 1312 248 0 0 99 0
0 0 682730 5328255 0 0 0 0 0 0 5 1185 250 0 1 99 0
0 0 703966 5307019 0 0 0 0 0 0 4 1111 230 0 1 99 0
0 0 704334 5306651 0 0 0 0 0 0 9 1200 233 0 1 99 0
0 0 683230 5327755 0 0 0 0 0 0 6 1192 246 0 0 99 0
0 0 682893 5328092 0 0 0 0 0 0 3 1189 240 0 1 99 0
0 0 703997 5306988 0 0 0 0 0 0 8 1105 228 0 1 99 0
0 0 704333 5306652 0 0 0 0 0 0 8 1248 235 0 1 99 0
0 0 683228 5327757 0 0 0 0 0 0 8 1198 244 0 0 99 0
0 0 683058 5327927 0 0 0 0 0 0 3 1177 236 0 1 99 0
0 0 704172 5306813 0 0 0 0 0 0 4 1114 230 0 1 99 0
0 0 704479 5306506 0 0 0 0 0 0 10 1204 233 0 1 99 0
0 0 683377 5327607 0 0 0 0 0 0 10 1213 248 0 0 99 0
0 0 683104 5327880 0 0 0 0 0 0 4 1274 234 0 1 99 0
0 0 704095 5306889 0 0 0 0 0 0 4 1097 225 0 1 99 0
0 0 704207 5306777 0 0 0 0 0 0 3 1109 232 0 0 99 0
0 0 704479 5306505 0 0 0 0 0 0 10 1201 246 0 1 99 0
0 0 683371 5327613 0 0 0 0 0 0 7 1194 243 0 0 99 0
0 0 683122 5327862 0 0 0 0 0 0 6 1188 236 0 1 99 0
0 0 704227 5306757 0 0 0 0 0 0 4 1163 225 0 1 99 0
#
-----After we restarted the WebLogic servers, performance is as shown below-------
$ date
Monday, September 27, 2010, 19:22:31
Topas Monitor for host: p550a EVENTS/QUEUES FILE/TTY
Mon Sep 27 19:21:12 2010 Interval: 2 Cswitch 348 Readch 49187
Syscall 1633 Writech 945
CPU User% Kern% Wait% Idle% Reads 50 Rawin 0
ALL 0.1 0.6 0.0 99.3 Writes 5 Ttyout 942
Forks 0 Igets 0
Network KBPS I-Pack O-Pack KB-In KB-Out Execs 0 Namei 5
en4 3.1 14.0 11.5 1.2 1.9 Runqueue 0.0 Dirblk 0
lo0 0.0 0.0 0.0 0.0 0.0 Waitqueue 0.0
Disk Busy% KBPS TPS KB-Read KB-Writ PAGING MEMORY
hdisk0 0.0 0.0 0.0 0.0 0.0 Faults 405K Real,MB 23552
hdisk1 0.0 0.0 0.0 0.0 0.0 Steals 0 % Comp 43.3
cd0 0.0 0.0 0.0 0.0 0.0 PgspIn 0 % Noncomp 2.2
PgspOut 0 % Client 2.2
Name PID CPU% PgSp Owner PageIn 0
xmwlm 123350 0.4 0.9 root PageOut 0 PAGING SPACE
netm 24900 0.0 0.4 root Sios 0 Size,MB 24576
dtgreet 114906 0.0 1.4 root % Used 0.0
java 102654 0.0 204.8 weblogic NFS (calls/sec) % Free 100.0
topas 123122 0.0 1.9 weblogic ServerV2 0
java 82908 0.0 204.2 weblogic ClientV2 0 Press:
topas 135680 0.0 2.0 weblogic ServerV3 0 "h" for help
IBM.CSMA 62388 0.0 2.0 root ClientV3 0 "q" to quit
topas 111332 0.0 1.5 root
xmgc 49176 0.0 0.4 root
java 127500 0.0 204.1 weblogic
java 152064 0.0 203.4 weblogic
getty 118994 0.0 0.6 root
init 1 0.0 0.6 root
gil 25158 0.0 0.9 root
java 87030 0.0 260.5 weblogic
X 41618 0.0 3.6 root
swapper 4654 0.0 0.4 root
swapper 4922 0.0 0.4 root
swapper 4386 0.0 0.4 root
Exiting
$ date
Monday, September 27, 2010, 19:22:31
$ vmstat 1 20
System configuration: lcpu=16 mem=23552MB
kthr memory page faults cpu
----- ----------- ------------------------ ------------ -----------
r b avm fre re pi po fr sr cy in sy cs us sy id wa
0 0 2630746 3226759 0 0 0 0 0 0 65 4041 424 7 0 92 0
1 0 2630970 3226534 0 0 0 0 0 0 20 7363 425 12 1 87 0
1 0 2631164 3226328 0 0 0 0 0 0 9 8982 297 12 1 88 0
1 0 2631806 3225650 0 0 0 0 0 0 27 4201 325 10 3 87 0
1 0 2635929 3218144 0 0 0 0 0 0 357 23711 1007 7 2 90 1
1 0 2637889 3207240 0 0 0 0 0 0 2033 63138 3783 4 4 89 3
1 0 2641792 3202583 0 0 0 0 0 0 92 69733 1388 11 3 87 0
1 0 2632045 3212349 0 0 0 0 0 0 41 4345 566 7 1 93 0
1 0 2632079 3212315 0 0 0 0 0 0 47 2866 536 1 0 98 0
0 0 2640384 3204010 0 0 0 0 0 0 11 1754 381 0 1 99 0
0 0 2661497 3182897 0 0 0 0 0 0 6 1382 280 0 1 99 0
0 0 2661393 3183001 0 0 0 0 0 0 31 1684 337 0 1 99 0
0 0 2640289 3204105 0 0 0 0 0 0 10 1426 292 0 0 99 0
0 0 2632094 3212282 0 0 0 0 0 0 43 2038 322 0 0 99 0
0 1 2632094 3212282 0 0 0 0 0 0 9 1391 260 0 0 99 0
0 0 2632091 3212285 0 0 0 0 0 0 9 1396 257 0 0 99 0
0 0 2632089 3212287 0 0 0 0 0 0 23 1589 330 0 0 99 0
0 0 2632087 3212289 0 0 0 0 0 0 6 1342 249 0 0 99 0
0 0 2632087 3212289 0 0 0 0 0 0 9 1418 273 0 0 99 0
0 0 2632086 3212290 0 0 0 0 0 0 10 1423 273 0 0 99 0
$
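Note how far apart the avm values in the vmstat output are before and after the reboot. I cannot figure it out. Experts, please take another close look.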
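The avm comparison can be made concrete with a little arithmetic. The sketch below takes the first avm sample from each of the three `vmstat` runs in this thread and expresses it in MB and as a share of RAM. It assumes avm is counted in 4 KB pages and uses `memory_frames` and `numpsblks` from the `vmo -a` output as the RAM and paging-space sizes; the names in the code are illustrative only.

```python
PAGE_KB = 4               # assumption: vmstat avm/fre are 4 KB pages
RAM_PAGES = 6029312       # memory_frames from "vmo -a"
PGSP_PAGES = 6291456      # numpsblks from "vmo -a"

# First avm sample of each vmstat run quoted in the thread.
avm_samples = {
    "before reboot, WebLogic stopped": 5142491,
    "after reboot, idle": 683098,
    "after reboot, WebLogic started": 2630746,
}


def avm_summary(pages):
    """Return (MB, percent of RAM, percent of RAM plus paging space)."""
    mb = pages * PAGE_KB / 1024
    pct_ram = 100.0 * pages / RAM_PAGES
    pct_ram_plus_pgsp = 100.0 * pages / (RAM_PAGES + PGSP_PAGES)
    return mb, pct_ram, pct_ram_plus_pgsp


for label, pages in avm_samples.items():
    mb, pct_ram, pct_total = avm_summary(pages)
    print(f"{label:32s} {mb:8.0f} MB {pct_ram:5.1f}% of RAM "
          f"{pct_total:5.1f}% of RAM+pgsp")
```

Read this way, avm before the reboot was about 85% of RAM, which is less than physical memory and far less than RAM plus paging space. The three values also track the topas % Comp readings in the same snapshots (80.0, 11.1 and 43.3) reasonably closely. avm is documented as the count of active virtual pages, which can reside either in RAM or in paging space, so a small gap between avm and % Comp is expected while some pages are paged out.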
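When looking at a performance problem, do not start from the output of these commands. First look at how the application is responding. If the client application is clearly getting slower and slower, there is probably a performance problem, and that is when you collect system data for analysis. So first check whether you actually have a performance problem, then analyze. If access is normal, even 100% memory usage is nothing to analyze or tune.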
ps -ef
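I do not think it is normal. The issue is that before the reboot, with 4 WebLogic domains running, topas showed Comp memory as high as 98%, paging space was 25% used, and the machine was sluggish.
To troubleshoot, I checked a lot of things and found no application using a large amount of memory, as you can see in my post. Then I shut down all of WebLogic, but not much memory was released: Comp in topas was still 80% and paging space dropped to 5%. At that point the machine was idle, running only the operating system with no applications started, and I could not understand why the memory was not released and stayed around 80%.
Then I rebooted, and everything was normal: Comp in topas was about 11% when idle and rose to about 45% after starting the 4 WebLogic services, which I consider normal. What I cannot understand is why, before the reboot, with all applications already stopped and only the operating system running, memory and paging space usage were about 80% and 25% respectively. Faults were also very high before the reboot.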
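The point that no single application accounts for the memory can be checked against the `svmon -Pt15` table posted earlier. The sketch below simply adds up the Inuse column for the 15 listed processes and compares the total with the system-wide inuse figure from `svmon`. The sum is deliberately naive: the near-identical Pin value (about 65539 pages) in every row suggests that each process's figure includes the same shared segments, which is an assumption here, so the sum overstates what the processes really hold. The names in the code are illustrative only.

```python
PAGE_KB = 4                 # assumption: counts are 4 KB pages
SYSTEM_INUSE = 5006707      # "inuse" from the global svmon output

# (command, Inuse pages) rows copied from the "svmon -Pt15" table.
top_processes = [
    ("java", 174205), ("topas", 75653), ("topas", 75541),
    ("IBM.CSMAgentR", 75307), ("ksh", 75270), ("ksh", 75270),
    ("ksh", 75270), ("getty", 75269), ("getty", 75267),
    ("sendmail", 75240), ("ksh", 75232), ("ksh", 75232),
    ("telnetd", 75219), ("telnetd", 75219), ("telnetd", 75219),
]

# Naive upper bound: shared pages are counted once per process.
naive_sum = sum(inuse for _, inuse in top_processes)
share = 100.0 * naive_sum / SYSTEM_INUSE

print(f"naive sum of top 15: {naive_sum} pages "
      f"({naive_sum * PAGE_KB / 1024:.0f} MB)")
print(f"share of system inuse: {share:.1f}%")
```

Even with shared pages counted fifteen times over, these processes add up to roughly a quarter of the pages in use, which is consistent with the observation that stopping applications freed very little. Only the top 15 processes are listed in the post, so this is an indication rather than a full accounting.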
Experts, please join the discussion.
This looks normal to me.