Queue storage file system full in WebSphere MQ
We came across a scenario where disk space was occupied by empty queues in a Linux environment.
Our queue manager ended unexpectedly because the file system became full, and we had to empty the q file to bring the queue manager back.
But we actually have no messages at all in the queue. This happens for one particular queue.
Why is the disk space still held here? What is the root cause?
WMQ does not shrink the queue files in real time. For example, you have 100 messages on a queue and you consume the first one. WMQ does not then shrink the file and move all the messages up by one position. If it tried to do that for each message, you'd never be able to get the throughput that you currently see in the product.
What does occur is that WMQ will shrink the queue files at certain points in the processing lifecycle. There is some latency between a queue becoming empty and the file under it shrinking, but this latency is normally so small as to be unnoticeable.
The event you are describing could in theory occur under some very specific conditions, however it would be extremely rare. In fact, in the 15 years I've been working with WMQ I've only ever seen a couple of instances where the latency in shrinking a queue file was even noticeable. I would guess that what is actually going on here is that one of your assumptions or observations is faulty. For example:
Was the queue actually empty?
Was it actually the queue file that filled up the file system?
Was it even MQ?
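The three checks above can be run directly on the MQ host. A sketch, assuming a queue manager named QM1, a queue named APP.QUEUE, and the default /var/mqm data directory (all placeholders; substitute your own names). These commands need a running queue manager, so they are shown as a worksheet rather than a script:

```shell
# 1. Is the queue actually empty? Ask the queue manager for its depth.
echo "DISPLAY QLOCAL(APP.QUEUE) CURDEPTH" | runmqsc QM1

# 2. Is it actually the queue file? Each queue's data lives in a file
#    named 'q' under the queue manager's data directory. Note the
#    directory-name mangling: dots in the queue name become '!'.
ls -lh /var/mqm/qmgrs/QM1/queues/APP!QUEUE/q

# 3. Is it even MQ? Compare overall usage against what MQ itself holds
#    (logs and error dumps under the data directory are common culprits).
df -h /var/mqm
du -sh /var/mqm/qmgrs/QM1/* | sort -h
```

If the `q` file is large while CURDEPTH is zero, you have the rare shrink-latency case the answer describes; more often, `du` points at transaction logs or something outside MQ entirely.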
One of the issues that we see very frequently is that an application will try to put more than 5,000 messages on a queue and receive a QFULL error. The very first thing most people then do is set MAXDEPTH(999999999) to make sure this NEVER happens again. The problem with this is that QFULL is a soft error from which an application can recover but filling up the filesystem is a hard error which can bring down the entire QMgr. Setting MAXDEPTH(999999999) trades a manageable soft error for a fatal error. It is the responsibility of the MQ administrator to make sure that MAXDEPTH and MAXMSGL on the queues are set such that the underlying filesystem does not fill. In most shops additional monitoring is in place on all the filesystems to raise alerts well before they fill.
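One way to pick a defensible cap, instead of reaching for MAXDEPTH(999999999), is to size MAXDEPTH from the filesystem and the queue's MAXMSGL. A minimal sketch with assumed numbers (20 GB filesystem, 4 MB messages, a half-filesystem budget, and the hypothetical names QM1/APP.QUEUE):

```shell
# Cap MAXDEPTH so one full queue of worst-case messages cannot take
# more than a fixed share of the filesystem.
fs_bytes=$((20 * 1024 * 1024 * 1024))   # filesystem size (assumed: 20 GB)
maxmsgl=$((4 * 1024 * 1024))            # MAXMSGL: 4 MB worst-case message
budget=$((fs_bytes / 2))                # let this queue use at most half
maxdepth=$((budget / maxmsgl))
echo "MAXDEPTH=$maxdepth"               # prints MAXDEPTH=2560

# Apply it (requires a running queue manager; names are examples):
# echo "ALTER QLOCAL(APP.QUEUE) MAXDEPTH($maxdepth) MAXMSGL($maxmsgl)" | runmqsc QM1
```

In practice the budget must be split across every queue sharing the filesystem, so the real per-queue share is smaller; the point is that the cap is derived from disk capacity, not picked to make QFULL go away.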
So to sum up, WMQ does a very good job of shrinking queue files in most cases. In particular, when a queue empties this is a natural point of synchronization at which the file can be shrunk, and this usually occurs within seconds of the queue emptying. You have either hit a rare race condition in which the file was not shrunk fast enough, or there is something else going on here that is not readily apparent in your initial analysis. In any case, manage MAXDEPTH and MAXMSGL so that no queue can fill up the filesystem, and write code to handle QFULL conditions.
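The filesystem monitoring mentioned above can be as simple as a periodic disk check. A minimal sketch; the 90% threshold is arbitrary, the alert is just an echo (swap in your paging mechanism), and on a real MQ host you would point it at /var/mqm:

```shell
# Warn while QFULL is still a recoverable soft error, well before the
# filesystem-full hard error that can bring down the whole QMgr.
check_usage() {
    mount=$1
    threshold=$2
    # POSIX df -P: field 5 of the second line is "Use%"; strip the %.
    pct=$(df -P "$mount" | awk 'NR==2 { sub(/%/, "", $5); print $5 }')
    if [ "$pct" -ge "$threshold" ]; then
        echo "ALERT: $mount at ${pct}%, threshold ${threshold}%"
        return 1
    fi
    echo "OK: $mount at ${pct}%, threshold ${threshold}%"
}

check_usage / 90    # on an MQ host: check_usage /var/mqm 90
```

Run it from cron every few minutes; the nonzero return code makes it easy to chain into whatever alerting the shop already has.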