WebSphere MQ: Messages keep bouncing between the input queue and the backout queue

Posted 2024-11-08 18:43:32

The logic flow is as follows:

  1. A message is sent to an input queue.
  2. The onMessage() method of a ProcessorMDB is invoked. This method performs several operations and validations.
  3. For a poison message (one that the application code cannot handle), a RuntimeException is thrown.
  4. This should roll back the transaction, and we see evidence of the rollback in the log file.
  5. A backout threshold is defined, along with a backout queue name.
  6. Once the threshold is reached, the message is sent to the backout queue.
  7. But it immediately starts going back and forth between the input queue and the backout queue.
  8. We are using the MQMON tool to observe this weird behavior. It continues almost indefinitely, even after the app server (where the MDB runs) is shut down.
  9. We are using WebLogic 10.3.1 and WebSphere MQ 6.02.

Any help would be much appreciated; it looks like we are running out of ideas.

Comments (1)

橘亓 2024-11-15 18:43:32

This sounds like a syncpoint issue. If the QMgr were to issue a COMMIT when a message is requeued inside of a unit of work it would affect all messages under syncpoint inside of that thread. This would cause serious problems if an application had performed several PUT or GET calls prior to hitting the poison message. Rather than issue a COMMIT outside of the program's control, the QMgr just leaves the message on the backout queue inside the unit of work and waits for the program to issue the COMMIT. This can lead to some unexpected behavior such as what you are seeing where a message lands back on the input queue.

If another message is in the queue behind the "bad" one and it is processed successfully by the same thread, everything works out perfectly. The app issues a COMMIT on the new message, and this also commits the poison message on the backout queue. However, if the thread exits uncleanly (without an explicit disconnect or COMMIT), the transaction is rolled back and the poison message is returned to the input queue.
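
To make the mechanism concrete, here is a toy in-memory model in Java. It is not the WebSphere MQ API: `ToyUnitOfWork` and all of its methods are hypothetical names invented for illustration. It only shows why a requeue performed under syncpoint is undone when the unit of work is rolled back instead of committed.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model only: NOT the WebSphere MQ API. All names here are hypothetical. */
class ToyUnitOfWork {
    final Deque<String> inputQueue = new ArrayDeque<>();
    final Deque<String> backoutQueue = new ArrayDeque<>();
    // Messages moved toward the backout queue under syncpoint but not yet committed.
    private final Deque<String> pendingRequeues = new ArrayDeque<>();

    /** Move the head of the input queue to the backout queue inside the unit of work. */
    void requeueHeadToBackout() {
        String msg = inputQueue.pollFirst();
        if (msg != null) {
            pendingRequeues.addLast(msg);
        }
    }

    /** COMMIT: pending requeues become permanent on the backout queue. */
    void commit() {
        backoutQueue.addAll(pendingRequeues);
        pendingRequeues.clear();
    }

    /** ROLLBACK (for example, the thread exits uncleanly): messages reappear on the input queue. */
    void rollback() {
        // Restore in reverse order so the original queue order is preserved.
        while (!pendingRequeues.isEmpty()) {
            inputQueue.addFirst(pendingRequeues.pollLast());
        }
    }
}
```

In this model, calling `requeueHeadToBackout()` followed by `rollback()` puts the poison message back at the head of the input queue, while `requeueHeadToBackout()` followed by `commit()` leaves it on the backout queue.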

The usual way this resolves itself is that the next good message (or batch of messages, if transactions are batched) in the input queue forces the COMMIT. However, in some cases the owning thread gets no new work (perhaps it was performing a GET by Correlation ID), so there is nothing to push the bad message through. In these cases, it is important to make sure that the application issues a COMMIT before ending. One way to do this is to write the code to perform the GET by CORRELID with a wait interval. If the wait interval expires, the application gets a return code of 2033 and then issues a COMMIT before closing the thread. If the reply message is legitimately late for whatever reason, the COMMIT has no effect. But if the message arrived and was backed out and requeued, the COMMIT causes it to stay in the backout queue.
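
The commit-on-timeout pattern described above could be sketched roughly as follows. This is a minimal sketch, not code from the original poster's application: `TransactedReceiver` and `ReplyWaiter` are hypothetical stand-ins for a transacted session and its consumer, introduced so that the example runs without an MQ installation. An empty result plays the role of reason code 2033.

```java
import java.util.Optional;

/** Hypothetical stand-in for a transacted session with one consumer; NOT a real MQ or JMS type. */
interface TransactedReceiver {
    /** Waits up to waitMillis for a matching reply; an empty result corresponds to reason code 2033. */
    Optional<String> receiveByCorrelationId(String correlationId, long waitMillis);

    /** Commits the current unit of work. */
    void commit();
}

class ReplyWaiter {
    /**
     * Waits for a reply by correlation ID. If the wait interval expires,
     * commits the unit of work before returning, so that any poison message
     * requeued under syncpoint stays on the backout queue.
     */
    static Optional<String> awaitReply(TransactedReceiver receiver,
                                       String correlationId,
                                       long waitMillis) {
        Optional<String> reply = receiver.receiveByCorrelationId(correlationId, waitMillis);
        if (!reply.isPresent()) {
            // Timed out: nothing else will drive a COMMIT on this thread, so do it here.
            receiver.commit();
        }
        // When a reply did arrive, the caller is expected to process it and then commit.
        return reply;
    }
}
```

With plain JMS, the equivalent calls would be `MessageConsumer.receive(timeout)`, which returns `null` when the timeout expires, followed by `Session.commit()` on a transacted session.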

One way to see exactly what is going on is to run a trace against the queue in question. You can use the built-in trace function, strmqtrc, which has a few more options in V7 than in V6. However, if you want very fine-grained control, you can use the trace exit in SupportPac MA0W. With MA0W you can see exactly which API calls are made by the program and which are made on its behalf.

[EDIT] Updating the response with some info from the PMR:

The following is from the WMQ V7 Infocenter:

"MessageConsumers are single threaded below the Session level, and any requeuing of poison messages takes place within the current unit of work. This does not affect the operation of the application, however when poison messages are requeued under a transacted or Client_acknowledge Session, the requeue action itself will not be committed until the current unit of work is committed by the application code or, if appropriate, the application container code."

Hence, if it is important for the customer to have poison messages committed immediately after they are backed out, it is recommended they either make use of the Application Server Facilities (ConnectionConsumer) which can commit the message immediately, or another mechanism to move poison messages from the queue.

This information is in the V6 and V7 Information Centers. Since you are using the V6 client, you will want to refer to the V6 Infocenter. Note that for the V6 client, the Infocenter makes no mention of ASF being able to commit the poison message immediately, even when using a ConnectionConsumer. The way I read it, this means you will probably need to upgrade to the V7 client to get the behavior you are looking for. I will be interested to see whether the PMR results in a similar recommendation.
