Quartz failure in the notifyJobStoreJobComplete method

Posted on 2025-01-01 07:31:09

Scenario:

  1. We have a scheduler that uses the JDBC job store. The Quartz version is 2.1.2.
  2. The job being scheduled also updates a database.
  3. Quartz and the job use the same database, hosted on a MySQL server. Both the application tables and the Quartz tables are stored in it.
  4. The application and Quartz use separate connection pools. The application uses Spring for connection pooling, while Quartz is configured to pool connections via quartz.properties.
    Here is the relevant snippet of quartz.properties:

    org.quartz.dataSource.qzDS.driver = com.mysql.jdbc.Driver
    org.quartz.dataSource.qzDS.URL = jdbc:mysql://localhost:3306/dbname?autoReconnect=true
    org.quartz.dataSource.qzDS.user = dbuser
    org.quartz.dataSource.qzDS.password =dbpassword
    org.quartz.dataSource.qzDS.maxConnections = 30
    org.quartz.datasource.qzDS.validationQuery = select 1
    #org.quartz.datasource.qzDS.minEvictableIdleTimeMillis=21600000
    #org.quartz.datasource.qzDS.timeBetweenEvictionRunsMillis=1800000
    #org.quartz.datasource.qzDS.numTestsPerEviction=-1
    #org.quartz.datasource.qzDS.testWhileIdle=true
    org.quartz.datasource.qzDS.debugUnreturnedConnectionStackTraces=true
    org.quartz.datasource.qzDS.unreturnedConnectionTimeout=120
    org.quartz.datasource.qzDS.initialPoolSize=5
    org.quartz.datasource.qzDS.minPoolSize=5
    org.quartz.datasource.qzDS.maxPoolSize=30
    org.quartz.datasource.qzDS.acquireIncrement=5
    org.quartz.datasource.qzDS.maxIdleTime=120
    org.quartz.datasource.qzDS.validateOnCheckout=true
    
  5. The database is clustered with master-master replication across two servers, which are accessed through a virtual IP everywhere in the application and in Quartz.

  6. The scheduler (Quartz) is also clustered, on the same two machines that host the MySQL cluster.
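
One detail in the quartz.properties snippet above is worth checking: several keys are spelled org.quartz.datasource (lowercase "s") while others use org.quartz.dataSource. Java property keys are case-sensitive, so, assuming Quartz only reads keys under the exact org.quartz.dataSource prefix, the lowercase entries (including validationQuery and validateOnCheckout) may be silently ignored. The sketch below is a hypothetical helper, not part of Quartz, that flags such keys; the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Properties;

class QuartzPropertyKeyCheck {
    // Prefix for Quartz data source settings; matching is assumed to be case-sensitive.
    static final String EXPECTED_PREFIX = "org.quartz.dataSource.";

    /** Returns, in sorted order, the keys that match the prefix only when case is ignored. */
    static List<String> suspiciousKeys(Properties props) {
        List<String> result = new ArrayList<>();
        for (String key : props.stringPropertyNames()) {
            boolean exact = key.startsWith(EXPECTED_PREFIX);
            boolean loose = key.regionMatches(true, 0, EXPECTED_PREFIX, 0, EXPECTED_PREFIX.length());
            if (loose && !exact) {
                result.add(key);
            }
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Three keys taken from the snippet in the question.
        String sample = String.join("\n",
                "org.quartz.dataSource.qzDS.maxConnections = 30",
                "org.quartz.datasource.qzDS.validationQuery = select 1",
                "org.quartz.datasource.qzDS.validateOnCheckout=true");
        Properties props = new Properties();
        props.load(new StringReader(sample));
        // Prints the keys whose prefix differs from the expected one only by case.
        suspiciousKeys(props).forEach(System.out::println);
    }
}
```

If keys are reported, correcting the prefix is a cheap first step before looking at the cluster or the network.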

The problem:

One of the servers (so far only the backup server) occasionally throws a database connection error while calling the notifyJobStoreJobComplete method. As a result the job stays in the BLOCKED state: the job itself completed successfully, but Quartz was unable to update its status.

Questions:

  1. What could be causing the problem?
  2. How can the BLOCKED jobs be moved to the WAITING state so that they at least run at their next scheduled time? Directly editing the QRTZ_SIMPLE_TRIGGERS table would not be a good solution, even if it works.

EDIT: To bump up the question.

Comments (2)

九厘米的零° 2025-01-08 07:31:09

The error during notifyJobStoreJobComplete is: org.quartz.impl.jdbcjobstore.JobStoreTX - Failed to override connection auto commit/transaction isolation.
[java] com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: The last packet successfully received from the server was 619,082,686 milliseconds ago. The last packet sent successfully to the server was 619,082,686 milliseconds ago. is longer than the server configured value of 'wait_timeout'. You should consider either expiring and/or testing connection validity before use in your application, increasing the server configured values for client timeouts, or using the Connector/J connection property 'autoReconnect=true' to avoid this problem.
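
To put the number in the exception into perspective, the small sketch below converts the reported idle time into days and hours and compares it with MySQL's default wait_timeout of 28800 seconds (8 hours). The server in question may well use a different value, so the default is only an assumption here, and the class name is hypothetical.

```java
import java.time.Duration;

class IdleTimeCheck {
    // MySQL's documented default for wait_timeout, in seconds (8 hours).
    // The actual server setting is not given in the post.
    static final long DEFAULT_WAIT_TIMEOUT_SECONDS = 28_800L;

    static Duration idle(long millis) {
        return Duration.ofMillis(millis);
    }

    public static void main(String[] args) {
        // Idle time taken from the CommunicationsException message.
        Duration idle = idle(619_082_686L);
        System.out.printf("idle for %d d %d h %d min%n",
                idle.toDaysPart(), idle.toHoursPart(), idle.toMinutesPart());
        System.out.println("exceeds default wait_timeout: "
                + (idle.getSeconds() > DEFAULT_WAIT_TIMEOUT_SECONDS));
    }
}
```

The connection had been idle for a little over seven days, which suggests the pool handed out a connection that was never validated or evicted in the meantime.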

情场扛把子 2025-01-08 07:31:09

I think the main problem was the MySQL communication link failure, which we solved by increasing 'wait_timeout' to 14 days. Since our maintenance is scheduled every 15 days, we restart each MySQL server in our DB cluster at that point (we have master-master replication in place). With this approach we haven't had any communication link failures since. In fact, sometimes we don't restart the servers every 15 days and still see no errors (touch wood). :)

As for the Quartz triggers being stuck in the BLOCKED state, we updated Quartz to 2.1.4, which possibly contains a fix for almost the same problem. Since the update, we have seen triggers stuck in the BLOCKED state far less frequently.

We still haven't found a way to get a trigger out of the BLOCKED state without directly modifying the Quartz tables. Whenever we hit this problem, we manually remove the entry for the BLOCKED trigger from the qrtz_fired_triggers table, and that solves it. I think the enterprise version of Quartz may offer this feature through a web UI.
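
The manual cleanup described above could be scripted roughly as follows. This is a minimal sketch, not an official Quartz procedure: it assumes the default QRTZ_ table prefix in uppercase, the Quartz 2.x column names SCHED_NAME, TRIGGER_NAME and TRIGGER_GROUP, and a JDBC Connection obtained elsewhere. The class and method names are hypothetical.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class FiredTriggerCleanup {
    // Assumes the default "QRTZ_" prefix and Quartz 2.x column names.
    // On MySQL the table-name case may need to match how the tables were created.
    static final String DELETE_FIRED =
            "DELETE FROM QRTZ_FIRED_TRIGGERS "
          + "WHERE SCHED_NAME = ? AND TRIGGER_NAME = ? AND TRIGGER_GROUP = ?";

    /**
     * Deletes the fired-trigger rows for one trigger and returns the number
     * of rows removed. The caller owns the connection and its transaction.
     */
    static int deleteFiredEntries(Connection conn, String schedName,
                                  String triggerName, String triggerGroup)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(DELETE_FIRED)) {
            ps.setString(1, schedName);
            ps.setString(2, triggerName);
            ps.setString(3, triggerGroup);
            return ps.executeUpdate();
        }
    }
}
```

As with any direct edit of the Quartz tables, it seems safest to run this only for a trigger whose job is known to have finished.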
