如何确定分布式事务超时的原因

发布于 2024-08-10 23:04:05 字数 5244 浏览 1 评论 0 原文

我正在使用 LINQ to SQL 和支持分布式事务的第三方 SDK。当我意识到挂起的更新将更新 SQL 记录和第三方 SDK 中的记录时,我正在创建一个超时为 0(可能是无限)的 TransactionScope(尽管我也尝试过 12 小时作为时间跨度参数)。然后,我对环境事务(由 transactionscope 创建)使用 GetDtcTransaction 来获取 DTC 事务以链接到第三方 SDK。大约 10 分钟内一切正常,但 10 分钟后,事务消失并出现错误。我如何确定交易消失的原因。我怀疑这是超时,因为它经常在 10 分钟后发生,尽管此时已完成的工作程度略有不同。但我不知道如何确定终止交易的原因、原因以及如何延长其寿命。

我尝试使用 SQL 事件探查器跟踪以下事件:

  1. 所有错误和警告事件
  2. 除了“审核架构对象”事件之外的所有安全事件
  3. 除了 SQLTransaction 和 TransactionLog 事件之外的所有事务事件

我在错误发生期间得到的所有事件都是这些事件:

<Event id="19" name="DTCTransaction">
  <Column id="3" name="DatabaseID">1</Column>
  <Column id="11" name="LoginName">sa</Column>
  <Column id="35" name="DatabaseName">master</Column>
  <Column id="51" name="EventSequence">167065</Column>
  <Column id="12" name="SPID">10</Column>
  <Column id="60" name="IsSystem">1</Column>
  <Column id="1" name="TextData">{D662BBC4-21EC-436D-991C-DCB061A34782}</Column>
  <Column id="21" name="EventSubClass">16</Column>
  <Column id="25" name="IntegerData">0</Column>
  <Column id="41" name="LoginSid">01</Column>
  <Column id="49" name="RequestID">0</Column>
  <Column id="2" name="BinaryData">C4BB62D6EC216D43991CDCB061A34782</Column>
  <Column id="14" name="StartTime">2009-11-11T13:55:32.82-06:00</Column>
  <Column id="26" name="ServerName">.</Column>
  <Column id="50" name="XactSequence">0</Column>
</Event>
<Event id="33" name="Exception">
  <Column id="3" name="DatabaseID">9</Column>
  <Column id="11" name="LoginName">sa</Column>
  <Column id="31" name="Error">1222</Column>
  <Column id="35" name="DatabaseName">ACS</Column>
  <Column id="51" name="EventSequence">167066</Column>
  <Column id="12" name="SPID">19</Column>
  <Column id="20" name="Severity">16</Column>
  <Column id="60" name="IsSystem">1</Column>
  <Column id="1" name="TextData">Error: 1222, Severity: 16, State: 18</Column>
  <Column id="41" name="LoginSid">01</Column>
  <Column id="49" name="RequestID">0</Column>
  <Column id="14" name="StartTime">2009-11-11T13:55:34.717-06:00</Column>
  <Column id="26" name="ServerName">.</Column>
  <Column id="30" name="State">18</Column>
  <Column id="50" name="XactSequence">0</Column>
</Event>
<Event id="33" name="Exception">
  <Column id="31" name="Error">8525</Column>
  <Column id="8" name="HostName">MARTY755</Column>
  <Column id="12" name="SPID">55</Column>
  <Column id="20" name="Severity">16</Column>
  <Column id="64" name="SessionLoginName">fse</Column>
  <Column id="1" name="TextData">Error: 8525, Severity: 16, State: 1</Column>
  <Column id="9" name="ClientProcessID">2516</Column>
  <Column id="41" name="LoginSid">DB2744F54B5CDB4A8B9E5CA9C209A7AC</Column>
  <Column id="49" name="RequestID">0</Column>
  <Column id="10" name="ApplicationName">.Net SqlClient Data Provider</Column>
  <Column id="14" name="StartTime">2009-11-11T13:55:37.54-06:00</Column>
  <Column id="26" name="ServerName">.</Column>
  <Column id="30" name="State">1</Column>
  <Column id="50" name="XactSequence">236223201284</Column>
  <Column id="3" name="DatabaseID">9</Column>
  <Column id="11" name="LoginName">fse</Column>
  <Column id="35" name="DatabaseName">ACS</Column>
  <Column id="51" name="EventSequence">167067</Column>
</Event>
<Event id="162" name="User Error Message">
  <Column id="31" name="Error">8525</Column>
  <Column id="8" name="HostName">MARTY755</Column>
  <Column id="12" name="SPID">55</Column>
  <Column id="20" name="Severity">16</Column>
  <Column id="64" name="SessionLoginName">fse</Column>
  <Column id="1" name="TextData">Distributed transaction completed. Either enlist this session in a new transaction or the NULL transaction.</Column>
  <Column id="9" name="ClientProcessID">2516</Column>
  <Column id="41" name="LoginSid">DB2744F54B5CDB4A8B9E5CA9C209A7AC</Column>
  <Column id="49" name="RequestID">0</Column>
  <Column id="10" name="ApplicationName">.Net SqlClient Data Provider</Column>
  <Column id="14" name="StartTime">2009-11-11T13:55:37.54-06:00</Column>
  <Column id="26" name="ServerName">.</Column>
  <Column id="30" name="State">1</Column>
  <Column id="50" name="XactSequence">236223201284</Column>
  <Column id="3" name="DatabaseID">9</Column>
  <Column id="11" name="LoginName">fse</Column>
  <Column id="35" name="DatabaseName">ACS</Column>
  <Column id="51" name="EventSequence">167068</Column>
</Event>

EventSubClass 16 on DTCTransaction 事件指示“事务正在中止”。

I am using LINQ to SQL and a third party SDK that supports distributed transactions. When I realize that a pending update will be updating both SQL records and records in the third party SDK, I am creating a TransactionScope with a 0 (presumably infinite) timeout (although I've also tried 12 hours as a timespan parameter). Then I use GetDtcTransaction on the ambient transaction (created by transactionscope) to get a DTC transaction to link to the third party SDK. Things work nicely for about 10 minutes, but after 10 minutes, the transaction disappears and an error occurs. How do I determine why the transaction is disappearing. I suspect it's a timeout because it regularly occurs after 10 minutes even though slightly varying degrees of work have been done at that point. But I'm at a loss about how to determine what terminated the transaction, why, and how to extend its life.

I've tried tracingthe following events with SQL profiler:

  1. All error and warning events
  2. All Security events except "Audit Schema Object" events
  3. All Transaction events except SQLTransaction and TransactionLog events

All I get around the time of the error are these events:

<Event id="19" name="DTCTransaction">
  <Column id="3" name="DatabaseID">1</Column>
  <Column id="11" name="LoginName">sa</Column>
  <Column id="35" name="DatabaseName">master</Column>
  <Column id="51" name="EventSequence">167065</Column>
  <Column id="12" name="SPID">10</Column>
  <Column id="60" name="IsSystem">1</Column>
  <Column id="1" name="TextData">{D662BBC4-21EC-436D-991C-DCB061A34782}</Column>
  <Column id="21" name="EventSubClass">16</Column>
  <Column id="25" name="IntegerData">0</Column>
  <Column id="41" name="LoginSid">01</Column>
  <Column id="49" name="RequestID">0</Column>
  <Column id="2" name="BinaryData">C4BB62D6EC216D43991CDCB061A34782</Column>
  <Column id="14" name="StartTime">2009-11-11T13:55:32.82-06:00</Column>
  <Column id="26" name="ServerName">.</Column>
  <Column id="50" name="XactSequence">0</Column>
</Event>
<Event id="33" name="Exception">
  <Column id="3" name="DatabaseID">9</Column>
  <Column id="11" name="LoginName">sa</Column>
  <Column id="31" name="Error">1222</Column>
  <Column id="35" name="DatabaseName">ACS</Column>
  <Column id="51" name="EventSequence">167066</Column>
  <Column id="12" name="SPID">19</Column>
  <Column id="20" name="Severity">16</Column>
  <Column id="60" name="IsSystem">1</Column>
  <Column id="1" name="TextData">Error: 1222, Severity: 16, State: 18</Column>
  <Column id="41" name="LoginSid">01</Column>
  <Column id="49" name="RequestID">0</Column>
  <Column id="14" name="StartTime">2009-11-11T13:55:34.717-06:00</Column>
  <Column id="26" name="ServerName">.</Column>
  <Column id="30" name="State">18</Column>
  <Column id="50" name="XactSequence">0</Column>
</Event>
<Event id="33" name="Exception">
  <Column id="31" name="Error">8525</Column>
  <Column id="8" name="HostName">MARTY755</Column>
  <Column id="12" name="SPID">55</Column>
  <Column id="20" name="Severity">16</Column>
  <Column id="64" name="SessionLoginName">fse</Column>
  <Column id="1" name="TextData">Error: 8525, Severity: 16, State: 1</Column>
  <Column id="9" name="ClientProcessID">2516</Column>
  <Column id="41" name="LoginSid">DB2744F54B5CDB4A8B9E5CA9C209A7AC</Column>
  <Column id="49" name="RequestID">0</Column>
  <Column id="10" name="ApplicationName">.Net SqlClient Data Provider</Column>
  <Column id="14" name="StartTime">2009-11-11T13:55:37.54-06:00</Column>
  <Column id="26" name="ServerName">.</Column>
  <Column id="30" name="State">1</Column>
  <Column id="50" name="XactSequence">236223201284</Column>
  <Column id="3" name="DatabaseID">9</Column>
  <Column id="11" name="LoginName">fse</Column>
  <Column id="35" name="DatabaseName">ACS</Column>
  <Column id="51" name="EventSequence">167067</Column>
</Event>
<Event id="162" name="User Error Message">
  <Column id="31" name="Error">8525</Column>
  <Column id="8" name="HostName">MARTY755</Column>
  <Column id="12" name="SPID">55</Column>
  <Column id="20" name="Severity">16</Column>
  <Column id="64" name="SessionLoginName">fse</Column>
  <Column id="1" name="TextData">Distributed transaction completed. Either enlist this session in a new transaction or the NULL transaction.</Column>
  <Column id="9" name="ClientProcessID">2516</Column>
  <Column id="41" name="LoginSid">DB2744F54B5CDB4A8B9E5CA9C209A7AC</Column>
  <Column id="49" name="RequestID">0</Column>
  <Column id="10" name="ApplicationName">.Net SqlClient Data Provider</Column>
  <Column id="14" name="StartTime">2009-11-11T13:55:37.54-06:00</Column>
  <Column id="26" name="ServerName">.</Column>
  <Column id="30" name="State">1</Column>
  <Column id="50" name="XactSequence">236223201284</Column>
  <Column id="3" name="DatabaseID">9</Column>
  <Column id="11" name="LoginName">fse</Column>
  <Column id="35" name="DatabaseName">ACS</Column>
  <Column id="51" name="EventSequence">167068</Column>
</Event>

EventSubClass 16 on the DTCTransaction event indicates "Transaction is aborting".

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

提笔落墨 2024-08-17 23:04:06

要延长超时(未指定时默认为最长 10 分钟),需要更新目标系统上的 C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\CONFIG\Machine.config(在 C:\Windows 下查看) \Microsoft.NET\Framework64\v2.0.50727\CONFIG(如果您运行的是 64 位)。将其添加为根级别下的最后一项:

<system.transactions>
    <machineSettings maxTimeout="23:00:00"/>
</system.transactions>

这将(作为示例)将超时设置为 23 小时。

有效值在System.Transactions.TransactionManager.MaximumTimeout中可见

To extend the timeout, which defaults to 10 minutes maximum when not specified, it's necessary to update C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\CONFIG\Machine.config on the target system (look under C:\Windows\Microsoft.NET\Framework64\v2.0.50727\CONFIG if you're running 64-bit). Add this as the last item right under the root level:

<system.transactions>
    <machineSettings maxTimeout="23:00:00"/>
</system.transactions>

This will (as an example) set the timeout to 23 hours.

The effective value is visible in System.Transactions.TransactionManager.MaximumTimeout

恏ㄋ傷疤忘ㄋ疼 2024-08-17 23:04:06

难道是SqlConnection超时而不是分布式事务超时?

更新 1

您可以使用 SQL Server Profiler 来尝试监视意外的连接中断。您只需确保跟踪配置文件只包含您需要监视的事件,因为它的输出可能非常冗长。我首先只监视“审核登录”和“审核注销”事件,这些事件可以在“安全审核”事件类别下找到。

如果您要对独立/仅供您自己使用的 SQL Server 实例以外的任何对象进行分析,您可能需要应用过滤器,以便只有源自您的主机的事件才会出现在输出中。

您可能需要显式指定超时值你的连接字符串 - 将其设置得非常低,看看你是否能更快地获得相同的行为。

更新2

从您的跟踪日志中,我看到两个异常,其详细信息是:

  • 错误:1222,严重性:16,状态:18
  • 错误:8525,严重性:16,状态:1

出现 1222 异常的谷歌搜索 http://www.sqlservercentral.com/Forums/Topic579864-146-1.aspx# bm645422,其中指出:

该错误意味着锁定
在 msdb 中请求并超时。
通常,这意味着它是
大临时表上的大事务或
一大类,或类似的东西。

你有什么特别的
长时间运行的查询可能是
与之相关?或许是沉重的
值班报告,或类似的东西
那?

希望这能让你走得更远。

Could it be the SqlConnection timing out and not the distributed transaction?

Update 1

You could use the SQL Server Profiler to try and monitor for an unexpected connection-drop. You'll just want to make sure to water your trace profile to only include the events you need to monitor as it's output can be quite verbose. I'd start with only monitoring the "Audit Login" and "Audit Logout" events that can be found under the "Security Audit" event category.

If you're profiling on anything other than a stand-alone/used-only-by-yourself SQL Server instance you'll probably want to apply a filter so only events originating from your host show up in the output.

You might want to explicitly specify a timeout value in your connection string - set it really low and see if you get the same behavior a lot quicker.

Update 2

From your trace log, I see two exceptions, whose details are:

  • Error: 1222, Severity: 16, State: 18
  • Error: 8525, Severity: 16, State: 1

The Googling for the 1222 exception turned up http://www.sqlservercentral.com/Forums/Topic579864-146-1.aspx#bm645422, which states:

That error means that a lock was
requested in msdb and timed out.
Usually, that's going to mean it's a
big transaction on a big temp table or
a big sort, or something of that type.

Do you have any particularly
long-running queries that might be
associated with it? Perhaps a heavy
duty report, or something similar to
that?

Hopefully this gets you a little farther.

尬尬 2024-08-17 23:04:06

除了我之外,这对于每个阅读的人来说可能都是显而易见的,但我只是陷入了这个问题并想提及我是如何解决它的。尽管我在 BlueMonkMN 指示的位置修改了文件,但我仍然收到 10 分钟的默认事务超时。由于我运行的是 Windows 7 64 位,.NET 的 machine.config 文件位置位于以下位置:

C:\Windows\Microsoft.NET\Framework64\v2.0.50727\CONFIG

请注意“Framework64”文件夹与上面不同。

This is probably obvious to everyone reading except me, but I just got stuck on this issue and wanted to mention how I fixed it. Even though I modified the file at the location indicated by BlueMonkMN, I was still receiving the default transaction timeout of 10 minutes. Since I'm running Windows 7 64 Bit, the machine.config file location for .NET is at the following location:

C:\Windows\Microsoft.NET\Framework64\v2.0.50727\CONFIG

Note the "Framework64" folder is different than above.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文