ADO.NET DataTable/DataRow Thread Safety

Posted 2024-09-02 04:12:56

Introduction

A user reported to me this morning that he was having an issue with inconsistent results (namely, column values sometimes coming out null when they should not be) of some parallel execution code that we provide as part of an internal framework. This code has worked fine in the past and has not been tampered with lately, but it got me to thinking about the following snippet:

Code Sample

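DataRow newRow;

// First lock: create the row while no other thread touches the table.
lock (ResultTable)
{
    newRow = ResultTable.NewRow();
}

// Unlocked: fill in the column values of the still-detached row.
newRow["Key"] = currentKey;
foreach (KeyValuePair<string, object> output in outputs)
{
    // KeyValuePair exposes Key and Value; it has no Name member.
    newRow[output.Key] = output.Value ?? DBNull.Value;
}

// Second lock: attach the finished row to the table.
lock (ResultTable)
{
    ResultTable.Rows.Add(newRow);
}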

(No guarantees that it compiles; it was hand-edited to mask proprietary information.)

Explanation

We have this cascading type of locking code in other places in our system, and it works fine, but this is the first instance of cascading locking code I have come across that interacts with ADO.NET. As we all know, members of framework objects are usually not thread safe (which is the case in this situation), but the cascading locking should ensure that we are not reading and writing to ResultTable.Rows concurrently. We are safe, right?

Hypothesis

Well, the cascading lock code does not ensure that we are not reading from or writing to ResultTable.Rows at the same time that we are assigning values to columns in the new row. What if ADO.NET uses some kind of buffer for assigning column values that is not thread safe, even when different object types are involved (DataTable vs. DataRow)?

Has anyone run into anything like this before? I thought I would ask here at StackOverflow before beating my head against this for hours on end :)

Conclusion

Well, the consensus appears to be that changing the cascading lock to a full lock has resolved the issue. That is not the result that I expected, but the full lock version has not produced the issue after many, many, many tests.

The lesson: be wary of cascading locks used on APIs that you do not control. Who knows what may be going on under the covers!
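
For reference, a minimal sketch of what the "full lock" version might look like. The class and method names (ResultTableWriter, AddResultRow) are made up for illustration and are not from our framework; the sketch assumes the table already has a "Key" column plus one column per output name, and that every writer goes through this method.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical helper, not part of the original framework.
public static class ResultTableWriter
{
    // Creates, fills, and adds one row while holding a single lock,
    // so no other writer can touch the table in between.
    public static void AddResultRow(
        DataTable resultTable,
        string currentKey,
        IEnumerable<KeyValuePair<string, object>> outputs)
    {
        lock (resultTable)
        {
            DataRow newRow = resultTable.NewRow();
            newRow["Key"] = currentKey;
            foreach (KeyValuePair<string, object> output in outputs)
            {
                // Store DBNull for missing values, as in the original snippet.
                newRow[output.Key] = output.Value ?? DBNull.Value;
            }
            resultTable.Rows.Add(newRow);
        }
    }
}
```

The lock is held longer than in the cascading version, so this trades some throughput for the guarantee that column assignment never overlaps with another thread's write.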

娇妻 2024-09-09 04:12:56

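Allen,

I couldn't find any specific problems with your approach, not that my testing was exhaustive. Here are some ideas that we stick with (all of our applications are thread centric):

Whenever possible:

[1] Make all data access completely atomic. Data sharing in multi-threaded applications is an excellent place for all manner of unforeseen thread interactions.

[2] Avoid locking on a type. If a type is not known to be thread safe, write a wrapper.

[3] Include structures that allow for fast identification of the threads that are accessing a shared resource. If system performance allows, log this information at a level above debug and below the usual operational log level.

[4] Any code (including System.* and the like) that is not explicitly documented internally as tested for thread safety is not thread safe. Hearsay and the verbal word of others do not count. Test it and write it down.

Hope this is of some value.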
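
To illustrate the wrapper idea from the list above, here is a rough sketch, not something lifted from our code base. SynchronizedTable, AddRow, and Snapshot are hypothetical names. The assumption is that the DataTable never leaves the wrapper, so every access is forced through one private lock object.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical wrapper: callers never see the underlying DataTable.
public sealed class SynchronizedTable
{
    private readonly object _gate = new object();
    private readonly DataTable _table;

    public SynchronizedTable(DataTable schema)
    {
        // Clone copies the structure (columns, constraints) but no rows.
        _table = schema.Clone();
    }

    // Builds and adds a row as one atomic operation.
    public void AddRow(IDictionary<string, object> values)
    {
        lock (_gate)
        {
            DataRow row = _table.NewRow();
            foreach (KeyValuePair<string, object> pair in values)
            {
                row[pair.Key] = pair.Value ?? DBNull.Value;
            }
            _table.Rows.Add(row);
        }
    }

    // Returns an independent copy so readers never share the live table.
    public DataTable Snapshot()
    {
        lock (_gate)
        {
            return _table.Copy();
        }
    }
}
```

Handing out copies costs memory, but it means no caller can bypass the lock by holding a reference to the live table.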

写下不归期 2024-09-09 04:12:56

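I once read an article saying that a common row was found to be used internally for insert operations on a DataTable. Multiple threads creating new records would all overwrite data on that common row and corrupt each other's values, causing the problem. The fix was to lock the table while adding a row, so that only one thread at a time could add a new row.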

夜声 2024-09-09 04:12:56

Your code looks fine to me too, but I suggest that you use ResultTable.Rows.SyncRoot for locking before adding the newly created row, so that the rest of the ResultTable object is free to be accessed by other processes.

lock (ResultTable.Rows.SyncRoot)

踏雪无痕 2024-09-09 04:12:56

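This bit of .NET may have changed in the last seven (!) years but, to answer the question, as of .NET 4.7.1 the hypothesis about column value buffering is incorrect. Looking at the source in corefx/DataRow.cs, the issue is a race condition around the _tempRecord field, which stores the row's location in the data table. This field may be modified by any write operation that triggers a call to BeginEditInternal(), which includes value updates. When two writes collide, one may follow the _tempRecord value set by the other and end up updating a different row than intended. This is consistent with the Microsoft documentation, which states that any write must be synchronized (emphasis mine). Tony's earlier answer describes part of this behavior.

As an example, I recently broke code that follows the locking approach shown in the code sample above by improving its performance. The code had been stable and ran for 1.5 years without any issue but, above 2000 new rows per second, at least one write in every few tens of thousands consistently ended up on the wrong row.

One possible fix is to lock every write, but group the writes so that the number of locks, and therefore the performance impact, stays small. Another is to give each thread its own table to update and merge the results later. In my case, the performance-critical section had been a candidate for moving off DataTable for some time, so it was recoded with more scalable data structures.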
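
For what it's worth, the table-per-thread option could be sketched roughly as below. This is not the code I ended up with; PerThreadTables, BuildResults, and the compute delegate are placeholders, and the "Key"/"Value" columns are assumed. Each worker fills a private clone of the schema, and the only synchronized steps are cloning and merging.

```csharp
using System;
using System.Data;
using System.Threading.Tasks;

// Hypothetical illustration of "one table per thread, merge later".
public static class PerThreadTables
{
    public static DataTable BuildResults(int count, Func<int, object> compute)
    {
        var result = new DataTable("Results");
        result.Columns.Add("Key", typeof(string));
        result.Columns.Add("Value", typeof(object));
        var gate = new object();

        Parallel.For<DataTable>(0, count,
            // Each worker starts with its own empty table with the same schema.
            () => { lock (gate) { return result.Clone(); } },
            (i, loopState, local) =>
            {
                // No lock needed: 'local' is only touched by this worker.
                DataRow row = local.NewRow();
                row["Key"] = "key" + i;
                row["Value"] = compute(i) ?? DBNull.Value;
                local.Rows.Add(row);
                return local;
            },
            // One synchronized merge per worker instead of one lock per row.
            local => { lock (gate) { result.Merge(local); } });

        return result;
    }
}
```

Without a primary key on the table, Merge appends the incoming rows, so the row order in the final table depends on which worker finishes first.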