我知道快照隔离可以解决这个问题,但我想知道在这种特定情况下 NOLOCK 是否安全,以便我可以避免开销。
我有一个看起来像这样的表:
drop table Data
create table Data
(
Id BIGINT NOT NULL,
Date BIGINT NOT NULL,
Value BIGINT,
constraint Cx primary key (Date, Id)
)
create nonclustered index Ix on Data (Id, Date)
该表永远没有更新。可以发生删除,但它们永远不应该与 SELECT 竞争,因为它们会影响表的另一端(较旧的一端)。插入是有规律的,并且页面分割到(Id,Date)索引是非常常见的。
我在标准 INSERT 和 SELECT 之间出现死锁情况,如下所示:
select top 1 Date, Value from Data where Id = @p0 order by Date desc
因为 INSERT 获取 Cx (Date, Id; Value) 上的锁,然后获取 Ix (Id, Date) 上的锁,但 SELECT 获取 Ix ( Id、日期),然后是 Cx(日期、Id;值)。这是因为 SELECT 首先在 Ix 上查找,然后加入到 Cx 上的查找。
交换聚集索引和非聚集索引会打破这个循环,但这不是一个可接受的解决方案,因为它会引入与其他(更复杂的)SELECT 的循环。
如果我在 SELECT 中添加 NOLOCK,在这种情况下会出错吗?它是否可以返回:
- 不止一行,即使我要求 TOP 1?
- 没有行,即使存在并已提交?
- 最糟糕的是,一行不满足 WHERE 子句?
我已经在网上阅读了很多有关此内容的文章,但我见过的唯一过多或过少异常的复制品(一个,两个)涉及扫描。这仅涉及寻找。 Jeff Atwood 有一篇关于使用 NOLOCK 的帖子,引起了热烈的讨论。我对 Rick Townsend 的评论特别感兴趣:
其次,如果读取脏数据,
您面临的风险是阅读
完全错误的行。例如,如果
您的选择读取索引来查找
你的行,那么更新会改变
行的位置(例如:由于
页面拆分或更新
聚集索引),当您选择
去读取实际的数据行,它是
要么不再存在,要么不同
一起排!
仅插入而不更新是否可以实现?如果是这样,那么我想即使是我在仅插入表上的查找也可能是危险的。
更新:
我试图弄清楚 快照隔离如何工作。它似乎是基于行的,事务读取表(没有共享锁!),找到它们感兴趣的行,然后查看是否需要从 tempdb 中的版本存储中获取该行的旧版本。
但就我而言,没有一行会有多个版本,因此版本存储似乎毫无意义。如果发现该行没有共享锁,那么与仅使用 NOLOCK 有什么不同?
I know that snapshot isolation would fix this problem, but I'm wondering if NOLOCK is safe in this specific case so that I can avoid the overhead.
I have a table that looks something like this:
drop table Data
create table Data
(
Id BIGINT NOT NULL,
Date BIGINT NOT NULL,
Value BIGINT,
constraint Cx primary key (Date, Id)
)
create nonclustered index Ix on Data (Id, Date)
There are no updates to the table, ever. Deletes can occur but they should never contend with the SELECT because they affect the other, older end of the table. Inserts are regular and page splits to the (Id, Date) index are extremely common.
I have a deadlock situation between a standard INSERT and a SELECT that looks like this:
select top 1 Date, Value from Data where Id = @p0 order by Date desc
because the INSERT acquires a lock on Cx (Date, Id; Value) and then Ix (Id, Date), but the SELECT acquires a lock on Ix (Id, Date) and then Cx (Date, Id; Value). This is because the SELECT first seeks on Ix and then joins to a seek on Cx.
Swapping the clustered and non-clustered index would break this cycle, but it is not an acceptable solution because it would introduce cycles with other (more complex) SELECTs.
If I add NOLOCK to the SELECT, can it go wrong in this case? Can it return:
- More than one row, even though I asked for TOP 1?
- No rows, even though one exists and has been committed?
- Worst of all, a row that doesn't satisfy the WHERE clause?
I've done a lot of reading about this online, but the only reproductions of over- or under-count anomalies I've seen (one, two) involve a scan. This involves only seeks. Jeff Atwood has a post about using NOLOCK that generated a good discussion. I was particularly interested in a comment by Rick Townsend:
Secondly, if you read dirty data, the
risk you run is of reading the
entirely wrong row. For example, if
your select reads an index to find
your row, then the update changes the
location of the rows (e.g.: due to a
page split or an update to the
clustered index), when your select
goes to read the actual data row, it's
either no longer there, or a different
row altogether!
Is this possible with inserts only, and no updates? If so, then I guess even my seeks on an insert-only table could be dangerous.
Update:
I'm trying to figure out how snapshot isolation works. It seems to be row-based, where transactions read the table (with no shared lock!), find the row they are interested in, and then see if they need to get an old version of the row from the version store in tempdb.
But in my case, no row will have more than one version, so the version store seems rather pointless. And if the row was found with no shared lock, how is it different to just using NOLOCK?
发布评论
评论(2)
使用 NOLOCK 或 READ UNCOMMITTED 意味着您放弃任何一致性保证。时期。
如果需要一致性,请不要进行脏读。您的整个解释依赖于未记录的行为,这些行为可能会在未来版本中发生变化,更糟糕的是,依赖于您期望查询的特定访问计划。查询优化器可以自由选择它认为合适的任何计划,并且您所做的任何假设都可能在生产中被破坏。所以回到第一个方面:如果你没有准备好面对后果,就不要做脏读。
不确定这是否适用,不清楚您尝试通过查询/表实现什么目标,但也许这篇文章可能会有所帮助: 使用表作为队列。
已更新
如果 NOLOCK 读取会读取不一致的状态(例如,读取过时的非聚集索引键并将其追踪到聚集索引中的缺失行),则快照读取会在版本存储中找到“缺失”行。对于稳定的数据,快照读取与无锁读取相同。每当数据发生更改(未提交的更新)时,版本存储的魔力就会发挥作用,因为快照读取会进入版本存储并找到“旧”值(稳定且一致),而 nolock 读取会在其中陷入混乱并追逐指针拉拉地。
Using NOLOCK or READ UNCOMMITTED means you give up any guarantee of consistency. Period.
If you need consistency, don't do dirty reads. Your whole explanation relies on undocumented behavior subject to change in future releases and, far worse, on specific access plans you expect for your query. The Query Optimizer is free to choose any plan it sees fit and any assumption you make it may be broken in production. So is back to square one: don't do dirty reads if you're not prepared to face the consequences.
Not sure if this applies, is not clear what you try to achieve with your query/table, but perhaps this article may help: Using tables as Queues.
Updated
Where a NOLOCK read would read an inconsistent state (eg. read a stale non-clustered index key and chase it to a missing row in the clustered index) a snapshot read would find the 'missing' row in the version store. For stable data, a snapshot read is identical with a nolock read. The magic of the version store comes into play whenever data is changed (uncommitted updates), because the snapshot read goes into the version store and finds the 'old' values (stable and consistent) where the nolock read would go haywire and chase pointers into lala land.
在这种情况下,使用 NOLOCK 应该是安全的。还有一个想法:将 Value 添加为 Ix 索引上的包含列应该可以消除 Cx 上的查找。
You should be safe with the NOLOCK in this case. One additional thought: Adding Value as an included column on the Ix index should eliminate the seek on Cx.