测试数据库中重复键的最佳方法
这更多的是一个正确性问题。 假设我的数据库中有一个带有主键列的表。 在我的 DAO 代码中,我有一个名为 insertRow(string key) 的函数,如果表中不存在该键,该函数将返回 true,并使用该键插入新行。 否则,如果具有该键的行已存在,则返回 false。 让 insertRow 首先检查键是否存在或直接执行插入并捕获重复键错误是更好/更差? 或者,保存单个 select 语句是否是一种微不足道的优化,甚至无需担心?
所以在 sudo 代码中:
boolean insertRow(String key){
//potentially a select + insert
if(select count(*) from mytable where key = "somekey" == 0){
insert into mytable values("somekey")
return true;
}
return false;
}
或者
boolean insertRow(String key){
try{
//always just 1 insert
insert into mytable values("somekey")
return true;
} catch (DuplicateKeyException ex){}
return false;
}
This is more of a correctness question. Say I have a table with a primary key column in my database. In my DAO code I have a function called insertRow(string key) that will return true if the key doesn't exist in the table and insert a new row with the key. Otherwise, if a row already exists with that key it returns false. Is it better/worse to have insertRow first check for the existence of the key or just go ahead and do the insert and catch the duplicate key error? Or is saving on a single select statement too trivial an optimization to even bother worrying about?
So in sudo code:
boolean insertRow(String key){
//potentially a select + insert
if(select count(*) from mytable where key = "somekey" == 0){
insert into mytable values("somekey")
return true;
}
return false;
}
or
boolean insertRow(String key){
try{
//always just 1 insert
insert into mytable values("somekey")
return true;
} catch (DuplicateKeyException ex){}
return false;
}
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(7)
插入行,捕获重复键错误。 我个人的选择是,
我认为这可能会表现得更好,具体取决于抛出异常的成本与两次访问数据库的成本。
只有测试这两种情况你才能确定
Insert the row, catch the duplicate key error. My personal choice
I reckon this might perform better, depending on the cost of throwing the exception against the cost of hitting the db twice.
Only by testing both scenarios wilil you know for sure
尝试插入,然后捕获错误。
否则,您可能仍然在两个活动 SPID 之间存在并发问题(假设系统上同时有两个 Web 用户),在这种情况下,您无论如何都必须捕获错误
:可以通过使用显式事务或设置事务隔离级别来缓解这一问题,但使用第二种技术更容易,除非您确定始终只有一个应用程序线程针对数据库运行。
Try the insert, then catch the error.
Otherwise, you could still have a concurrency issue between two active SPIDs (lets say two web users on the system at the same time), in which case, you'd have to catch the error anyway:
You can mitigate this by using explicit transactions or setting the transaction-isolation level, but its just easier to use the second technique, unless you are sure only one application thread is running against the database at all times.
在我看来,这是使用异常的一个很好的例子(因为重复是异常的),除非你指望在那里,大多数时候,已经是一行(即,你正在做“插入,但更新” 。
如果代码的目的是更新,那么您应该使用 select 或
INSERT ... ON DUPLICATE KEY UPDATE
子句(如果您的数据库引擎支持) 或者,创建一个存储过程来为您处理此逻辑。In my opinion, this is an excellent case for using exceptions (since the duplicate is exceptional), unless you're counting on there to, most of the time, be a row already (i.e., you're doing "insert, but update if exists" logic.)
If the purpose of the code is to update, then you should either use the select or an
INSERT ... ON DUPLICATE KEY UPDATE
clause (if supported by your database engine.) Alternatively, make a stored procedure that handles this logic for you.第二个是因为第一个选项击中了数据库的两倍,而第二个选项只击中了一次。
Second one because first option hits twice the db while second one just once.
简而言之,您需要自己测试一下。 我的直觉是,做一个小的选择来检查是否存在会表现得更好,但你需要自己验证一下,看看哪个表现更好。
一般来说,我不喜欢将错误检查完全留给异常引擎,无论我正在做什么。 换句话说,如果我可以检查我正在做的事情是否有效,而不仅仅是抛出异常,那么我通常会这样做。
不过,我建议使用
EXISTS
查询而不是count(*)
话虽这么说(从抽象的、与引擎无关的角度来看),我很确定MySQL 有一些关键字,仅当主键不存在时才可用于将行插入表中。 假设您可以使用 MySQL 特定的关键字,这可能是您最好的选择。
另一种选择是将逻辑完全放在 SQL 语句中。
The short answer is that you need to test it for yourself. My gut feeling is that doing a small select to check for the existence will perform better, but you need to verify that for yourself at volume and see whichever performs better.
In general, I don't like to leave my error checking entirely to the exception engine of whatever it is I'm doing. In other words, if I can check to see if what I'm doing is valid rather than just having an exception thrown, that's generally what I do.
I would suggest, however, using an
EXISTS
query rather thancount(*)
All that being said (from an abstract, engine-neutral perspective), I'm pretty sure that MySQL has some keywords that can be used to insert a row into a table only if the primary key doesn't exist. This may be your best bet, assuming you're OK with using MySQL-specific keywords.
Another option would be to place the logic entirely in the SQL statement.
mysql 中的另外两个选项是使用
并
包括
重复键更新 field=field
请参阅:http://dev.mysql.com/doc/refman/5.0/en/insert.html
编辑:
您可以测试affected_rows 来确定插入是否有效果。
another two options in mysql are to use
and
including
on duplicate key update field=field
See: http://dev.mysql.com/doc/refman/5.0/en/insert.html
Edit:
You can test affected_rows for whether or not the insert had an effect or not.
现在我已经在网上找到了 Martin Fowler 的书,一个不错的方法是使用 密钥表 - 请参阅第 222 页了解更多信息。
Now that I've found Martin Fowler's book online, a decent way to do it is with a key table- see pg 222 for more info.