SQL、选择和更新
我试图在包含 100000 行的数据库中选择 100 行,然后更新这些行。
问题是我不想为此目的两次访问数据库,因为更新仅将这些行标记为“已读”。
有什么方法可以使用简单的 jdbc 库在 java 中做到这一点吗? (希望不使用存储过程)
更新:好的,这里有一些说明。
有几个相同应用程序的实例在不同的服务器上运行,它们都需要选择根据creation_date列排序的100个“UNREAD”行,读取其中的blob数据,将其写入文件并将该文件ftp到某个服务器。 (我知道史前,但要求就是要求)
读取和更新部分是为了确保每个实例获取不同的数据集。 (按顺序,像奇数和偶数这样的技巧不起作用:/)
我们选择要更新的数据。 数据通过线路传输(我们等待并等待),然后我们将它们更新为“READ”。 然后释放锁进行读取。 整个事情花费的时间太长了。 通过同时读取和更新,我希望减少锁定时间(从我们使用 select 进行更新到实际更新的时间),以便使用多个实例会增加每秒读取的行数。
还有想法吗?
I am trying to select 100s of rows at a DB that contains 100000s of row and update those rows afters.
the problem is I don't want to go to DB twice for this purpose since update only marks those rows as "read".
is there any way I can do this in java using simple jdbc libraries? (hopefully without using stored procedures)
update: ok here is some clarification.
there are a few instance of same application running on different servers, they all need to select 100s of "UNREAD" rows sorted according to creation_date column, read blob data within it, write it to file and ftp that file to some server. (I know prehistoric but requirements are requirements)
The read and update part is for to ensure each instance getting diffent set of data. (in order, tricks like odds and evens wont work :/)
We select data for update. the data transfers through the wire (we wait and wait) and then we update them as "READ". then release lock for reading. this entire thing takes too long. By reading and updating at the same time, I would like to reduce lock time (from time we use select for update to actual update) so that using multiple instances would increase read rows per second.
Still have ideas?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
在我看来,解释这个问题的方法可能不止一种。
更新它们的唯一目的是
不读它们。
给某人,并将他们标记为
一次阅读一篇或整组阅读。
它们在您选择的时间已读
他们。
我们首先采用选项 1,因为这似乎是最简单的。 您不需要选择行来更新它们,只需使用 WHERE 子句发出更新:
查看选项 2,您希望在用户读取它们时将它们标记为已读(或者下游系统已收到它,或者其他什么)。 为此,您可能需要进行另一次更新。 如果您查询主键,除了第一次选择中需要的其他列之外,您可能会更容易更新,因为数据库不必进行表或索引扫描来查找行。
在 JDBC (Java) 中,有一个执行批量更新的工具,您可以在其中一次执行一组更新。 当我需要执行大量完全相同形式的更新时,这种方法效果很好。
选项 3,您希望一次性选择并更新所有内容。 就我个人而言,我认为这没有多大用处,但这并不意味着其他人没有。 我想某种存储过程会减少往返次数。 我不确定您在这里使用的数据库是什么,并且无法真正提供具体信息。
It seems to me there might be more than one way to interpret the question here.
sole purpose of updating them and
not reading them.
to somebody, and marking them as
read either one at a time or all as a group.
them as read at the time you select
them.
Let's take Option 1 first, as that seems to be the easiest. You don't need to select the rows in order to update them, just issue an update with a WHERE clause:
Looking at option 2, you want to mark them as read when a user has read them (or a down stream system has received it, or whatever). For this, you'll probably have to do another update. If you query for the primary key, in addition to the other columns you'll need in the first select, you will probably have an easier time of updating, as the DB won't have to do table or index scans to find the rows.
In JDBC (Java) there is a facility to do a batch update, where you execute a set of updates all at once. That's worked out well when I need to perform a lot of updates that are of the exact same form.
Option 3, where you want to select and update all in one shot. I don't find much use for this, personally, but that doesn't mean others don't. I suppose some kind of stored procedure would reduce the round trips. I'm not sure what db you are working with here and can't really offer specifics.
去数据库也没那么糟糕。 如果您没有“通过网络”返回任何内容,那么更新不会对您造成太大损害,而且只有几十万行。 你担心什么?
Going to the DB isn't so bad. If you aren't returning anything 'across the wire' then an update shouldn't do you too much damage and its only a few hundred thousand rows. What is your worry?
如果您在 JDBC 中执行 SELECT 并迭代 ResultSet 以更新每一行,那么您就做错了。 这是一个永远不会表现良好的 (n+1) 查询问题。
只需使用 WHERE 子句执行 UPDATE 即可确定需要更新哪些行。 这样就形成了单网络往返。
不要太以代码为中心。 让数据库完成其设计的任务。
If you're doing a SELECT in JDBC and iterating over the ResultSet to UPDATE each row, you're doing it wrong. That's an (n+1) query problem that will never perform well.
Just do an UPDATE with a WHERE clause that determines which of those rows needs to be updated. It's a single network round trip that way.
Don't be too code-centric. Let the database do the job it was designed for.
不能只使用同一个连接而不关闭它吗?
Can't you just use the same connection without closing it?