SQLite:在没有游标的情况下批量更新字段

发布于 2024-11-05 20:14:21 字数 1729 浏览 0 评论 0原文

我有下表:

CREATE TABLE Records (
RecordIndex INTEGER NOT NULL,
...
Some other fields
...
Status1 INTEGER NOT NULL,
Status2 INTEGER NOT NULL,
UpdateDate DATETIME NOT NULL,
CONSTRAINT PK_Records PRIMARY KEY (RecordIndex ASC))

和一个索引:

CREATE INDEX IDX_Records_Status ON ClientRecords
  (Status1 ASC, Status2 ASC, RecordIndex ASC)

我需要逐条获取某个状态的记录,所以我使用了这个语句:

SELECT *
FROM RECORDS
WHERE RecordIndex > @PreviousIndex
AND Status1 = @Status1
AND Status2 = @Status2
LIMIT 1

但是现在我需要获取按另一个字段排序的记录,但是这个字段不是唯一的每一条记录,所以我不能以同样的方式使用它。因此,我决定向表中添加一个新的 SortIndex 字段。

由于 SQLite 中没有游标,我正在执行以下操作来初始化 SortIndex 的值。
首先,我创建一个临时表:

CREATE TEMP TABLE Sort (
SortIdx INTEGER PRIMARY KEY AUTOINCREMENT,
RecordIdx INTEGER )

然后以正确的排序顺序填充该表:

INSERT INTO Sort
  SELECT NULL, RecordIndex
  FROM Records
  ORDER BY SomeField ASC, RecordIndex ASC

然后在临时表上创建索引:

CREATE INDEX IDX_Sort_RecordIdx ON Sort (RecordIdx ASC)

然后更新 Records 表中的 SortIndex 字段:

UPDATE Records
SET SortIndex =
  (SELECT SortIdx
   FROM Sort
   WHERE RecordIdx = RecordIndex)

然后删除临时表:

DROP TABLE Sort

最后创建我的记录表上的新索引

CREATE INDEX IDX_Records_Sort ON Records
  (Status1 ASC, Status2 ASC, SortIndex ASC)

这允许我执行以下选择

SELECT *
FROM Records
WHERE SortIndex > @PreviousSortIndex
AND Status1 = @Status1
AND Status2 = @Status2
LIMIT 1

问题是,由于该表包含大约 500K 记录,整个过程大约需要 2 分钟。使用游标初始化 SortIndex 可能会快得多,但 SQLite 缺乏此功能:(

有没有更快的方法来做到这一点?

提前致谢!

I have the following table:

CREATE TABLE Records (
RecordIndex INTEGER NOT NULL,
...
Some other fields
...
Status1 INTEGER NOT NULL,
Status2 INTEGER NOT NULL,
UpdateDate DATETIME NOT NULL,
CONSTRAINT PK_Records PRIMARY KEY (RecordIndex ASC))

And an Index:

CREATE INDEX IDX_Records_Status ON ClientRecords
  (Status1 ASC, Status2 ASC, RecordIndex ASC)

I need to fetch the records of a certain status one by one, so i used this statement:

SELECT *
FROM RECORDS
WHERE RecordIndex > @PreviousIndex
AND Status1 = @Status1
AND Status2 = @Status2
LIMIT 1

But now I need to fetch the records sorted by another field, but this field is not unique for each record, so I can not use it in the same way. So I decided to add a new SortIndex field to my table.

As there are no cursors in SQLite, I am doing the following to initialize the values for SortIndex.
First I create a temporary table:

CREATE TEMP TABLE Sort (
SortIdx INTEGER PRIMARY KEY AUTOINCREMENT,
RecordIdx INTEGER )

Then I fill this table in the correct sort order:

INSERT INTO Sort
  SELECT NULL, RecordIndex
  FROM Records
  ORDER BY SomeField ASC, RecordIndex ASC

Then I create an index on the temporary table:

CREATE INDEX IDX_Sort_RecordIdx ON Sort (RecordIdx ASC)

Then I update the SortIndex field in my Records table:

UPDATE Records
SET SortIndex =
  (SELECT SortIdx
   FROM Sort
   WHERE RecordIdx = RecordIndex)

Then I drop the temporary table:

DROP TABLE Sort

And finaly I create a new index on my Records table

CREATE INDEX IDX_Records_Sort ON Records
  (Status1 ASC, Status2 ASC, SortIndex ASC)

This allows me to do the following select

SELECT *
FROM Records
WHERE SortIndex > @PreviousSortIndex
AND Status1 = @Status1
AND Status2 = @Status2
LIMIT 1

The problem is, as the table contains around 500K records the whole thing takes around 2 minutes. It would probably have been a lot faster to initialize SortIndex with a cursor, but SQLite lacks this feature :(

Is there a faster way to do this ?

Thanks in advance !

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

罪#恶を代价 2024-11-12 20:14:21

您应该考虑 INSERT OR REPLACE,而不是对相关子查询执行 UPDATE SQLite 的 功能,当主键重复时,它将执行整行的 UPDATE

UPDATE Records
   SET SortIndex =
       (SELECT SortIdx
          FROM Sort
         WHERE RecordIdx = RecordIndex) 

变成

INSERT OR REPLACE INTO Records (RecordIndex, SortIndex, ...)
SELECT RecordIndex, SortIdx, ... FROM another_temporary_table_containing_all_columns.

当然,您可以代替使用包含所有列的临时表使用连接旧表和新表的 SELECT:在 SQLite shell 中尝试此

CREATE TABLE original (id INTEGER PRIMARY KEY, content TEXT);

BEGIN TRANSACTION;
INSERT INTO original(id, content) VALUES(1, 'foo');
INSERT INTO original(id, content) VALUES(2, 'bar');
INSERT INTO original(id, content) VALUES(3, 'baz');
COMMIT TRANSACTION;

CREATE TABLE id_remap(old_id INTEGER, new_id INTEGER);

BEGIN TRANSACTION;
INSERT INTO id_remap(old_id, new_id) VALUES(2,3);
INSERT INTO id_remap(old_id, new_id) VALUES(3,2);
COMMIT TRANSACTION;

INSERT OR REPLACE INTO original (id, content)
SELECT b.new_id, a.content
  FROM original a
 INNER JOIN id_remap b
    ON b.old_id = a.id;

SELECT * FROM original;

操作 结果:

1|foo
2|baz
3|bar

如果您需要进行批量更新但不想要相关子查询,另一种选择是在视图中执行连接,并创建一个在该视图上触发INSTEAD OF UPDATE。问题是,在此过程中不能有失败的约束。我想对每一行都检查约束,所以这可能会非常慢。

在 SQLite shell 中:

CREATE TABLE original (id INTEGER PRIMARY KEY, content TEXT);

BEGIN TRANSACTION;
INSERT INTO original(id, content) VALUES(1, 'foo');
INSERT INTO original(id, content) VALUES(2, 'bar');
INSERT INTO original(id, content) VALUES(3, 'baz');
COMMIT TRANSACTION;

CREATE TABLE id_remap(old_id INTEGER, new_id INTEGER);

BEGIN TRANSACTION;
INSERT INTO id_remap(old_id, new_id) VALUES(3,6);
COMMIT TRANSACTION;

CREATE TEMPORARY VIEW tmp_id_mapping
    AS
SELECT a.content, b.old_id, b.new_id
  FROM original a
 INNER JOIN id_remap b
    ON b.old_id = a.id;

 CREATE TEMPORARY TRIGGER IF NOT EXISTS tmp_trig_id_remap
INSTEAD OF UPDATE OF content ON tmp_id_mapping
    FOR EACH ROW
  BEGIN
    UPDATE original
       SET id = new.new_id
     WHERE id = new.old_id;
   END;

UPDATE tmp_id_mapping
   SET content = 'hello';

SELECT * FROM original;

结果:

1|foo
2|bar
6|baz

Instead of doing an UPDATE with a correlated subquery, you should consider the INSERT OR REPLACE feature of SQLite, which will perform an UPDATE of a whole row when the primary key is a duplicate:

UPDATE Records
   SET SortIndex =
       (SELECT SortIdx
          FROM Sort
         WHERE RecordIdx = RecordIndex) 

becomes

INSERT OR REPLACE INTO Records (RecordIndex, SortIndex, ...)
SELECT RecordIndex, SortIdx, ... FROM another_temporary_table_containing_all_columns.

Instead of using a temporary table containing all columns you can of course use a SELECT which joins the old table and the new one: try this inside the SQLite shell

CREATE TABLE original (id INTEGER PRIMARY KEY, content TEXT);

BEGIN TRANSACTION;
INSERT INTO original(id, content) VALUES(1, 'foo');
INSERT INTO original(id, content) VALUES(2, 'bar');
INSERT INTO original(id, content) VALUES(3, 'baz');
COMMIT TRANSACTION;

CREATE TABLE id_remap(old_id INTEGER, new_id INTEGER);

BEGIN TRANSACTION;
INSERT INTO id_remap(old_id, new_id) VALUES(2,3);
INSERT INTO id_remap(old_id, new_id) VALUES(3,2);
COMMIT TRANSACTION;

INSERT OR REPLACE INTO original (id, content)
SELECT b.new_id, a.content
  FROM original a
 INNER JOIN id_remap b
    ON b.old_id = a.id;

SELECT * FROM original;

Result:

1|foo
2|baz
3|bar

Another option if you need to do mass updates but do not want a correlated subquery is to perform the join in a view, and to create a trigger INSTEAD OF UPDATE on that view. A problem is that you cannot have constraints that fail during the process. I suppose that the constraints are checked for each row so that might be very slow.

In the SQLite shell:

CREATE TABLE original (id INTEGER PRIMARY KEY, content TEXT);

BEGIN TRANSACTION;
INSERT INTO original(id, content) VALUES(1, 'foo');
INSERT INTO original(id, content) VALUES(2, 'bar');
INSERT INTO original(id, content) VALUES(3, 'baz');
COMMIT TRANSACTION;

CREATE TABLE id_remap(old_id INTEGER, new_id INTEGER);

BEGIN TRANSACTION;
INSERT INTO id_remap(old_id, new_id) VALUES(3,6);
COMMIT TRANSACTION;

CREATE TEMPORARY VIEW tmp_id_mapping
    AS
SELECT a.content, b.old_id, b.new_id
  FROM original a
 INNER JOIN id_remap b
    ON b.old_id = a.id;

 CREATE TEMPORARY TRIGGER IF NOT EXISTS tmp_trig_id_remap
INSTEAD OF UPDATE OF content ON tmp_id_mapping
    FOR EACH ROW
  BEGIN
    UPDATE original
       SET id = new.new_id
     WHERE id = new.old_id;
   END;

UPDATE tmp_id_mapping
   SET content = 'hello';

SELECT * FROM original;

Result:

1|foo
2|bar
6|baz
凉栀 2024-11-12 20:14:21

主要答案

我认为,不可能在 SQLlite 中快速插入带有索引的约 500k 记录(以及将来的许多索引)。

我希望有人能在这里发明新轮子。


马克,我认为你应该避免这种类型的动态添加索引,而只添加其他经典索引,无论你有多少需要它们。

另外,游标在任何 DMBS 中并不总是好主意——只有当我们需要复杂的逻辑时,但在这里简单的排序,我认为它已经过头了。

只需添加经典索引——即使它们不是唯一的。

或者在这里发布更多关于为什么你填写的详细信息,你应该选择一些动态的方式。

正如我看到的,sqlite也支持偏移量。


用于测试的 SQL

-- init
CREATE TABLE IF NOT EXISTS `records` (
  `RecordID` int(10) default NULL,
  `Status` int(10) default NULL,
  `SomeField` char(50) default NULL,
  `RecordIndex` int(11) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

truncate `records`;
INSERT INTO `records` (`RecordID`, `Status`, `SomeField`, `RecordIndex`) VALUES
    (1, 1, 'a', 35),
    (2, 1, 'b', 20),
    (3, 1, 'c', 42);

-- 1st select
SELECT * FROM records WHERE Status = 1 ORDER BY SomeField ASC, RecordIndex ASC LIMIT 1 OFFSET 0;

-- update
update records set `Status` = 2 where RecordID = 1;

-- select next
SELECT * FROM records WHERE Status = 1 ORDER BY SomeField ASC, RecordIndex ASC LIMIT 1 OFFSET 1;

Main answer

I think, that this impossible to have quick insert in SQLlite ~500k records with indexes (and many indexes in future).

I hope someone will invent new wheel here.


Mark, I think you should avoid this type of dynamically added indexes and just add other classic indexes, no matter how much do you need them.

Also cursors not always good idea in any DMBS -- only if we need complex logic, but here in simple order by, I think it`s over-ing.

Just add classic indexes -- even if they non unique.

Or post here more details about why exactly you fill, that you should choose some dynamic way.

Also sqlite, as I see, supports offset.


SQL for tests

-- init
CREATE TABLE IF NOT EXISTS `records` (
  `RecordID` int(10) default NULL,
  `Status` int(10) default NULL,
  `SomeField` char(50) default NULL,
  `RecordIndex` int(11) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

truncate `records`;
INSERT INTO `records` (`RecordID`, `Status`, `SomeField`, `RecordIndex`) VALUES
    (1, 1, 'a', 35),
    (2, 1, 'b', 20),
    (3, 1, 'c', 42);

-- 1st select
SELECT * FROM records WHERE Status = 1 ORDER BY SomeField ASC, RecordIndex ASC LIMIT 1 OFFSET 0;

-- update
update records set `Status` = 2 where RecordID = 1;

-- select next
SELECT * FROM records WHERE Status = 1 ORDER BY SomeField ASC, RecordIndex ASC LIMIT 1 OFFSET 1;
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文