在触发器中插入或忽略

发布于 2024-08-30 16:03:32 字数 3424 浏览 1 评论 0原文

我有一个数据库(用于跟踪电子邮件统计信息)已增长到数百兆字节,并且我一直在寻找减少它的方法。

看来文件大小较大的主要原因是相同的字符串往往会在数千行中重复。为了避免这个问题,我计划为字符串池创建另一个表,如下所示:

CREATE TABLE AddressLookup (
   ID      INTEGER PRIMARY KEY AUTOINCREMENT,
   Address TEXT UNIQUE
);

CREATE TABLE EmailInfo (
   MessageID   INTEGER PRIMARY KEY AUTOINCREMENT,
   ToAddrRef   INTEGER REFERENCES AddressLookup(ID),
   FromAddrRef INTEGER REFERENCES AddressLookup(ID)
   /* Additional columns omitted for brevity. */
);

为了方便起见,一个视图来连接这些表:

CREATE VIEW EmailView AS
SELECT
   MessageID,
   A1.Address AS ToAddr,
   A2.Address AS FromAddr
FROM EmailInfo
   LEFT JOIN AddressLookup A1 ON (ToAddrRef = A1.ID)
   LEFT JOIN AddressLookup A2 ON (FromAddrRef = A2.ID);

为了能够像使用常规表一样使用此视图,我已经做了一些触发器:

CREATE TRIGGER trg_id_EmailView
   INSTEAD OF DELETE ON EmailView
BEGIN
   DELETE FROM EmailInfo WHERE MessageID = OLD.MessageID;
END;

CREATE TRIGGER trg_ii_EmailView
   INSTEAD OF INSERT ON EmailView
BEGIN
   INSERT OR IGNORE INTO AddressLookup(Address) VALUES (NEW.ToAddr);
   INSERT OR IGNORE INTO AddressLookup(Address) VALUES (NEW.FromAddr);
   INSERT INTO EmailInfo
      SELECT NEW.MessageID, A1.ID, A2.ID
      FROM AddressLookup A1, AddressLookup A2
      WHERE A1.Address = NEW.ToAddr AND A2.Address = NEW.FromAddr;
END;

CREATE TRIGGER trg_iu_EmailView
   INSTEAD OF UPDATE ON EmailView
BEGIN
   UPDATE EmailInfo SET MessageID = NEW.MessageID
      WHERE MessageID = OLD.MessageID;
   REPLACE INTO EmailView
      SELECT NEW.MessageID, NEW.ToAddr, NEW.FromAddr;
END;

问题

之后:

INSERT OR REPLACE INTO EmailView VALUES (1, '[email protected]', '[email protected]');
INSERT OR REPLACE INTO EmailView VALUES (2, '[email protected]', '[email protected]');

更新的行包含:

MessageID   ToAddr               FromAddr
---------   ------               --------
1           NULL                 [email protected]
2           [email protected]    [email protected]

有一个不应该存在的 NULL。 EmailInfo 表中的相应单元格包含一个孤立的 ToAddrRef 值。

如果您一次执行一个 INSERT,您将看到 AddressLookup 表中的 Alice 的 ID发生了变化

看来此行为已记录

ON CONFLICT 子句可以指定为触发器体内 UPDATE 或 INSERT 操作的一部分。但是,如果将 ON CONFLICT 子句指定为导致触发器触发的语句的一部分,则将使用外部语句的冲突处理策略。

因此,顶级“INSERT OR REPLACE”语句中的“REPLACE”将覆盖触发器程序中关键的“INSERT OR IGNORE”。

有什么办法可以让它按照我想要的方式工作吗?

I have a database (for tracking email statistics) that has grown to hundreds of megabytes, and I've been looking for ways to reduce it.

It seems that the main reason for the large file size is that the same strings tend to be repeated in thousands of rows. To avoid this problem, I plan to create another table for a string pool, like so:

CREATE TABLE AddressLookup (
   ID      INTEGER PRIMARY KEY AUTOINCREMENT,
   Address TEXT UNIQUE
);

CREATE TABLE EmailInfo (
   MessageID   INTEGER PRIMARY KEY AUTOINCREMENT,
   ToAddrRef   INTEGER REFERENCES AddressLookup(ID),
   FromAddrRef INTEGER REFERENCES AddressLookup(ID)
   /* Additional columns omitted for brevity. */
);

And for convenience, a view to join these tables:

CREATE VIEW EmailView AS
SELECT
   MessageID,
   A1.Address AS ToAddr,
   A2.Address AS FromAddr
FROM EmailInfo
   LEFT JOIN AddressLookup A1 ON (ToAddrRef = A1.ID)
   LEFT JOIN AddressLookup A2 ON (FromAddrRef = A2.ID);

In order to be able to use this view as if it were a regular table, I've made some triggers:

CREATE TRIGGER trg_id_EmailView
   INSTEAD OF DELETE ON EmailView
BEGIN
   DELETE FROM EmailInfo WHERE MessageID = OLD.MessageID;
END;

CREATE TRIGGER trg_ii_EmailView
   INSTEAD OF INSERT ON EmailView
BEGIN
   INSERT OR IGNORE INTO AddressLookup(Address) VALUES (NEW.ToAddr);
   INSERT OR IGNORE INTO AddressLookup(Address) VALUES (NEW.FromAddr);
   INSERT INTO EmailInfo
      SELECT NEW.MessageID, A1.ID, A2.ID
      FROM AddressLookup A1, AddressLookup A2
      WHERE A1.Address = NEW.ToAddr AND A2.Address = NEW.FromAddr;
END;

CREATE TRIGGER trg_iu_EmailView
   INSTEAD OF UPDATE ON EmailView
BEGIN
   UPDATE EmailInfo SET MessageID = NEW.MessageID
      WHERE MessageID = OLD.MessageID;
   REPLACE INTO EmailView
      SELECT NEW.MessageID, NEW.ToAddr, NEW.FromAddr;
END;

The problem

After:

INSERT OR REPLACE INTO EmailView VALUES (1, '[email protected]', '[email protected]');
INSERT OR REPLACE INTO EmailView VALUES (2, '[email protected]', '[email protected]');

The updated rows contain:

MessageID   ToAddr               FromAddr
---------   ------               --------
1           NULL                 [email protected]
2           [email protected]    [email protected]

There's a NULL that shouldn't be there. The corresponding cell in the EmailInfo table contains an orphaned ToAddrRef value.

If you do the INSERTs one at a time, you'll see that Alice's ID in the AddressLookup table changes!

It appears that this behavior is documented:

An ON CONFLICT clause may be specified as part of an UPDATE or INSERT action within the body of the trigger. However if an ON CONFLICT clause is specified as part of the statement causing the trigger to fire, then conflict handling policy of the outer statement is used instead.

So the "REPLACE" in the top-level "INSERT OR REPLACE" statement is overriding the critical "INSERT OR IGNORE" in the trigger program.

Is there a way I can make it work the way that I wanted?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

小ぇ时光︴ 2024-09-06 16:03:32

好吧,由于 SQLite 是开源的,我只需更改 codeTriggerProgram 函数即可按照应有的方式处理ON CONFLICT

Well, since SQLite is open-source, I'll just change the codeTriggerProgram function to handle ON CONFLICT the way it ought to be.

浅语花开 2024-09-06 16:03:32

我有/曾经有过类似的问题,我想我找到了一种方法来欺骗 sqlite 正确执行此操作,方法是将 OR REPLACE 从外部语句移动到表本身:

CREATE TABLE a (id INTEGER PRIMARY KEY ON CONFLICT REPLACE, dataA TEXT);
CREATE TABLE b (id INTEGER PRIMARY KEY, dataB TEXT);

CREATE TRIGGER auto_add AFTER INSERT ON a
    BEGIN INSERT OR IGNORE INTO b (id) VALUES (NEW.id); END;

当您 INSERT OR REPLACE 时 到表“a”中,您总是替换添加到表 b 中的行,因为触发器现在使用 OR REPLACE

但是,如果您只是执行 INSERT INTO ,它似乎可以工作,因为冲突处理不再是外部插入的一部分,而是表本身的一部分。

I have/had a similar problem and I think I found a way to trick sqlite into doing it right by moving the OR REPLACE from the outer statement into the table itself:

CREATE TABLE a (id INTEGER PRIMARY KEY ON CONFLICT REPLACE, dataA TEXT);
CREATE TABLE b (id INTEGER PRIMARY KEY, dataB TEXT);

CREATE TRIGGER auto_add AFTER INSERT ON a
    BEGIN INSERT OR IGNORE INTO b (id) VALUES (NEW.id); END;

When you INSERT OR REPLACE into table "a" you always replace the row added to table b because the trigger uses the OR REPLACE now.

But if you just do an INSERT INTO it seems to work because the on conflict handling is no longer part of the outer insert but part of the table itself.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文