级联软删除

发布于 2024-07-13 06:47:06 字数 197 浏览 9 评论 0原文

SQL一直有一个很棒的特性:级联删除。 您提前计划好,当需要删除某些内容时,砰! 无需担心所有这些相关记录。

然而,现在删除任何东西几乎是一种禁忌。 您将其标记为已删除并停止显示它。 不幸的是,当存在依赖记录时,我无法找到可靠的解决方案来执行此操作。 我总是手动编码复杂的软删除网络。

有没有我完全错过的更好的解决方案?

SQL has always had a great feature: cascading deletes. You plan it in advance and when it's time to delete something, BAM! No need to worry about all those dependent records.

However, nowadays it's almost taboo to actually DELETE anything. You flag it as deleted and stop showing it. Unfortunately, I haven't been able to find a solid solution to doing this when there are dependent records. I've always manually coded the complicated web of soft deletes.

Is there a better solution out there that I have completely missed?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(6

很糊涂小朋友 2024-07-20 06:47:06

我最近使用 Postgres 9.6 提出了一个级联软删除的解决方案,它利用继承将条目划分为已删除和未删除的条目。 这是我为我们的项目编写的文档的副本:


级联软删除

摘要

在本文档中,我描述了我们当前处理 Postgres 数据库中对象删除的方法,并介绍了当前实现的缺陷。 例如,到目前为止我们还没有能力进行级联软删除。 然后,我展示了一种方法,它结合了 Postgres 的级联硬删除的优势和易于实施、维护的归档方法,并且可以提高所有搜索查询的性能。

关于 GORM 中的软删除

fabric8-services/fabric8-wit 项目中编写在 Go 中,我们为数据库使用面向对象的映射器,称为 GORM

GORM 提供了一种软删除数据库条目的方法:

如果模型有DeletedAt字段,它将自动获得软删除能力! 那么当调用Delete时,它不会从数据库中永久删除,而只是将字段DeletedAt的值设置为当前时间。

假设您有一个模型定义,换句话说,一个 Go 结构体如下所示:

// User is the Go model for a user entry in the database
type User struct {
    ID        int
    Name      string
DeletedAt *time.Time
}

假设您已通过其 ID 从数据库将现有用户条目加载到对象 u< /代码>。

id := 123
u := User{}
db.Where("id=?", id).First(&u)

如果您继续使用 GORM 删除对象:

db.Delete(&u)

在 SQL 中使用 DELETE 不会删除数据库条目,但该行将被更新,并且 deleted_at 设置为当前时间:

UPDATE users SET deleted_at="2018-10-12 11:24" WHERE id = 123;

GORM 中的软删除问题 - 依赖性反转和无级联

上面提到的软删除对于归档单个记录很有用,但它可能会导致依赖于它的所有记录产生非常奇怪的结果。 这是因为 GORM 的软删除不会像 SQL 中潜在的 DELETE 那样级联(如果外键是使用 ON DELETE CASCADE 建模的)。

当您对数据库进行建模时,通常会设计一个表,然后可能会设计另一个表,该表具有第一个表的外键:

CREATE TABLE countries (
    name text PRIMARY KEY,
    deleted_at timestamp
);

CREATE TABLE cities (
    name text,
    country text REFERENCES countries(name) ON DELETE CASCADE,
    deleted_at timestamp
);

在这里,我们建模了引用特定国家/地区的国家/地区和城市的列表。 当您删除国家/地区记录时,所有城市也将被删除。 但由于该表有一个在 Go 结构体中针对国家或城市进行的 deleted_at 列,因此 GORM 映射器只会软删除该国家,而不会影响所属城市。

将责任从 DB 转移到用户/开发者

GORM 从而将其交给开发者来(软)删除所有依赖城市。 换句话说,以前被建模为城市与国家的关系,现在被逆转为国家与城市的关系。 这是因为用户/开发人员现在负责在删除某个国家/地区时(软)删除属于该国家/地区的所有城市。

提案

如果我们能够实现软删除以及 ON DELETE CASCADE 的所有好处,那不是很好吗?

事实证明,我们不需要付出太多的努力就能拥有它。 现在让我们关注一个表,即 countries 表。

存档表

假设我们可以有另一个名为 countries_archive 的表,它与 countries 表具有完全相同的相同结构。 另假设将来对 countries 进行的所有架构迁移都应用于 countries_archive 表。 唯一的例外是唯一约束外键不会应用于countries_archive

我想,这听起来好得令人难以置信,对吧? 好吧,我们可以使用所谓的 Inheritenance 创建这样一个表Postgres:

CREATE TABLE countries_archive () INHERITS (countries);

生成的 countries_archive 表将用于存储 deleted_at IS NOT NULL 的所有记录。

请注意,在我们的 Go 代码中,我们永远不会直接使用任何 _archive 表。 相反,我们会查询 *_archive 表继承的原始表,然后 Postgres 会神奇地自动查找 *_archive 表。 下面我会进一步解释为什么会这样; 它与分区有关。

通过(软)-DELETE 将条目移动到存档表

由于 countriescountries_archive 这两个表在架构上看起来完全相似,我们可以 INSERT 到 使用触发函数轻松归档

  1. countries 表上发生 DELETE
  2. 或通过将 deleted_at 设置为 not 发生软删除时,可以 NULL 值。

触发器函数如下所示:

CREATE OR REPLACE FUNCTION archive_record()
RETURNS TRIGGER AS $
BEGIN
    -- When a soft-delete happens...
    IF (TG_OP = 'UPDATE' AND NEW.deleted_at IS NOT NULL) THEN
        EXECUTE format('DELETE FROM %I.%I WHERE id = $1', TG_TABLE_SCHEMA, TG_TABLE_NAME) USING OLD.id;
        RETURN OLD;
    END IF;
    -- When a hard-DELETE or a cascaded delete happens
    IF (TG_OP = 'DELETE') THEN
        -- Set the time when the deletion happens
        IF (OLD.deleted_at IS NULL) THEN
            OLD.deleted_at := now();
        END IF;
        EXECUTE format('INSERT INTO %I.%I SELECT $1.*'
                    , TG_TABLE_SCHEMA, TG_TABLE_NAME || '_archive')
        USING OLD;
    END IF;
    RETURN NULL;
END;
$ LANGUAGE plpgsql;

要使用触发器连接函数,我们可以编写:

CREATE TRIGGER soft_delete_countries
    AFTER
        -- this is what is triggered by GORM
        UPDATE OF deleted_at 
        -- this is what is triggered by a cascaded DELETE or a direct hard-DELETE
        OR DELETE
    ON countries
    FOR EACH ROW
    EXECUTE PROCEDURE archive_record();

结论

最初,postgres 中的继承功能被开发为 分区数据。 当您使用特定列或条件搜索分区数据时,Postgres 可以找出要搜索的分区,从而提高您的性能查询

除非另有说明,我们可以通过仅搜索存在的实体来从这种性能改进中受益。 存在的条目是那些 deleted_at IS NULL 为 true 的条目。 (请注意,如果 GORM 的模型结构中有 DeletedAt,GORM 会自动向每个查询添加 AND returned_at IS NULL。)

让我们看看 Postgres 是否已经知道如何利用通过运行 EXPLAIN 来进行分离:

EXPLAIN SELECT * FROM countries WHERE deleted_at IS NULL;
+-------------------------------------------------------------------------+
| QUERY PLAN                                                              |
|-------------------------------------------------------------------------|
| Append  (cost=0.00..21.30 rows=7 width=44)                              |
|   ->  Seq Scan on countries  (cost=0.00..0.00 rows=1 width=44)          |
|         Filter: (deleted_at IS NULL)                                    |
|   ->  Seq Scan on countries_archive  (cost=0.00..21.30 rows=6 width=44) |
|         Filter: (deleted_at IS NULL)                                    |
+-------------------------------------------------------------------------+

正如我们所看到的,Postgres 仍然搜索两个表:countriescountries_archive。 让我们看看在创建表时向 countries_archive 表添加检查约束时会发生什么:

CREATE TABLE countries_archive (
    CHECK (deleted_at IS NOT NULL)
) INHERITS (countries);

现在,Postgres 知道它可以在 deleted_at 时跳过 countries_archive预计为 NULL:

EXPLAIN SELECT * FROM countries WHERE deleted_at IS NULL;
+----------------------------------------------------------------+
| QUERY PLAN                                                     |
|----------------------------------------------------------------|
| Append  (cost=0.00..0.00 rows=1 width=44)                      |
|   ->  Seq Scan on countries  (cost=0.00..0.00 rows=1 width=44) |
|         Filter: (deleted_at IS NULL)                           |
+----------------------------------------------------------------+

请注意上述 EXPLAIN 中没有对 countries_archive 表进行顺序扫描。

好处和风险

好处

  1. 我们有定期的级联删除,可以让数据库确定删除内容的顺序。
  2. 同时,我们也会归档我们的数据。 每次软删除都
  3. 需要无需更改 Go 代码。 我们只需为每个要归档的表设置一个表和一个触发器。
  4. 每当我们认为我们不再希望这种带有触发器和级联软删除的行为我们可以轻松返回
  5. 未来对原始表进行的所有架构迁移也将应用于该表的_archive版本。 除了限制之外,这很好。

风险

  1. 假设您添加一个新表,该表引用另一个现有表,该表的外键具有 ON DELETE CASCADE。 如果现有表使用上面的 archive_record() 函数,则当现有表中的某些内容被软删除时,您的新表将收到硬 DELETE 消息。 如果您也将 archive_record() 用于新的从属表,那么这不是问题。 但你只需要记住它。

最后的想法

这里提出的方法并没有解决恢复单个行的问题。 另一方面,这种方法并没有让事情变得更难或更复杂。 它仍然悬而未决。

在我们的应用程序中,工作项的某些字段没有指定外键。 区域 ID 就是一个很好的例子。 这意味着当某个区域被删除时,关联的工作项不会自动被删除。 区域本身被删除有两种情况:

  1. 用户直接请求删除。
  2. 用户请求删除一个空间,然后由于该空间的外键约束,该区域被删除。

请注意,在第一个场景中,用户的请求先通过区域控制器代码,然后通过区域存储库代码。 我们有机会在任何这些层中修改所有引用不存在区域的工作项。 在第二种情况下,与该区域相关的所有事情都会发生并保留在数据库层上,因此我们没有机会修改工作项。 好消息是我们不必这样做。 每个工作项都引用一个空间,因此当空间消失时无论如何都会被删除。

适用于区域的内容也适用于迭代、标签和板栏。

如何申请到我们的数据库?

步骤

  1. 为继承原始表的所有表创建“*_archived”表。
  2. 使用上述 archive_record() 函数安装软删除触发器。
  3. 通过执行硬DELETE,将 deleted_at IS NOT NULL 的所有条目移至各自的 _archive 表,这将触发 archive_record()< /代码> 函数。

示例

这是一个完全工作的示例,其中我们演示了对两个表countries<的级联软删除/code> 和 大写。 我们展示了如何独立于选择的删除方法来归档记录。

CREATE TABLE countries (
    id int primary key,
    name text unique,
    deleted_at timestamp
);
CREATE TABLE countries_archive (
    CHECK ( deleted_at IS NOT NULL )
) INHERITS(countries);

CREATE TABLE capitals (
    id int primary key,
    name text,
    country_id int references countries(id) on delete cascade,
    deleted_at timestamp
);
CREATE TABLE capitals_archive (
    CHECK ( deleted_at IS NOT NULL )
) INHERITS(capitals);

CREATE OR REPLACE FUNCTION archive_record()
RETURNS TRIGGER AS $
BEGIN
    IF (TG_OP = 'UPDATE' AND NEW.deleted_at IS NOT NULL) THEN
        EXECUTE format('DELETE FROM %I.%I WHERE id = $1', TG_TABLE_SCHEMA, TG_TABLE_NAME) USING OLD.id;
        RETURN OLD;
    END IF;
    IF (TG_OP = 'DELETE') THEN
        IF (OLD.deleted_at IS NULL) THEN
            OLD.deleted_at := now();
        END IF;
        EXECUTE format('INSERT INTO %I.%I SELECT $1.*'
                    , TG_TABLE_SCHEMA, TG_TABLE_NAME || '_archive')
        USING OLD;
    END IF;
    RETURN NULL;
END;
$ LANGUAGE plpgsql;

CREATE TRIGGER soft_delete_countries
    AFTER
        UPDATE OF deleted_at 
        OR DELETE
    ON countries
    FOR EACH ROW
    EXECUTE PROCEDURE archive_record();
    
CREATE TRIGGER soft_delete_capitals
    AFTER
        UPDATE OF deleted_at 
        OR DELETE
    ON capitals
    FOR EACH ROW
    EXECUTE PROCEDURE archive_record();

INSERT INTO countries (id, name) VALUES (1, 'France');
INSERT INTO countries (id, name) VALUES (2, 'India');
INSERT INTO capitals VALUES (1, 'Paris', 1);
INSERT INTO capitals VALUES (2, 'Bengaluru', 2);

SELECT 'BEFORE countries' as "info", * FROM ONLY countries;
SELECT 'BEFORE countries_archive' as "info", * FROM countries_archive;
SELECT 'BEFORE capitals' as "info", * FROM ONLY capitals;
SELECT 'BEFORE capitals_archive' as "info", * FROM capitals_archive;

-- Delete one country via hard-DELETE and one via soft-delete
DELETE FROM countries WHERE id = 1;
UPDATE countries SET deleted_at = '2018-12-01' WHERE id = 2;

SELECT 'AFTER countries' as "info", * FROM ONLY countries;
SELECT 'AFTER countries_archive' as "info", * FROM countries_archive;
SELECT 'AFTER capitals' as "info", * FROM ONLY capitals;
SELECT 'AFTER capitals_archive' as "info", * FROM capitals_archive;

I've come up with a solution to cascading soft-deletes recently using Postgres 9.6 that makes use of inheritance to partition entries into deleted and non-deleted ones. Here's a copy of the document that I'm writing for our project:


Cascading soft-deletes

Abstract

In this document I describe our current approach to deal with deletion of objects in our Postgres database and I present the flaws of the current implementation. For example so far we don't have the ability to have cascading soft-deletes. Then I show a method that combines the strengths of Postgres' cascading hard-delete and an archiving approach that is easy to implement, maintain and that brings a performance boost in all search queries.

About soft-deletes in GORM

In the fabric8-services/fabric8-wit project which is written in Go we are using the an object oriented mapper for our database called GORM.

GORM offers a way to soft-delete database entries:

If model has DeletedAt field, it will get soft delete ability automatically! then it won’t be deleted from database permanently when call Delete, but only set field DeletedAt‘s value to current time.

Suppose you have a model definition, in other words a Go struct that looks like this:

// User is the Go model for a user entry in the database
type User struct {
    ID        int
    Name      string
DeletedAt *time.Time
}

And let's say you've loaded an existing user entry by its ID from the DB into an object u.

id := 123
u := User{}
db.Where("id=?", id).First(&u)

If you then go ahead and delete the object using GORM:

db.Delete(&u)

the DB entry will not be deleted using DELETE in SQL but the row will be updated and the deleted_at is set to the current time:

UPDATE users SET deleted_at="2018-10-12 11:24" WHERE id = 123;

Problems with soft-deletes in GORM - Inversion of dependency and no cascade

The above mentioned soft-delete is nice for archiving individual records but it can lead to very odd results for all records that depend on it. That is because soft-deletes by GORM don't cascade as a potential DELETE in SQL would do if a foreign key was modelled with ON DELETE CASCADE.

When you model a database you typcially design a table and then maybe another one that has a foreign key to the first one:

CREATE TABLE countries (
    name text PRIMARY KEY,
    deleted_at timestamp
);

CREATE TABLE cities (
    name text,
    country text REFERENCES countries(name) ON DELETE CASCADE,
    deleted_at timestamp
);

Here we've modeled a list of countries and cities that reference a particular country. When you DELETE a country record, all cities will be deleted as well. But since the table has a deleted_at column that is carried on in the Go struct for a country or city, the GORM mapper will only soft-delete the country and leave the belonging cities untouched.

Shifting responsibility from DB to user/developer

GORM thereby puts it in the hands of the developer to (soft-)delete all dependend cities. In other words, what previously was modeled as a relationship from cities to countries is now reversed as a relationship from countries to cities. That is because the user/developer is now responsible to (soft-)delete all cities belonging to a country when the country is deleted.

Proposal

Wouldn't it be great if we can have soft-deletes and all the benefits of a ON DELETE CASCADE?

It turns out that we can have it without much effort. Let's focus on a single table for now, namely the countries table.

An archive table

Suppose for a second, that we can have another table called countries_archive that has the excact same structure as the countries table. Also suppose that all future schema migrations that are done to countries are applied to the countries_archive table. The only exception is that unique constraints and foreign keys will not be applied to countries_archive.

I guess, this already sounds too good to be true, right? Well, we can create such a table using what's called Inheritenance in Postgres:

CREATE TABLE countries_archive () INHERITS (countries);

The resulting countries_archive table will is meant to store all records where deleted_at IS NOT NULL.

Note, that in our Go code we would never directly use any _archive table. Instead we would query for the original table from which *_archive table inherits and Postgres then magically looks into the *_archive table automatically. A bit further below I explain why that is; it has to do with partitioning.

Moving entries to the archive table on (soft)-DELETE

Since the two tables, countries and countries_archive look exactly alike schemawise we can INSERT into the archive very easily using a trigger function when

  1. a DELETE happens on the countries table
  2. or when a soft-delete is happening by setting deleted_at to a not NULL value.

The trigger function looks like this:

CREATE OR REPLACE FUNCTION archive_record()
RETURNS TRIGGER AS $
BEGIN
    -- When a soft-delete happens...
    IF (TG_OP = 'UPDATE' AND NEW.deleted_at IS NOT NULL) THEN
        EXECUTE format('DELETE FROM %I.%I WHERE id = $1', TG_TABLE_SCHEMA, TG_TABLE_NAME) USING OLD.id;
        RETURN OLD;
    END IF;
    -- When a hard-DELETE or a cascaded delete happens
    IF (TG_OP = 'DELETE') THEN
        -- Set the time when the deletion happens
        IF (OLD.deleted_at IS NULL) THEN
            OLD.deleted_at := now();
        END IF;
        EXECUTE format('INSERT INTO %I.%I SELECT $1.*'
                    , TG_TABLE_SCHEMA, TG_TABLE_NAME || '_archive')
        USING OLD;
    END IF;
    RETURN NULL;
END;
$ LANGUAGE plpgsql;

To wire-up the function with a trigger we can write:

CREATE TRIGGER soft_delete_countries
    AFTER
        -- this is what is triggered by GORM
        UPDATE OF deleted_at 
        -- this is what is triggered by a cascaded DELETE or a direct hard-DELETE
        OR DELETE
    ON countries
    FOR EACH ROW
    EXECUTE PROCEDURE archive_record();

Conclusions

Originally the inheritance functionality in postgres was developed to partition data. When you search your partitioned data using a specific column or condition, Postgres can find out which partition to search through and can thereby improve the performance of your query.

We can benefit from this performance improvement by only searching entities in existence, unless told otherwise. Entries in existence are those where deleted_at IS NULL holds true. (Notice, that GORM will automatically add an AND deleted_at IS NULL to every query if there's a DeletedAt in GORM's model struct.)

Let's see if Postgres already knows how to take advantage of our separation by running an EXPLAIN:

EXPLAIN SELECT * FROM countries WHERE deleted_at IS NULL;
+-------------------------------------------------------------------------+
| QUERY PLAN                                                              |
|-------------------------------------------------------------------------|
| Append  (cost=0.00..21.30 rows=7 width=44)                              |
|   ->  Seq Scan on countries  (cost=0.00..0.00 rows=1 width=44)          |
|         Filter: (deleted_at IS NULL)                                    |
|   ->  Seq Scan on countries_archive  (cost=0.00..21.30 rows=6 width=44) |
|         Filter: (deleted_at IS NULL)                                    |
+-------------------------------------------------------------------------+

As we can see, Postgres still searches both tables, countries and countries_archive. Let's see what happens when we add a check constraint to the countries_archive table upon table creation:

CREATE TABLE countries_archive (
    CHECK (deleted_at IS NOT NULL)
) INHERITS (countries);

Now, Postgres knows that it can skip countries_archive when deleted_at is expected to be NULL:

EXPLAIN SELECT * FROM countries WHERE deleted_at IS NULL;
+----------------------------------------------------------------+
| QUERY PLAN                                                     |
|----------------------------------------------------------------|
| Append  (cost=0.00..0.00 rows=1 width=44)                      |
|   ->  Seq Scan on countries  (cost=0.00..0.00 rows=1 width=44) |
|         Filter: (deleted_at IS NULL)                           |
+----------------------------------------------------------------+

Notice the absence of the sequential scan of the countries_archive table in the aforementioned EXPLAIN.

Benefits and Risks

Benefits

  1. We have regular cascaded deletes back and can let the DB figure out in which order to delete things.
  2. At the same time, we're archiving our data as well. Every soft-delete
  3. No Go code changes are required. We only have to setup a table and a trigger for each table that shall be archived.
  4. Whenever we figure that we don't want this behaviour with triggers and cascaded soft-delete anymore we can easily go back.
  5. All future schema migrations that are being made to the original table will be applied to the _archive version of that table as well. Except for constraints, which is good.

Risks

  1. Suppose you add a new table that references another existing table with a foreign key that has ON DELETE CASCADE. If the existing table uses the archive_record() function from above, your new table will receive hard DELETEs when something in the existing table is soft-deletes. This isn't a problem, if you use archive_record() for your new dependent table as well. But you just have to remember it.

Final thoughts

The approach presented here does not solve the problem of restoring individual rows. On the other hand, this approach doesn't make it harder or more complicated. It just remains unsolved.

In our application certain fields of a work item don't have a foreign key specified. A good example are the area IDs. That means when an area is DELETEd, an associated work item is not automatically DELETEd. There are two scenarios when an area is removed itself:

  1. A delete is directly requested from a user.
  2. A user requests to delete a space and then the area is removed due to its foreign key constraint on the space.

Notice that, in the first scenario the user's requests goes through the area controller code and then through the area repository code. We have a chance in any of those layers to modify all work items that would reference a non-existing area otherwise. In the second scenario everything related to the area happens and stays on the DB layer so we have no chance of moifying the work items. The good news is that we don't have to. Every work item references a space and will therefore be deleted anyways when the space goes away.

What applies to areas also applies to iterations, labels and board columns as well.

How to apply to our database?

Steps

  1. Create "*_archived" tables for all tables that inherit the original tables.
  2. Install a soft-delete trigger usinge the above archive_record() function.
  3. Move all entries where deleted_at IS NOT NULL to their respective _archive table by doing a hard DELETE which will trigger the archive_record() function.

Example

Here is a fully working example in which we demonstrated a cascaded soft-delete over two tables, countries and capitals. We show how records are being archived independently of the method that was chosen for the delete.

CREATE TABLE countries (
    id int primary key,
    name text unique,
    deleted_at timestamp
);
CREATE TABLE countries_archive (
    CHECK ( deleted_at IS NOT NULL )
) INHERITS(countries);

CREATE TABLE capitals (
    id int primary key,
    name text,
    country_id int references countries(id) on delete cascade,
    deleted_at timestamp
);
CREATE TABLE capitals_archive (
    CHECK ( deleted_at IS NOT NULL )
) INHERITS(capitals);

CREATE OR REPLACE FUNCTION archive_record()
RETURNS TRIGGER AS $
BEGIN
    IF (TG_OP = 'UPDATE' AND NEW.deleted_at IS NOT NULL) THEN
        EXECUTE format('DELETE FROM %I.%I WHERE id = $1', TG_TABLE_SCHEMA, TG_TABLE_NAME) USING OLD.id;
        RETURN OLD;
    END IF;
    IF (TG_OP = 'DELETE') THEN
        IF (OLD.deleted_at IS NULL) THEN
            OLD.deleted_at := now();
        END IF;
        EXECUTE format('INSERT INTO %I.%I SELECT $1.*'
                    , TG_TABLE_SCHEMA, TG_TABLE_NAME || '_archive')
        USING OLD;
    END IF;
    RETURN NULL;
END;
$ LANGUAGE plpgsql;

CREATE TRIGGER soft_delete_countries
    AFTER
        UPDATE OF deleted_at 
        OR DELETE
    ON countries
    FOR EACH ROW
    EXECUTE PROCEDURE archive_record();
    
CREATE TRIGGER soft_delete_capitals
    AFTER
        UPDATE OF deleted_at 
        OR DELETE
    ON capitals
    FOR EACH ROW
    EXECUTE PROCEDURE archive_record();

INSERT INTO countries (id, name) VALUES (1, 'France');
INSERT INTO countries (id, name) VALUES (2, 'India');
INSERT INTO capitals VALUES (1, 'Paris', 1);
INSERT INTO capitals VALUES (2, 'Bengaluru', 2);

SELECT 'BEFORE countries' as "info", * FROM ONLY countries;
SELECT 'BEFORE countries_archive' as "info", * FROM countries_archive;
SELECT 'BEFORE capitals' as "info", * FROM ONLY capitals;
SELECT 'BEFORE capitals_archive' as "info", * FROM capitals_archive;

-- Delete one country via hard-DELETE and one via soft-delete
DELETE FROM countries WHERE id = 1;
UPDATE countries SET deleted_at = '2018-12-01' WHERE id = 2;

SELECT 'AFTER countries' as "info", * FROM ONLY countries;
SELECT 'AFTER countries_archive' as "info", * FROM countries_archive;
SELECT 'AFTER capitals' as "info", * FROM ONLY capitals;
SELECT 'AFTER capitals_archive' as "info", * FROM capitals_archive;
吻泪 2024-07-20 06:47:06

我不想这么说,但触发器是专门为这种事情设计的。

(讨厌的部分是因为好的触发器很难编写,当然也无法调试)

I hate to say it but triggers are designed specifically for this kind of thing.

(The hate part is because good triggers are very hard to write and , of course , cannot be debugged)

我很坚强 2024-07-20 06:47:06

外键约束可以进行级联更新。 如果您在键和删除标志上链接表,那么当主表中的删除标志更改时,该更改将向下传播到详细表。 我还没有尝试过,但它应该有效。

Foreign key constraints can do cascade updates. If you linked your tables on both the key and the delete flag, then when the delete flag in the master table changed, that change would propagate down to the detail table. I haven't tried that, but it should work.

血之狂魔 2024-07-20 06:47:06

我认为软删除的一个好处通常是不是每个表都有软删除标志,因此需要级联的东西数量很少。 这些行在数据库中只是未使用,但不是孤立的 - 它们只是仅由已删除的行引用。

不过,就像所有事情一样,这取决于您的模型。

I think a benefit of the soft deletes is usually that not every table has a soft-delete flag, so the number of things needed to be cascaded is few. The rows are simply unused in the database, but not orphaned - they are simply only referred to by deleted rows.

Like everything, though, it depends on your model.

纵情客 2024-07-20 06:47:06

不确定您在谈论什么后端,但您可以拾取“删除标志”更改并使用触发器将更改向下级联。

Not sure what backend you're talking about, but you could pickup on your "delete flag" changing and cascade the change down using a trigger.

末蓝 2024-07-20 06:47:06

根据 @konrad-kleine 的回答此处,我们可以创建一个视图来过滤所有未删除的记录并将其呈现给用户。 我们仍然利用 postgres 的继承来尊重“活动”表的唯一约束(如果您删除具有唯一名称的记录,您希望能够再次添加该名称),并且我们不考虑恢复记录这里。 在“活动”表具有继承表没有的约束的情况下,恢复记录将需要处理数据漂移。

但还是有一些区别:

  1. 主表不必记录已删除的信息元数据
  2. 客户端不必按deleted_at 进行过滤(即我们不依赖于ORM 特定行为)。
  3. 我们不区分硬删除和软删除。 从客户端的角度来看,一旦记录被删除,它就会从他们的世界中删除,他们可以使用任何 SQL 客户端都支持的标准 DELETE 语义。

如果需要查看已删除的记录,您可以为此创建一个特殊的 postgres 视图。

SQLFiddle 用于以下代码

-- main table, will contain non deleted data
CREATE TABLE public.active_items (
    id SERIAL PRIMARY KEY,
    name TEXT
);

-- the archive table, inheriting columns and column types from the main table. Will only contain deleted data.
CREATE TABLE public.archived_items (
    deleted_at TIMESTAMP,
    deleted_by INTEGER
) INHERITS (public.active_items);


INSERT INTO public.active_items (name) VALUES
    ('Item 1'),
    ('Item 2');

-- Create am updateable view for non-deleted items since by `SELECT * FROM public.active_items;` will include
-- data from public.archived_items since it's an inherited table
CREATE OR REPLACE VIEW public.active_items_view AS
SELECT * FROM ONLY public.active_items;

SELECT * from public.active_items;
-- returns:
-- id,name
-- 1,Item 1
-- 2,Item 2


-- Update items through the view
UPDATE public.active_items_view
SET name = name || ' - updated';

SELECT * from public.active_items_view;
-- returns:
-- id,name
-- 1,Item 1 - updated
-- 2,Item 2 - updated


-- Insert a new item through the view
INSERT INTO public.active_items_view (name)
VALUES ('Item inserted via view');

SELECT * from public.active_items_view;
-- returns:
-- id,name
-- 1,Item 1 - updated
-- 2,Item 2 - updated
-- 3,Item inserted via view



-- placeholder function to track who deleted the record
CREATE OR REPLACE FUNCTION public.get_current_user_id()
RETURNS INTEGER AS $
BEGIN
    RETURN 1; -- Replace with logic to retrieve actual user id
END;
$ LANGUAGE plpgsql;

-- Create a function to move deleted items to the archive table
CREATE OR REPLACE FUNCTION public.move_to_archive()
RETURNS TRIGGER AS $
BEGIN
    -- Move deleted rows to the archive table
    INSERT INTO public.archived_items
    SELECT *,
           CURRENT_TIMESTAMP,
           public.get_current_user_id()
    FROM OLD_TABLE;

    RETURN NULL;
END;
$ LANGUAGE plpgsql;

-- Create a trigger to automatically archive deleted items
CREATE TRIGGER soft_delete_trigger
AFTER DELETE ON public.active_items
REFERENCING OLD TABLE AS OLD_TABLE
FOR EACH STATEMENT EXECUTE FUNCTION public.move_to_archive();

-- currently contains nothing
SELECT * FROM public.archived_items;

-- This will trigger the function above to move the data to the archive table
DELETE FROM public.active_items_view;


SELECT * FROM public.archived_items;
-- returns:
-- id, name,                   deleted_at,                  deleted_by,
-- 1,  Item 1 - updated,       2024-07-01 21:33:26.967218,  1
-- 2,  Item 2 - updated,       2024-07-01 21:33:26.967218,  1
-- 3,  Item inserted via view, 2024-07-01 21:33:26.967218,  1

-- contains nothing
SELECT * FROM public.active_items_view;

Playing off of @konrad-kleine's answer here, we can create a view that filters all non deleted records and present it to the user. We still take advantage of postgres' inheritance in order to respect the "active" table's unique constraints (if you delete a record with a unique name, you expect to be able to add that name again), and we do not account for restoring records here. Restoring records will need to handle data drift in cases where the "active" table has constraints that the inherited table does not.

There are a few differences though:

  1. The main table does not have to record the deleted information metadata
  2. The client does not have to filter by deleted_at (i.e. we are not dependent on ORM specific behavior).
  3. We don't differentiate between hard delete and soft delete. From the client's perspective, once a record is deleted, it is removed from their world, and they can use standard DELETE symantics that any SQL client will support.

If a deleted record needs to be viewed, you can create a special postgres view for that.

SQLFiddle for below code

-- main table, will contain non deleted data
CREATE TABLE public.active_items (
    id SERIAL PRIMARY KEY,
    name TEXT
);

-- the archive table, inheriting columns and column types from the main table. Will only contain deleted data.
CREATE TABLE public.archived_items (
    deleted_at TIMESTAMP,
    deleted_by INTEGER
) INHERITS (public.active_items);


INSERT INTO public.active_items (name) VALUES
    ('Item 1'),
    ('Item 2');

-- Create am updateable view for non-deleted items since by `SELECT * FROM public.active_items;` will include
-- data from public.archived_items since it's an inherited table
CREATE OR REPLACE VIEW public.active_items_view AS
SELECT * FROM ONLY public.active_items;

SELECT * from public.active_items;
-- returns:
-- id,name
-- 1,Item 1
-- 2,Item 2


-- Update items through the view
UPDATE public.active_items_view
SET name = name || ' - updated';

SELECT * from public.active_items_view;
-- returns:
-- id,name
-- 1,Item 1 - updated
-- 2,Item 2 - updated


-- Insert a new item through the view
INSERT INTO public.active_items_view (name)
VALUES ('Item inserted via view');

SELECT * from public.active_items_view;
-- returns:
-- id,name
-- 1,Item 1 - updated
-- 2,Item 2 - updated
-- 3,Item inserted via view



-- placeholder function to track who deleted the record
CREATE OR REPLACE FUNCTION public.get_current_user_id()
RETURNS INTEGER AS $
BEGIN
    RETURN 1; -- Replace with logic to retrieve actual user id
END;
$ LANGUAGE plpgsql;

-- Create a function to move deleted items to the archive table
CREATE OR REPLACE FUNCTION public.move_to_archive()
RETURNS TRIGGER AS $
BEGIN
    -- Move deleted rows to the archive table
    INSERT INTO public.archived_items
    SELECT *,
           CURRENT_TIMESTAMP,
           public.get_current_user_id()
    FROM OLD_TABLE;

    RETURN NULL;
END;
$ LANGUAGE plpgsql;

-- Create a trigger to automatically archive deleted items
CREATE TRIGGER soft_delete_trigger
AFTER DELETE ON public.active_items
REFERENCING OLD TABLE AS OLD_TABLE
FOR EACH STATEMENT EXECUTE FUNCTION public.move_to_archive();

-- currently contains nothing
SELECT * FROM public.archived_items;

-- This will trigger the function above to move the data to the archive table
DELETE FROM public.active_items_view;


SELECT * FROM public.archived_items;
-- returns:
-- id, name,                   deleted_at,                  deleted_by,
-- 1,  Item 1 - updated,       2024-07-01 21:33:26.967218,  1
-- 2,  Item 2 - updated,       2024-07-01 21:33:26.967218,  1
-- 3,  Item inserted via view, 2024-07-01 21:33:26.967218,  1

-- contains nothing
SELECT * FROM public.active_items_view;
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文