级联软删除
SQL一直有一个很棒的特性:级联删除。 您提前计划好,当需要删除某些内容时,砰! 无需担心所有这些相关记录。
然而,现在删除任何东西几乎是一种禁忌。 您将其标记为已删除并停止显示它。 不幸的是,当存在依赖记录时,我无法找到可靠的解决方案来执行此操作。 我总是手动编码复杂的软删除网络。
有没有我完全错过的更好的解决方案?
SQL has always had a great feature: cascading deletes. You plan it in advance and when it's time to delete something, BAM! No need to worry about all those dependent records.
However, nowadays it's almost taboo to actually DELETE anything. You flag it as deleted and stop showing it. Unfortunately, I haven't been able to find a solid solution to doing this when there are dependent records. I've always manually coded the complicated web of soft deletes.
Is there a better solution out there that I have completely missed?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(6)
我最近使用 Postgres 9.6 提出了一个级联软删除的解决方案,它利用继承将条目划分为已删除和未删除的条目。 这是我为我们的项目编写的文档的副本:
级联软删除
摘要
在本文档中,我描述了我们当前处理 Postgres 数据库中对象删除的方法,并介绍了当前实现的缺陷。 例如,到目前为止我们还没有能力进行级联软删除。 然后,我展示了一种方法,它结合了 Postgres 的级联硬删除的优势和易于实施、维护的归档方法,并且可以提高所有搜索查询的性能。
关于 GORM 中的软删除
在 fabric8-services/fabric8-wit 项目中编写在 Go 中,我们为数据库使用面向对象的映射器,称为 GORM。
GORM 提供了一种软删除数据库条目的方法:
假设您有一个模型定义,换句话说,一个 Go 结构体如下所示:
假设您已通过其
ID
从数据库将现有用户条目加载到对象u< /代码>。
如果您继续使用 GORM 删除对象:
在 SQL 中使用
DELETE
不会删除数据库条目,但该行将被更新,并且deleted_at
设置为当前时间:GORM 中的软删除问题 - 依赖性反转和无级联
上面提到的软删除对于归档单个记录很有用,但它可能会导致依赖于它的所有记录产生非常奇怪的结果。 这是因为 GORM 的软删除不会像 SQL 中潜在的
DELETE
那样级联(如果外键是使用 ON DELETE CASCADE 建模的)。当您对数据库进行建模时,通常会设计一个表,然后可能会设计另一个表,该表具有第一个表的外键:
在这里,我们建模了引用特定国家/地区的国家/地区和城市的列表。 当您
删除
国家/地区记录时,所有城市也将被删除。 但由于该表有一个在 Go 结构体中针对国家或城市进行的deleted_at
列,因此 GORM 映射器只会软删除该国家,而不会影响所属城市。将责任从 DB 转移到用户/开发者
GORM 从而将其交给开发者来(软)删除所有依赖城市。 换句话说,以前被建模为城市与国家的关系,现在被逆转为国家与城市的关系。 这是因为用户/开发人员现在负责在删除某个国家/地区时(软)删除属于该国家/地区的所有城市。
提案
如果我们能够实现软删除以及
ON DELETE CASCADE
的所有好处,那不是很好吗?事实证明,我们不需要付出太多的努力就能拥有它。 现在让我们关注一个表,即
countries
表。存档表
假设我们可以有另一个名为
countries_archive
的表,它与countries
表具有完全相同的相同结构。 另假设将来对countries
进行的所有架构迁移都应用于countries_archive
表。 唯一的例外是唯一约束和外键不会应用于countries_archive
。我想,这听起来好得令人难以置信,对吧? 好吧,我们可以使用所谓的 Inheritenance 创建这样一个表Postgres:
生成的
countries_archive
表将用于存储deleted_at IS NOT NULL
的所有记录。请注意,在我们的 Go 代码中,我们永远不会直接使用任何
_archive
表。 相反,我们会查询*_archive
表继承的原始表,然后 Postgres 会神奇地自动查找*_archive
表。 下面我会进一步解释为什么会这样; 它与分区有关。通过(软)-DELETE 将条目移动到存档表
由于
countries
和countries_archive
这两个表在架构上看起来完全相似,我们可以INSERT
到 使用触发函数轻松归档countries
表上发生DELETE
deleted_at
设置为 not 发生软删除时,可以NULL
值。触发器函数如下所示:
要使用触发器连接函数,我们可以编写:
结论
最初,postgres 中的继承功能被开发为 分区数据。 当您使用特定列或条件搜索分区数据时,Postgres 可以找出要搜索的分区,从而提高您的性能查询。
除非另有说明,我们可以通过仅搜索存在的实体来从这种性能改进中受益。 存在的条目是那些
deleted_at IS NULL
为 true 的条目。 (请注意,如果 GORM 的模型结构中有DeletedAt
,GORM 会自动向每个查询添加AND returned_at IS NULL
。)让我们看看 Postgres 是否已经知道如何利用通过运行
EXPLAIN
来进行分离:正如我们所看到的,Postgres 仍然搜索两个表:
countries
和countries_archive
。 让我们看看在创建表时向countries_archive
表添加检查约束时会发生什么:现在,Postgres 知道它可以在
deleted_at
时跳过countries_archive
预计为 NULL:请注意上述
EXPLAIN
中没有对countries_archive
表进行顺序扫描。好处和风险
好处
_archive
版本。 除了限制之外,这很好。风险
ON DELETE CASCADE
。 如果现有表使用上面的archive_record()
函数,则当现有表中的某些内容被软删除时,您的新表将收到硬DELETE
消息。 如果您也将archive_record()
用于新的从属表,那么这不是问题。 但你只需要记住它。最后的想法
这里提出的方法并没有解决恢复单个行的问题。 另一方面,这种方法并没有让事情变得更难或更复杂。 它仍然悬而未决。
在我们的应用程序中,工作项的某些字段没有指定外键。 区域 ID 就是一个很好的例子。 这意味着当某个区域被删除时,关联的工作项不会自动被删除。 区域本身被删除有两种情况:
请注意,在第一个场景中,用户的请求先通过区域控制器代码,然后通过区域存储库代码。 我们有机会在任何这些层中修改所有引用不存在区域的工作项。 在第二种情况下,与该区域相关的所有事情都会发生并保留在数据库层上,因此我们没有机会修改工作项。 好消息是我们不必这样做。 每个工作项都引用一个空间,因此当空间消失时无论如何都会被删除。
适用于区域的内容也适用于迭代、标签和板栏。
如何申请到我们的数据库?
步骤
archive_record()
函数安装软删除触发器。DELETE
,将deleted_at IS NOT NULL
的所有条目移至各自的_archive
表,这将触发archive_record()< /代码> 函数。
示例
这是一个完全工作的示例,其中我们演示了对两个表
countries<的级联软删除/code> 和
大写
。 我们展示了如何独立于选择的删除方法来归档记录。I've come up with a solution to cascading soft-deletes recently using Postgres 9.6 that makes use of inheritance to partition entries into deleted and non-deleted ones. Here's a copy of the document that I'm writing for our project:
Cascading soft-deletes
Abstract
In this document I describe our current approach to deal with deletion of objects in our Postgres database and I present the flaws of the current implementation. For example so far we don't have the ability to have cascading soft-deletes. Then I show a method that combines the strengths of Postgres' cascading hard-delete and an archiving approach that is easy to implement, maintain and that brings a performance boost in all search queries.
About soft-deletes in GORM
In the fabric8-services/fabric8-wit project which is written in Go we are using the an object oriented mapper for our database called GORM.
GORM offers a way to soft-delete database entries:
Suppose you have a model definition, in other words a Go struct that looks like this:
And let's say you've loaded an existing user entry by its
ID
from the DB into an objectu
.If you then go ahead and delete the object using GORM:
the DB entry will not be deleted using
DELETE
in SQL but the row will be updated and thedeleted_at
is set to the current time:Problems with soft-deletes in GORM - Inversion of dependency and no cascade
The above mentioned soft-delete is nice for archiving individual records but it can lead to very odd results for all records that depend on it. That is because soft-deletes by GORM don't cascade as a potential
DELETE
in SQL would do if a foreign key was modelled withON DELETE CASCADE
.When you model a database you typcially design a table and then maybe another one that has a foreign key to the first one:
Here we've modeled a list of countries and cities that reference a particular country. When you
DELETE
a country record, all cities will be deleted as well. But since the table has adeleted_at
column that is carried on in the Go struct for a country or city, the GORM mapper will only soft-delete the country and leave the belonging cities untouched.Shifting responsibility from DB to user/developer
GORM thereby puts it in the hands of the developer to (soft-)delete all dependend cities. In other words, what previously was modeled as a relationship from cities to countries is now reversed as a relationship from countries to cities. That is because the user/developer is now responsible to (soft-)delete all cities belonging to a country when the country is deleted.
Proposal
Wouldn't it be great if we can have soft-deletes and all the benefits of a
ON DELETE CASCADE
?It turns out that we can have it without much effort. Let's focus on a single table for now, namely the
countries
table.An archive table
Suppose for a second, that we can have another table called
countries_archive
that has the excact same structure as thecountries
table. Also suppose that all future schema migrations that are done tocountries
are applied to thecountries_archive
table. The only exception is that unique constraints and foreign keys will not be applied tocountries_archive
.I guess, this already sounds too good to be true, right? Well, we can create such a table using what's called Inheritenance in Postgres:
The resulting
countries_archive
table will is meant to store all records wheredeleted_at IS NOT NULL
.Note, that in our Go code we would never directly use any
_archive
table. Instead we would query for the original table from which*_archive
table inherits and Postgres then magically looks into the*_archive
table automatically. A bit further below I explain why that is; it has to do with partitioning.Moving entries to the archive table on (soft)-DELETE
Since the two tables,
countries
andcountries_archive
look exactly alike schemawise we canINSERT
into the archive very easily using a trigger function whenDELETE
happens on thecountries
tabledeleted_at
to a notNULL
value.The trigger function looks like this:
To wire-up the function with a trigger we can write:
Conclusions
Originally the inheritance functionality in postgres was developed to partition data. When you search your partitioned data using a specific column or condition, Postgres can find out which partition to search through and can thereby improve the performance of your query.
We can benefit from this performance improvement by only searching entities in existence, unless told otherwise. Entries in existence are those where
deleted_at IS NULL
holds true. (Notice, that GORM will automatically add anAND deleted_at IS NULL
to every query if there's aDeletedAt
in GORM's model struct.)Let's see if Postgres already knows how to take advantage of our separation by running an
EXPLAIN
:As we can see, Postgres still searches both tables,
countries
andcountries_archive
. Let's see what happens when we add a check constraint to thecountries_archive
table upon table creation:Now, Postgres knows that it can skip
countries_archive
whendeleted_at
is expected to beNULL
:Notice the absence of the sequential scan of the
countries_archive
table in the aforementionedEXPLAIN
.Benefits and Risks
Benefits
_archive
version of that table as well. Except for constraints, which is good.Risks
ON DELETE CASCADE
. If the existing table uses thearchive_record()
function from above, your new table will receive hardDELETE
s when something in the existing table is soft-deletes. This isn't a problem, if you usearchive_record()
for your new dependent table as well. But you just have to remember it.Final thoughts
The approach presented here does not solve the problem of restoring individual rows. On the other hand, this approach doesn't make it harder or more complicated. It just remains unsolved.
In our application certain fields of a work item don't have a foreign key specified. A good example are the area IDs. That means when an area is
DELETE
d, an associated work item is not automaticallyDELETE
d. There are two scenarios when an area is removed itself:Notice that, in the first scenario the user's requests goes through the area controller code and then through the area repository code. We have a chance in any of those layers to modify all work items that would reference a non-existing area otherwise. In the second scenario everything related to the area happens and stays on the DB layer so we have no chance of moifying the work items. The good news is that we don't have to. Every work item references a space and will therefore be deleted anyways when the space goes away.
What applies to areas also applies to iterations, labels and board columns as well.
How to apply to our database?
Steps
archive_record()
function.deleted_at IS NOT NULL
to their respective_archive
table by doing a hardDELETE
which will trigger thearchive_record()
function.Example
Here is a fully working example in which we demonstrated a cascaded soft-delete over two tables,
countries
andcapitals
. We show how records are being archived independently of the method that was chosen for the delete.我不想这么说,但触发器是专门为这种事情设计的。
(讨厌的部分是因为好的触发器很难编写,当然也无法调试)
I hate to say it but triggers are designed specifically for this kind of thing.
(The hate part is because good triggers are very hard to write and , of course , cannot be debugged)
外键约束可以进行级联更新。 如果您在键和删除标志上链接表,那么当主表中的删除标志更改时,该更改将向下传播到详细表。 我还没有尝试过,但它应该有效。
Foreign key constraints can do cascade updates. If you linked your tables on both the key and the delete flag, then when the delete flag in the master table changed, that change would propagate down to the detail table. I haven't tried that, but it should work.
我认为软删除的一个好处通常是不是每个表都有软删除标志,因此需要级联的东西数量很少。 这些行在数据库中只是未使用,但不是孤立的 - 它们只是仅由已删除的行引用。
不过,就像所有事情一样,这取决于您的模型。
I think a benefit of the soft deletes is usually that not every table has a soft-delete flag, so the number of things needed to be cascaded is few. The rows are simply unused in the database, but not orphaned - they are simply only referred to by deleted rows.
Like everything, though, it depends on your model.
不确定您在谈论什么后端,但您可以拾取“删除标志”更改并使用触发器将更改向下级联。
Not sure what backend you're talking about, but you could pickup on your "delete flag" changing and cascade the change down using a trigger.
根据 @konrad-kleine 的回答此处,我们可以创建一个视图来过滤所有未删除的记录并将其呈现给用户。 我们仍然利用 postgres 的继承来尊重“活动”表的唯一约束(如果您删除具有唯一名称的记录,您希望能够再次添加该名称),并且我们不考虑恢复记录这里。 在“活动”表具有继承表没有的约束的情况下,恢复记录将需要处理数据漂移。
但还是有一些区别:
如果需要查看已删除的记录,您可以为此创建一个特殊的 postgres 视图。
SQLFiddle 用于以下代码
Playing off of @konrad-kleine's answer here, we can create a view that filters all non deleted records and present it to the user. We still take advantage of postgres' inheritance in order to respect the "active" table's unique constraints (if you delete a record with a unique name, you expect to be able to add that name again), and we do not account for restoring records here. Restoring records will need to handle data drift in cases where the "active" table has constraints that the inherited table does not.
There are a few differences though:
If a deleted record needs to be viewed, you can create a special postgres view for that.
SQLFiddle for below code