物理与逻辑(硬与软)删除数据库记录?

发布于 2024-07-11 17:06:02 字数 1431 浏览 7 评论 0原文

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(26

じ违心 2024-07-18 17:06:02

优点是您可以保留历史记录(有利于审核),并且不必担心通过数据库中引用您要删除的行的各种其他表进行级联删除。 缺点是您必须编写任何报告/显示方法来考虑该标志。

就它是否是一种常见做法而言 - 我会说是的,但与任何事情一样,是否使用它取决于您的业务需求。

编辑:想到另一个缺点 - 如果表上有唯一索引,则删除的记录仍将占用“一个”记录,因此您也必须围绕这种可能性进行编码(例如,具有唯一索引的用户表)用户名;删除的记录仍然会阻止已删除的用户用户名来获取新记录。解决此问题,您可以在已删除的用户名列上添加 GUID,但在这种情况下,这是一个非常棘手的解决方法。最好制定一条规则,一旦使用用户名,就永远无法更换。)

Advantages are that you keep the history (good for auditing) and you don't have to worry about cascading a delete through various other tables in the database that reference the row you are deleting. Disadvantage is that you have to code any reporting/display methods to take the flag into account.

As far as if it is a common practice - I would say yes, but as with anything whether you use it depends on your business needs.

EDIT: Thought of another disadvantange - If you have unique indexes on the table, deleted records will still take up the "one" record, so you have to code around that possibility too (for example, a User table that has a unique index on username; A deleted record would still block the deleted users username for new records. Working around this you could tack on a GUID to the deleted username column, but it's a very hacky workaround that I wouldn't recommend. Probably in that circumstance it would be better to just have a rule that once a username is used, it can never be replaced.)

醉态萌生 2024-07-18 17:06:02

逻辑删除是常见做法吗? 是的,我在很多地方都看到过这种情况。 他们安全吗? 这实际上取决于它们是否比您删除数据之前的数据安全性更低?

当我担任 Tech Lead 时,我要求我们的团队保留每一条数据,当时我知道我们将使用所有这些数据来构建各种 BI 应用程序,尽管当时我们不知道要求是什么是。 虽然从审计、故障排除和报告的角度来看这很好(这是一个用于 B2B 交易的电子商务/工具网站,如果有人使用了某个工具,我们希望记录下来,即使他们的帐户后来被关闭),它确实有几个缺点。

缺点包括(不包括已经提到的其他缺点):

  1. 保留所有数据对性能的影响,我们需要制定各种归档策略。 例如,应用程序的一个区域每周接近生成约 1Gb 的数据。
  2. 保存数据的成本确实会随着时间的推移而增加,尽管磁盘空间很便宜,但用于保存和管理在线和离线数 TB 数据的基础设施数量却很大。 它需要大量磁盘来实现冗余,并且需要人们花费时间来确保备份快速移动等。

当决定使用逻辑、物理删除或归档时,我会问自己以下问题:

  1. 这些数据是否可能需要重新插入到桌子。 例如,用户帐户适合此类别,因为您可以激活或停用用户帐户。 如果是这种情况,逻辑删除最有意义。
  2. 存储数据有任何内在价值吗? 如果是这样,将生成多少数据。 根据这一点,我要么采用逻辑删除,要么实施归档策略。 请记住,您始终可以归档逻辑删除的记录。

Are logical deletes common practice? Yes I have seen this in many places. Are they secure? That really depends are they any less secure then the data was before you deleted it?

When I was a Tech Lead, I demanded that our team keep every piece of data, I knew at the time that we would be using all that data to build various BI applications, although at the time we didn't know what the requirements would be. While this was good from the standpoint of auditing, troubleshooting, and reporting (This was an e-commerce / tools site for B2B transactions, and if someone used a tool, we wanted to record it even if their account was later turned off), it did have several downsides.

The downsides include (not including others already mentioned):

  1. Performance Implications of keeping all that data, We to develop various archiving strategies. For example one area of the application was getting close to generating around 1Gb of data a week.
  2. Cost of keeping the data does grow over time, while disk space is cheap, the amount of infrastructure to keep and manage terabytes of data both online and off line is a lot. It takes a lot of disk for redundancy, and people's time to ensure backups are moving swiftly etc.

When deciding to use logical, physical deletes, or archiving I would ask myself these questions:

  1. Is this data that might need to be re-inserted into the table. For example User Accounts fit this category as you might activate or deactivate a user account. If this is the case a logical delete makes the most sense.
  2. Is there any intrinsic value in storing the data? If so how much data will be generated. Depending on this I would either go with a logical delete, or implement an archiving strategy. Keep in mind you can always archive logically deleted records.
淡淡的优雅 2024-07-18 17:06:02

可能有点晚了,但我建议大家检查 Pinal Dave 的关于逻辑/软删除的博客文章

我就是一点也不喜欢这种设计【软删除】。 我坚信这样的架构:只有必要的数据应该放在单个表中,而无用的数据应该移动到存档表中。 我建议使用两个不同的表,而不是遵循 isDeleted 列:一个包含订单,另一个包含已删除的订单。 在这种情况下,您将必须维护这两个表,但实际上,维护起来非常容易。 当您将 UPDATE 语句写入 isDeleted 列时,将 INSERT INTO 写入另一个表并将其从原始表中删除。 如果是回滚的情况,则相反的顺序再写一次INSERT INTO和DELETE。 如果您担心事务失败,请将此代码包装在 TRANSACTION 中。

在上述情况下,较小的表格与较大的表格相比有哪些优势?

  • 较小的表易于维护
  • 索引重建操作速度更快
  • 将存档数据移动到另一个文件组将减少主文件组的负载(考虑到所有文件组都位于不同的系统上) - 这也将加快备份速度。
  • 由于规模较小,统计数据会经常更新,这会减少资源占用。
  • 索引的大小将会变小
  • 表的大小越小,表的性能将会提高。

It might be a little late but I suggest everyone to check Pinal Dave's blog post about logical/soft delete:

I just do not like this kind of design [soft delete] at all. I am firm believer of the architecture where only necessary data should be in single table and the useless data should be moved to an archived table. Instead of following the isDeleted column, I suggest the usage of two different tables: one with orders and another with deleted orders. In that case, you will have to maintain both the table, but in reality, it is very easy to maintain. When you write UPDATE statement to the isDeleted column, write INSERT INTO another table and DELETE it from original table. If the situation is of rollback, write another INSERT INTO and DELETE in reverse order. If you are worried about a failed transaction, wrap this code in TRANSACTION.

What are the advantages of the smaller table verses larger table in above described situations?

  • A smaller table is easy to maintain
  • Index Rebuild operations are much faster
  • Moving the archive data to another filegroup will reduce the load of primary filegroup (considering that all filegroups are on different system) – this will also speed up the backup as well.
  • Statistics will be frequently updated due to smaller size and this will be less resource intensive.
  • Size of the index will be smaller
  • Performance of the table will improve with a smaller table size.
一江春梦 2024-07-18 17:06:02

我是一名 NoSQL 开发人员,在我的上一份工作中,我处理的数据对于某人来说始终至关重要,如果它在创建的同一天被意外删除,我将无法在上次备份中找到它从昨天! 在这种情况下,软删除总是能挽救局面。

我使用时间戳进行软删除,记录文档被删除的日期:

IsDeleted = 20150310  //yyyyMMdd

每个星期日,一个进程都会在数据库上行走并检查 IsDeleted 字段。 如果当前日期与时间戳之间的差异大于N天,则该文档被硬删除。 考虑到该文档在某些备份中仍然可用,因此这样做是安全的。

编辑:这个NoSQL用例是关于在数据库中创建的大文档,每天有数十或数百个文档,但不是数千或数百万个。 一般来说,它们是包含工作流程过程的状态、数据和附件的文档。 这就是用户有可能删除重要文档的原因。 该用户可能是具有管理员权限的人,也可能是文档的所有者,仅举几例。

TL;DR 我的用例不是大数据。 在这种情况下,您将需要采用不同的方法。

I'm a NoSQL developer, and on my last job, I worked with data that was always critical for someone, and if it was deleted by accident in the same day that was created, I were not able to find it in the last backup from yesterday! In that situation, soft deletion always saved the day.

I did soft-deletion using timestamps, registering the date the document was deleted:

IsDeleted = 20150310  //yyyyMMdd

Every Sunday, a process walked on the database and checked the IsDeleted field. If the difference between the current date and the timestamp was greater than N days, the document was hard deleted. Considering the document still be available on some backup, it was safe to do it.

EDIT: This NoSQL use case is about big documents created in the database, tens or hundreds of them every day, but not thousands or millions. By general, they were documents with the status, data and attachments of workflow processes. That was the reason why there was the possibility of a user deletes an important document. This user could be someone with Admin privileges, or maybe the document's owner, just to name a few.

TL;DR My use case was not Big Data. In that case, you will need a different approach.

一杯敬自由 2024-07-18 17:06:02

我使用的一种模式是创建一个镜像表并在主表上附加一个触发器,因此所有删除(以及需要时的更新)都记录在镜像表中。

这允许您“重建”已删除/更改的记录,并且您仍然可以在主表中进行硬删除并保持其“干净” - 它还允许创建“撤消”功能,并且您还可以记录日期、时间,以及在镜子表中执行该操作的用户(在政治迫害情况下非常有用)。

另一个优点是,在查询主表时不会意外地包含已删除的记录,除非您故意麻烦地包含镜像表中的记录(您可能希望显示活动记录和已删除记录)。

另一个优点是镜像表可以独立清除,因为它不应该有任何实际的外键引用,这使得与从使用软删除但仍然与其他表有引用连接的主表中清除相比,这是一个相对简单的操作。

还有什么其他优势? - 如果您有一群编码人员致力于该项目,以混合技能和对细节级别的关注来读取数据库,那就太好了,您不必熬夜希望其中一个没有忘记不包括已删除的记录(哈哈,不包括已删除的记录= True),这会导致诸如夸大客户可用现金头寸之类的事情,然后他们用这些现金头寸购买一些股票(即,就像在交易中一样)系统),当您使用交易系统时,您会很快发现强大解决方案的价值,即使它们可能有更多的初始“开销”。

例外:
- 作为指导,对“参考”数据(例如用户、类别等)使用软删除,对“事实”类型数据(即交易历史)的镜像表使用硬删除。

One pattern I have used is to create a mirror table and attach a trigger on the primary table, so all deletes (and updates if desired) are recorded in the mirror table.

This allows you to "reconstruct" deleted/changed records, and you can still hard delete in the primary table and keep it "clean" - it also allows the creation of an "undo" function, and you can also record the date, time, and user who did the action in the mirror table (invaluable in witch hunt situations).

The other advantage is there is no chance of accidentally including deleted records when querying off the primary unless you deliberately go to the trouble of including records from the mirror table (you may want to show live and deleted records).

Another advantage is that the mirror table can be independently purged, as it should not have any actual foreign key references, making this a relatively simple operation in comparison to purging from a primary table that uses soft deletes but still has referential connections to other tables.

What other advantages? - great if you have a bunch of coders working on the project, doing reads on the database with mixed skill and attention to detail levels, you don't have to stay up nights hoping that one of them didn’t forget to not include deleted records (lol, Not Include Deleted Records = True), which results in things like overstating say the clients available cash position which they then go buy some shares with (i.e., as in a trading system), when you work with trading systems, you will find out very quickly the value of robust solutions, even though they may have a little bit more initial "overhead".

Exceptions:
- as a guide, use soft deletes for "reference" data such as user, category, etc, and hard deletes to a mirror table for "fact" type data, i.e., transaction history.

为你鎻心 2024-07-18 17:06:02

我曾经做过软删除,只是为了保留旧记录。 我意识到用户并不像我想象的那样频繁地查看旧记录。 如果用户想查看旧记录,他们可以从存档或审计表中查看,对吧? 那么,软删除有什么好处呢? 它只会导致更复杂的查询语句等。

在我决定不再软删除之前,以下是我已经实现的内容:

  1. 实现审计,以记录所有活动(添加、编辑、删除)。 确保没有外键链接到审计,并确保此表是安全的,除了管理员之外没有人可以删除。

  2. 确定哪些表被视为“事务表”,这些表很可能会保留很长时间,并且很可能用户可能希望查看过去的记录或报告。 例如; 购买交易。 该表不应只保留主表的 id(例如 dept-id),还应保留附加信息,例如作为参考的名称(例如 dept-name),或任何其他用于报告的必要字段。

  3. 实现主表的“活动/非活动”或“启用/禁用”或“隐藏/显示”记录。 因此,用户可以禁用/停用主记录,而不是删除记录。 这样更安全。

只是我的两分钱意见。

I used to do soft-delete, just to keep old records. I realized that users don't bother to view old records as often as I thought. If users want to view old records, they can just view from archive or audit table, right? So, what's the advantage of soft-delete? It only leads to more complex query statement, etc.

Following are the things i've implemented, before I decided to not-soft-delete anymore:

  1. implement audit, to record all activities (add,edit,delete). Ensure that there's no foreign key linked to audit, and ensure this table is secured and nobody can delete except administrators.

  2. identify which tables are considered "transactional table", which very likely that it will be kept for long time, and very likely user may want to view the past records or reports. For example; purchase transaction. This table should not just keep the id of master table (such as dept-id), but also keep the additional info such as the name as reference (such as dept-name), or any other necessary fields for reporting.

  3. Implement "active/inactive" or "enable/disable" or "hide/show" record of master table. So, instead of deleting record, the user can disable/inactive the master record. It is much safer this way.

Just my two cents opinion.

白馒头 2024-07-18 17:06:02

我非常喜欢逻辑删除,特别是对于业务线应用程序或用户帐户的上下文。 我的原因很简单:很多时候我不希望用户能够再使用系统(因此帐户被标记为已删除),但如果我们删除用户,我们就会丢失他们的所有工作等。

另一种常见情况是用户可能会在删除一段时间后重新创建。 对于用户来说,让所有数据保持删除前的状态,而不是重新创建数据,这是一种更好的体验。

我通常认为删除用户更多的是无限期地“暂停”他们。 你永远不知道他们什么时候需要回来。

I'm a big fan of the logical delete, especially for a Line of Business application, or in the context of user accounts. My reasons are simple: often times I don't want a user to be able to use the system anymore (so the account get's marked as deleted), but if we deleted the user, we'd lose all their work and such.

Another common scenario is that the users might get re-created a while after having been delete. It's a much nicer experience for the user to have all their data present as it was before they were deleted, rather than have to re-create it.

I usually think of deleting users more as "suspending" them indefinitely. You never know when they'll legitimately need to be back.

☆獨立☆ 2024-07-18 17:06:02

我几乎总是软删除,原因如下:

  • 如果客户要求您这样做,您可以恢复已删除的数据。 通过软删除让客户更加满意。 从备份中恢复特定数据非常复杂,
  • 在任何地方检查 isdeleted 都不是问题,无论如何,您都必须检查 userid(如果数据库包含来自多个用户的数据)。 您可以通过代码强制执行检查,方法是将这两项检查放在单独的函数(或使用视图)的
  • 优雅删除上。 处理已删除内容的用户或进程将继续“看到”它,直到他们进行下一次刷新。 如果一个进程正在处理一些突然删除的数据
  • 同步,这是一个非常理想的功能:如果您需要设计数据库和移动应用程序之间的同步机制,您会发现软删除更容易实现

I almost always soft delete and here's why:

  • you can restore deleted data if a customer asks you to do so. More happy customers with soft deletes. Restoring specific data from backups is complex
  • checking for isdeleted everywhere is not an issue, you have to check for userid anyway (if the database contains data from multiple users). You can enforce the check by code, by placing those two checks on a separate function (or use views)
  • graceful delete. Users or processes dealing with deleted content will continue to "see" it until they hit the next refresh. This is a very desirable feature if a process is processing some data which is suddenly deleted
  • synchronization: if you need to design a synchronization mechanism between a database and mobile apps, you'll find soft deletes much easier to implement
傲娇萝莉攻 2024-07-18 17:06:02

我通常使用逻辑删除 - 我发现当您还间歇性地将“已删除”数据归档到归档表(如果需要可以搜索)时,它们效果很好,因此不会影响应用程序的性能。

它运作良好,因为如果您接受审核,您仍然拥有数据。 如果您物理删除它,它就消失了

I commonly use logical deletions - I find they work well when you also intermittently archive off the 'deleted' data to an archived table (which can be searched if needed) thus having no chance of affecting the performance of the application.

It works well because you still have the data if you're ever audited. If you delete it physically, it's gone!

旧故 2024-07-18 17:06:02

回复:“这安全吗?” - 这取决于你的意思。

如果您的意思是通过执行物理删除,您将阻止任何人找到已删除的数据,那么是的,这或多或少是正确的; 物理删除需要擦除的敏感数据会更安全,因为这意味着它会从数据库中永久消失。 (但是,请意识到可能存在相关数据的其他副本,例如备份、事务日志或传输中的记录版本,例如数据包嗅探器 - 仅仅因为您从数据库中删除并不意味着保证它没有保存在其他地方。)

如果您的意思是通过执行逻辑删除,您的数据会更安全,因为您永远不会丢失任何数据,这也是事实。 这对于审计场景很有好处; 我倾向于这样设计,因为它承认一个基本事实,即一旦生成数据,它就永远不会真正消失(特别是如果它具有被互联网搜索缓存的能力)引擎)。 当然,真正的审计场景不仅要求删除符合逻辑,还要求记录更新,以及更改的时间和进行更改的参与者。

如果您的意思是数据不会落入任何不应该看到它的人手中,那么这完全取决于您的应用程序及其安全结构。 在这方面,逻辑删除并不比数据库中的其他任何内容更安全或更不安全。

Re: "Is this secure?" - that depends on what you mean.

If you mean that by doing physical delete, you'll prevent anyone from ever finding the deleted data, then yes, that's more or less true; you're safer in physically deleting the sensitive data that needs to be erased, because that means it's permanently gone from the database. (However, realize that there may be other copies of the data in question, such as in a backup, or the transaction log, or a recorded version from in transit, e.g. a packet sniffer - just because you delete from your database doesn't guarantee it wasn't saved somewhere else.)

If you mean that by doing logical delete, your data is more secure because you'll never lose any data, that's also true. This is good for audit scenarios; I tend to design this way because it admits the basic fact that once data is generated, it'll never really go away (especially if it ever had the capability of being, say, cached by an internet search engine). Of course, a real audit scenario requires that not only are deletes logical, but that updates are also logged, along with the time of the change and the actor who made the change.

If you mean that the data won't fall into the hands of anyone who isn't supposed to see it, then that's totally up to your application and its security structure. In that respect, logical delete is no more or less secure than anything else in your database.

情感失落者 2024-07-18 17:06:02

逻辑删除对于引用完整性来说是困难的。

当表数据存在时间方面(从日期到日期有效)时,这是正确的想法。

否则将数据移动到审核表并删除记录。

从好的方面来说:

这是更简单的回滚方法(如果可能的话)。

很容易看出特定时间点的状态。

Logical deletions if are hard on referential integrity.

It is the right think to do when there is a temporal aspect of the table data (are valid FROM_DATE - TO_DATE).

Otherwise move the data to an Auditing Table and delete the record.

On the plus side:

It is the easier way to rollback (if at all possible).

It is easy to see what was the state at a specific point in time.

梦归所梦 2024-07-18 17:06:02

我强烈不同意逻辑删除,因为你会遇到很多错误。

首先是查询,每个查询都必须注意IsDeleted字段,复杂查询出错的可能性会更高。

其次是性能:想象一个有 100000 个记录的表,只有 3 个活动记录,现在将这个数字乘以数据库的表; 另一个性能问题是新记录与旧记录(已删除记录)可能发生冲突。

我看到的唯一优点是记录的历史记录,但是还有其他方法可以实现此结果,例如您可以创建一个可以保存信息的日志记录表:TableName,OldValues,NewValues,Date,User ,[..] 其中 *Values 可以是 varchar 并以这种形式写入详细信息 fieldname : value ; [..] 或将信息存储为 xml

所有这些都可以通过代码或触发器来实现,但您只有一个包含所有历史记录的表。
另一个选项是查看指定的数据库引擎是否本机支持跟踪更改,例如在 SQL Server 数据库上有 SQL Track Data Change。

I strongly disagree with logical delete because you are exposed to many errors.

First of all queries, each query must take care the IsDeleted field and the possibility of error becomes higher with complex queries.

Second the performance: imagine a table with 100000 recs with only 3 active, now multiply this number for the tables of your database; another performance problem is a possible conflict with new records with old (deleted records).

The only advantage I see is the history of records, but there are other methods to achieve this result, for example you can create a logging table where you can save info: TableName,OldValues,NewValues,Date,User,[..] where *Values ​​can be varchar and write the details in this form fieldname : value; [..] or store the info as xml.

All this can be achieved via code or Triggers but you are only ONE table with all your history.
Another options is to see if the specified database engine are native support for tracking change, for example on SQL Server database there are SQL Track Data Change.

筱武穆 2024-07-18 17:06:02

除了系统设计之外,还有一些需求需要满足。 记录保留的法律或法规要求是什么? 根据行的相关内容,法律可能要求数据在“暂停”后保留一段时间。

另一方面,要求可能是一旦记录被“删除”,它就是真正的、不可撤销的删除。 在做出决定之前,请与利益相关者交谈。

There are requirements beyond system design which need to be answered. What is the legal or statutory requirement in the record retention? Depending on what the rows are related to, there may be a legal requirement that the data be kept for a certain period of time after it is 'suspended'.

On the other hand, the requirement may be that once the record is 'deleted', it is truly and irrevocably deleted. Before you make a decision, talk to your stakeholders.

夏天碎花小短裙 2024-07-18 17:06:02

如果您想保留某些内容的历史记录(例如 @Jon Dewees 提到的用户帐户),这是相当标准的。 如果用户很有可能要求取消删除,这当然是个好主意。

如果您担心从查询中过滤掉已删除记录的逻辑会变得混乱并且只会使查询变得复杂,那么您可以构建为您执行过滤的视图并针对该视图使用查询。 它将防止这些记录在报告解决方案等中泄露。

It's fairly standard in cases where you'd like to keep a history of something (e.g. user accounts as @Jon Dewees mentions). And it's certainly a great idea if there's a strong chance of users asking for un-deletions.

If you're concerned about the logic of filtering out the deleted records from your queries getting messy and just complicating your queries, you can just build views that do the filtering for you and use queries against that. It'll prevent leakage of these records in reporting solutions and such.

你又不是我 2024-07-18 17:06:02

依赖于同步的移动应用程序可能会强制使用逻辑删除而不是物理删除:服务器必须能够向客户端指示记录已被(标记为)删除,如果记录被物理删除,则这可能是不可能的。

Mobile apps that depend on synchronisation might impose the use of logical rather than physical delete: a server must be able to indicate to the client that a record has been (marked as) deleted, and this might not be possible if records were physically deleted.

泪痕残 2024-07-18 17:06:02

我只是想扩展提到的独特约束问题。

假设我有一个包含两列的表:idmy_column。 为了支持软删除,我需要将表定义更新为:

create table mytable (
  id serial primary key,
  my_column varchar unique not null,
  deleted_at datetime
)

但是如果一行被软删除,我希望忽略 my_column 约束,因为已删除的数据不应干扰未删除的数据。 我的原始模型将无法工作。

我需要将我的数据定义更新为:

create table mytable (
  id serial primary key,
  my_column varchar not null,
  my_column_repetitions integer not null default 0,
  deleted_at datetime,
  unique (my_column, my_column_repetitions),
  check (deleted_at is not null and my_column_repetitions > 0 or deleted_at is null and my_column_repetitions = 0)
)

并应用此逻辑:当一行是当前行(即未删除)时,my_column_repetitions 应保留默认值0,并且当行被软删除,其 my_column_repetitions 需要更新为 (软删除行的最大重复次数)+ 1。

后一个逻辑必须通过编程方式实现在我的应用程序代码中触发或处理,并且没有我可以设置的检查。

对每个独特的列重复此操作!

我认为这个解决方案确实很hacky,并且倾向于使用单独的归档表来存储已删除的行。

I just wanted to expand on the mentioned unique constraint problem.

Suppose I have a table with two columns: id and my_column. To support soft-deletes I need to update my table definition to this:

create table mytable (
  id serial primary key,
  my_column varchar unique not null,
  deleted_at datetime
)

But if a row is soft-deleted, I want my_column constraint to be ignored, because deleted data should not interfere with non-deleted data. My original model will not work.

I would need to update my data definition to this:

create table mytable (
  id serial primary key,
  my_column varchar not null,
  my_column_repetitions integer not null default 0,
  deleted_at datetime,
  unique (my_column, my_column_repetitions),
  check (deleted_at is not null and my_column_repetitions > 0 or deleted_at is null and my_column_repetitions = 0)
)

And apply this logic: when a row is current, i.e. not deleted, my_column_repetitions should hold the default value 0 and when the row is soft-deleted its my_column_repetitions needs to be updated to (max. number of repetitions on soft-deleted rows) + 1.

The latter logic must be implemented programmatically with a trigger or handled in my application code and there is no check that I could set.

Repeat this is for every unique column!

I think this solution is really hacky and would favor a separate archive table to store deleted rows.

寂寞花火° 2024-07-18 17:06:02

他们不让数据库按照应有的方式执行,从而使级联功能等功能变得无用。

对于插入等简单的事情,在重新插入的情况下,其背后的代码会加倍。

您不能只是简单地插入,而是必须检查是否存在,如果之前不存在则插入,或者如果存在则更新删除标志,同时还将所有其他列更新为新值。 这被视为对数据库事务日志的更新,而不是导致审核日志不准确的新插入。

它们会导致性能问题,因为表中充满了冗余数据。 它对索引尤其是唯一性造成了严重破坏。

我不太喜欢逻辑删除。

They don't let the database perform as it should rendering such things as the cascade functionality useless.

For simple things such as inserts, in the case of re-inserting, then the code behind it doubles.

You can't just simply insert, instead you have to check for an existence and insert if it doesn't exist before or update the deletion flag if it does whilst also updating all other columns to the new values. This is seen as an update to the database transaction log and not a fresh insert causing inaccurate audit logs.

They cause performance issues because tables are getting glogged with redundant data. It plays havock with indexing especially with uniqueness.

I'm not a big fan of logical deletes.

风尘浪孓 2024-07-18 17:06:02

为了回复 Tohid 的评论,我们遇到了同样的问题,我们想要保留记录的历史记录,而且我们不确定是否需要 is_deleted 列。

我正在谈论我们的 python 实现以及我们遇到的类似用例。

我们遇到了 https://github.com/kvesteri/sqlalchemy-continuum 这是一个简单的方法获取相应表的版本控制表。 最少的代码行数并捕获添加、删除和更新的历史记录。

这不仅仅是 is_deleted 列。 您始终可以反向引用版本表来检查此条目发生了什么。 条目是否被删除、更新或添加。

这样我们就根本不需要 is_deleted 列,而且我们的删除函数也非常简单。 这样我们也不需要记住在任何 api 中标记 is_deleted=False

To reply to Tohid's comment, we faced same problem where we wanted to persist history of records and also we were not sure whether we wanted is_deleted column or not.

I am talking about our python implementation and a similar use-case we hit.

We encountered https://github.com/kvesteri/sqlalchemy-continuum which is an easy way to get versioning table for your corresponding table. Minimum lines of code and captures history for add, delete and update.

This serves more than just is_deleted column. You can always backref version table to check what happened with this entry. Whether entry got deleted, updated or added.

This way we didn't need to have is_deleted column at all and our delete function was pretty trivial. This way we also don't need to remember to mark is_deleted=False in any of our api's.

七堇年 2024-07-18 17:06:02

软删除是一种编程实践,当数据更相关时,大多数应用程序都会遵循这种编程实践。 考虑一个金融应用程序的情况,其中最终用户错误的删除可能是致命的。
当软删除变得相关时就是这种情况。 在软删除中,用户实际上并没有从记录中删除数据,而是将其标记为 IsDeleted 为 true(按照正常惯例)。

在 EF 6.x 或 EF 7 及更高版本中,Softdelete 作为属性添加,但我们现在必须暂时创建一个自定义属性。

我强烈推荐在数据库设计中使用 SoftDelete,它是编程实践的一个很好的约定。

Soft Delete is a programming practice that being followed in most of the application when data is more relevant. Consider a case of financial application where a delete by the mistake of the end user can be fatal.
That is the case when soft delete becomes relevant. In soft delete the user is not actually deleting the data from the record instead its being flagged as IsDeleted to true (By normal convention).

In EF 6.x or EF 7 onward Softdelete is Added as an attribute but we have to create a custom attribute for the time being now.

I strongly recommend SoftDelete In a database design and its a good convention for the programming practice.

踏雪无痕 2024-07-18 17:06:02

大多数时候使用软删除是因为您不想公开某些数据,但由于历史原因必须保留它(产品可能会停产,因此您不希望使用它进行任何新事务,但您仍然需要使用销售交易的历史记录)。 顺便说一句,有些人复制销售交易数据中的产品信息值,而不是引用产品来处理此问题。

事实上,它看起来更像是对可见/隐藏或活动/非活动功能的重新措辞。 因为这就是商业世界中“删除”的含义。 我想说终结者可能会删除人,但老板只是解雇他们。

这种做法是非常常见的模式,并且由于多种原因被许多应用程序使用。 因为这不是实现这一目标的唯一方法,所以会有成千上万的人说这很棒或胡说八道,而且两者都有很好的论据。

从安全的角度来看,SoftDelete不会取代审计的工作,也不会取代备份的工作。 如果您害怕“两个备份案例之间的插入/删除”,您应该阅读有关完整或批量恢复模型的内容。 我承认 SoftDelete 可以使恢复过程变得更加简单。

由您来了解您的要求。

Most of time softdeleting is used because you don't want to expose some data but you have to keep it for historical reasons (A product could become discontinued, so you don't want any new transaction with it but you still need to work with the history of sale transaction). By the way, some are copying the product information value in the sale transaction data instead of making a reference to the product to handle this.

In fact it looks more like a rewording for a visible/hidden or active/inactive feature. Because that's the meaning of "delete" in business world. I'd like to say that Terminators may delete people but boss just fire them.

This practice is pretty common pattern and used by a lot of application for a lot of reasons. As It's not the only way to achieve this, so you will have thousand of people saying that's great or bullshit and both have pretty good arguments.

From a point of view of security, SoftDelete won't replace the job of Audit and it won't replace the job of backup too. If you are afraid of "the insert/delete between two backup case", you should read about Full or Bulk recovery Models. I admit that SoftDelete could make the recovery process more trivial.

Up to you to know your requirement.

你没皮卡萌 2024-07-18 17:06:02

为了提供替代方案,我们让用户通过 MobiLink 使用远程设备更新。 如果我们删除服务器数据库中的记录,这些记录永远不会在客户端数据库中被标记为已删除。

所以我们两者都做。 我们与客户合作,确定他们希望恢复数据的时间。 例如,通常客户和产品在我们的客户说应该删除之前都是活跃的,但销售历史记录仅保留 13 个月,然后自动删除。 客户可能希望将已删除的客户和产品保留两个月,但将历史记录保留六个月。

因此,我们在一夜之间运行一个脚本,根据这些参数标记逻辑删除的内容,然后两/六个月后,今天标记为逻辑删除的任何内容都将被硬删除。

我们更关心的是在内存有限的客户端设备(例如智能手机)上拥有庞大的数据库,而不是数据安全。 一个连续四年每周两次订购 200 种产品的客户将拥有超过 81,000 行历史记录,其中 75% 的客户并不关心他是否看到。

To give an alternative, we have users using remote devices updating via MobiLink. If we delete records in the server database, those records never get marked deleted in the client databases.

So we do both. We work with our clients to determine how long they wish to be able to recover data. For example, generally customers and products are active until our client say they should be deleted, but history of sales is only retained for 13 months and then deletes automatically. The client may want to keep deleted customers and products for two months but retain history for six months.

So we run a script overnight that marks things logically deleted according to these parameters and then two/six months later, anything marked logically deleted today will be hard deleted.

We're less about data security than about having enormous databases on a client device with limited memory, such as a smartphone. A client who orders 200 products twice a week for four years will have over 81,000 lines of history, of which 75% the client doesn't care if he sees.

一梦浮鱼 2024-07-18 17:06:02

这完全取决于系统及其数据的用例。

例如,如果您正在谈论政府监管的系统(例如,制药公司的系统被视为质量体系的一部分,并且必须遵循 FDA 的电子记录指南),那么您最好不要进行硬删除! FDA 的审计员可以进来并询问系统中与产品编号 ABC-123 相关的所有记录,并且所有数据最好都可用。 如果您的业务流程所有者表示系统不应允许任何人在未来的新记录上使用产品编号 ABC-123,请使用软删除方法使其在系统内“不活动”,同时仍保留历史数据。

然而,也许您的系统及其数据有一个用例,例如“跟踪北极的天气”。 也许您每小时读取一次温度读数,并在一天结束时汇总每日平均值。 也许每小时数据在聚合后将不再使用,并且您在创建聚合后很难删除每小时读数。 (这是一个虚构的、微不足道的例子。)

关键是,这一切都取决于系统及其数据的用例,而不是纯粹从技术角度做出的决定。

It all depends on the use case of the system and its data.

For example, if you are talking about a government regulated system (e.g. a system at a pharmaceutical company that is considered a part of the quality system and must follow FDA guidelines for electronic records), then you darned well better not do hard deletes! An auditor from the FDA can come in and ask for all records in the system relating to product number ABC-123, and all data better be available. If your business process owner says the system shouldn't allow anyone to use product number ABC-123 on new records going forward, use the soft-delete method instead to make it "inactive" within the system, while still preserving historical data.

However, maybe your system and its data has a use case such as "tracking the weather at the North Pole". Maybe you take temperature readings once every hour, and at the end of the day aggregate a daily average. Maybe the hourly data will no longer ever be used after aggregation, and you'd hard-delete the hourly readings after creating the aggregate. (This is a made-up, trivial example.)

The point is, it all depends on the use case of the system and its data, and not a decision to be made purely from a technological standpoint.

伪装你 2024-07-18 17:06:02

出色地! 大家都说了,要看情况。

如果您在 UserName 或 EmailID 之类的列上有索引 - 并且您从不希望再次使用相同的 UserName 或 EmailID; 你可以进行软删除。

也就是说,请务必检查您的 SELECT 操作是否使用主键。 如果您的 SELECT 语句使用主键,则使用 WHERE 子句添加标志不会产生太大影响。 让我们举个例子(伪):

Table Users (UserID [主键], EmailID, IsDeleted)

SELECT * FROM Users where UserID = 123456 and IsDeleted = 0

由于 UserID 列具有主键,因此该查询不会对性能产生任何影响。 最初,它会根据 PK 扫描表,然后执行下一个条件。

软删除根本不起作用的情况:

几乎所有网站的注册都以EmailID作为您的唯一标识。 我们很清楚,一个EmailID一旦在facebook、G+等网站上被使用,就不能再被其他人使用。

有一天,用户想要从网站上删除他/她的个人资料。 现在,如果您进行逻辑删除,该用户将无法再次注册。 此外,使用相同的 EmailID 再次注册并不意味着恢复整个历史记录。 大家都知道,删除就是删除。 在这种情况下,我们必须进行物理删除。 但为了维护帐户的整个历史记录,我们应该始终将这些记录存档在存档表或已删除表中。

是的,在我们有很多外国表的情况下,处理是相当麻烦的。

另请记住,软/逻辑删除会增加表大小,即索引大小。

Well! As everyone said, it depends on the situation.

If you have an index on a column like UserName or EmailID - and you never expect the same UserName or EmailID to be used again; you can go with a soft delete.

That said, always check if your SELECT operation uses the primary key. If your SELECT statement uses a primary key, adding a flag with the WHERE clause wouldn't make much difference. Let's take an example (Pseudo):

Table Users (UserID [primary key], EmailID, IsDeleted)

SELECT * FROM Users where UserID = 123456 and IsDeleted = 0

This query won't make any difference in terms of performance since the UserID column has a primary key. Initially, it will scan the table based on PK and then execute the next condition.

Cases where soft deletes cannot work at all:

Sign-up in majorly all websites take EmailID as your unique identification. We know very well, once an EmailID is used on a website like facebook, G+, it cannot be used by anyone else.

There comes a day when the user wants to delete his/her profile from the website. Now, if you make a logical delete, that user won't be able to register ever again. Also, registering again using the same EmailID wouldn't mean to restore the entire history. Everyone knows, deletion means deletion. In such scenarios, we have to make a physical delete. But in order to maintain the entire history of the account, we should always archive such records in either archive tables or deleted tables.

Yes, in situations where we have lots of foreign tables, handling is quite cumbersome.

Also keep in mind that soft/logical deletes will increase your table size, so the index size.

等风来 2024-07-18 17:06:02

我已经在另一篇文章中回答过。
不过,我认为我的回答更适合这里的问题。

我的软删除实用解决方案是通过创建新的归档
包含以下列的表:original_idtable_namepayload
(以及可选的主键“id”)。

其中original_id是删除记录的原始id,table_name
是已删除记录的表名称(在您的情况下是“user”),
payload 是来自已删除的所有列的 JSON 字符串化字符串
记录一下。

我还建议在 original_id 列上为后者创建索引
数据检索。

通过这种方式归档数据。 您将拥有这些优势

  • 跟踪历史记录中的所有数据
  • 只有一个位置可以存档任何表中的记录,无论已删除记录的表结构如何
  • 不用担心原表中的唯一索引
  • 不用担心检查原表中的外部索引
  • 每个查询中不再需要检查删除的 WHERE 子句

这已经是一个讨论
此处解释原因
软删除在实践中并不是一个好主意。 软删除介绍
未来一些潜在的问题,例如计数记录,...

I have already answered in another post.
However, I think my answer more fit to the question here.

My practical solution for soft-delete is archiving by creating a new
table with following columns: original_id, table_name, payload,
(and an optional primary key `id).

Where original_id is the original id of deleted record, table_name
is the table name of the deleted record ("user" in your case),
payload is JSON-stringified string from all columns of the deleted
record.

I also suggest making an index on the column original_id for latter
data retrievement.

By this way of archiving data. You will have these advantages

  • Keep track of all data in history
  • Have only one place to archive records from any table, regardless of the deleted record's table structure
  • No worry of unique index in the original table
  • No worry of checking foreign index in the original table
  • No more WHERE clause in every query to check for deletion

The is already a discussion
here explaining why
soft-deletion is not a good idea in practice. Soft-delete introduces
some potential troubles in future such as counting records, ...

自由如风 2024-07-18 17:06:02

这取决于具体情况,请考虑以下因素:

通常,您不需要“软删除”记录。
保持简单快速。
例如,删除不再可用的产品,这样您就不必检查该产品是否在整个应用程序中被软删除(计数、产品列表、推荐产品等)。

但是,您可能会考虑数据仓库模型中的“软删除”。 例如您正在查看已删除产品的旧收据。*

It depends on the case, consider the below:

Usually, you don't need to "soft-delete" a record.
Keep it simple and fast.
e.g. Deleting a product no longer available, so you don't have to check the product isn't soft-deleted all over your app (count, product list, recommended products, etc.).

Yet, you might consider the "soft-delete" in a data warehouse model. e.g. You are viewing an old receipt on a deleted product.*

长亭外,古道边 2024-07-18 17:06:02

优点是数据保存/永久化。 缺点是从具有大量软删除的表中查询或检索数据时性能会下降。

在我们的例子中,我们使用两者的组合:正如其他人在之前的答案中提到的那样,我们软删除例如用户/客户/客户,以及硬删除删除 items/products/merchandise 表中存在不需要保留的重复记录。

Advantages are data preservation/perpetuation. A disadvantage would be a decrease in performance when querying or retrieving data from tables with significant number of soft deletes.

In our case we use a combination of both: as others have mentioned in previous answers, we soft-delete users/clients/customers for example, and hard-delete on items/products/merchandise tables where there are duplicated records that don't need to be kept.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文