RESTful undelete

Posted on 2024-12-23 02:39:07

It is a fairly common requirement to support undeletes or delayed/batched deletions for data services. What I'm wondering is how to implement this in a RESTful way. I'm torn between a few different options (none of which seems terribly attractive to me). Common across these different options, I believe, is the need for an API which returns all resources marked as deleted for a particular resource type.

Here are some options I've thought about and some of their pros/cons:

Options to mark resource as deleted:

Use HTTP DELETE to mark the resource as deleted.
Use HTTP PUT/POST to update deleted flag. This doesn't feel right since it maps what is essentially a deletion away from the HTTP DELETE method and into other HTTP methods.

Options when GET-ing resource marked for deletion:

Return HTTP Status 404 for a resource marked as deleted. Clean & transparent, but how do we tell the difference between a resource that is really deleted and one that is just marked as deleted?
Return HTTP Status 410. Provides a way to tell the difference, but 410 technically says it "is expected to be considered permanent. Clients with link editing capabilities SHOULD delete references to the Request-URI after user approval." There may be enough wiggle room in the words "expected" and "SHOULD" here. Not sure how well 410 is supported/understood out there in clients.
Return HTTP Status 200 and include a flag field indicating the resource is deleted. This seems weird, since the idea of deleting it in the first place was that you actually wanted it not to appear. This pushes responsibility for filtering out deleted resources down to clients.

Options for responses which include this deleted resource:

Omit the resources marked as deleted. Clean & simple. But what if you actually want to know about deleted resources?
Include them along with field indicating that they are deleted. This pushes responsibility for filtering out deleted resources down to clients. It makes pagination tricky if you want to only page through active or deleted resources.

Options when updating a resource marked for deletion:

Use HTTP Status 404. The resource is gone, right? But how can you tell the difference between a resource marked as deleted and one actually deleted? The HTTP body in the 404 response could disambiguate here, but then clients are left with parsing/interpreting your body to disambiguate. Maybe a response header might help here? Which one? A custom header?
Use HTTP Status 409 with message about how resource must first be undeleted.

Options to undelete resource marked for deletion:

Use HTTP PUT/POST for the update operation of the resource and mark it as active again. This only works as long as you're not returning an HTTP 404 for the GET operation for the resource, since it doesn't make sense to PUT/POST to a resource that is "Not found" (404).
Use HTTP PUT/POST for the creation operation of the resource. The problem here is which data takes precedence: the data sent up in the create operation, or the data that is being undeleted? One variant is to filter the deleted resource out of any other queries that would have returned it, and then treat the HTTP PUT/POST that creates the resource as an undelete if the resource identifier points to a resource marked as deleted.
Separate REST path dedicated to undelete resources marked for deletion.

This is by no means an exhaustive list. I just wanted to enumerate some of the options that have been bouncing around in my head.

I know the answer to how to do this is, as usual, "it depends". What I'm curious about is what qualifications/requirements would you use to make your decision? How have you seen this implemented or implemented it yourself?

牵强ㄟ 2024-12-30 02:39:07

Going by the book: RFC 2616-9.7:

  The DELETE method requests that the origin server delete the resource 
  identified by the Request-URI. This method MAY be overridden by human 
  intervention (or other means) on the origin server. The client cannot
  be guaranteed that the operation has been carried out, even if the 
  status code returned from the origin server indicates that the action
  has been completed successfully. However, the server SHOULD NOT 
  indicate success unless, at the time the response is given, it intends
  to delete the resource or move it to an inaccessible location.

When you DELETE a resource, the server should mark the resource for deletion on its side. It doesn't really have to delete the resource; it just can't give any guarantee that the operation has been carried out. Even so, the server shouldn't say it's been deleted when it hasn't.

  A successful response SHOULD be 200 (OK) if the response includes an entity
  describing the status, 202 (Accepted) if the action has not yet been enacted,
  or 204 (No Content) if the action has been enacted but the response does not
  include an entity.

If the operation is delayed, send a 202 and an entity body describing the result of the action. (Think of a poll-able "task" representing the server's deferred deletion of the resource; it could theoretically leave it in that state forever.) All the server has to do is prevent the client from retrieving the resource again in its original form. Use a 410 as the response code for those requests, and when the "task" finishes or the server otherwise deletes the resource, return a 404.

However, if a DELETE's semantics don't make sense for the resource in question, perhaps it's not a deletion you're looking for, but an additional state transition that alters the resource state but keeps it accessible? In that case, use a PUT/PATCH to update the resource and be done.
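
The lifecycle described above (202 on DELETE, 410 while the deletion is pending, 404 once it has been carried out) could be sketched roughly as follows. This is only an illustration: the `Store` class, its method names, and the in-memory dictionary are assumptions made for the example, not part of any real framework.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PENDING_DELETE = "pending_delete"  # DELETE accepted, not yet enacted


class Store:
    """Hypothetical in-memory stand-in for the origin server's resources."""

    def __init__(self):
        self._items = {}  # uri -> [State, body]

    def put(self, uri, body):
        self._items[uri] = [State.ACTIVE, body]

    def delete(self, uri):
        """Handle DELETE: defer the real removal and answer 202 Accepted."""
        entry = self._items.get(uri)
        if entry is None:
            return 404, None
        entry[0] = State.PENDING_DELETE
        return 202, {"task": "deletion pending", "resource": uri}

    def get(self, uri):
        """Handle GET: 200 while active, 410 while pending, 404 once purged."""
        entry = self._items.get(uri)
        if entry is None:
            return 404, None
        state, body = entry
        if state is State.PENDING_DELETE:
            return 410, None
        return 200, body

    def purge(self):
        """The deferred 'task': physically remove everything marked for deletion."""
        pending = [u for u, (s, _) in self._items.items() if s is State.PENDING_DELETE]
        for uri in pending:
            del self._items[uri]
```

An undelete would then amount to flipping the state back to `ACTIVE` at any point before `purge()` runs; how that is exposed over HTTP is exactly what the rest of this thread debates.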

吾家有女初长成 2024-12-30 02:39:07

The Short Version

You cannot RESTfully undelete a resource using any method on its original URI. It's illogical, because any operation attempted on a resource that has been deleted should return either a 404 or a 410. While this is not explicitly stated in the spec, it's strongly implied in the definition of the DELETE method:

In effect, this method is similar to the rm command in UNIX: it
expresses a deletion operation on the URI mapping of the origin
server rather than an expectation that the previously associated
information be deleted.

In other words, when you've DELETEd a resource, the server no longer maps that URI to that data. So you can't PUT or POST to it to make an update like "mark this as undeleted" etc. (Remember that a resource is defined as a mapping between a URI and some underlying data).

Some Solutions

Since it's explicitly stated that the underlying data is not necessarily deleted, it doesn't preclude the server making a new URI mapping as part of the DELETE implementation, thereby effectively making a backup copy that can be restored later.

You could have a "/deleted/" collection that contains all the deleted items, but how would you actually perform the undelete? Perhaps the simplest RESTful way is to have the client retrieve the item with GET, and then POST it to the desired URL.

What if you need to be able to restore the deleted item to its original location? If you're using a media type that supports it, you could include the original URI in the response to a GET from the /deleted/ collection. The client could then use it to POST. Such a response might look like this in JSON:

{
    "original-url":"/some/place/this/was/deleted/from",
    "body":"<base64 encoded body>"
}

The client could then POST that body to that URI to perform an undelete.
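
As a rough sketch of that round trip, assuming a toy in-memory server: the `Server` class, the `/deleted/<n>` URI scheme, and the `undelete` helper are all made up for illustration. Only the `original-url` and `body` field names come from the JSON above.

```python
import base64
import json


class Server:
    """Toy origin server: live resources plus a /deleted/ collection."""

    def __init__(self):
        self.resources = {}  # uri -> bytes
        self.deleted = {}    # trash uri -> (original uri, bytes)
        self._next_id = 1

    def post(self, uri, body):
        self.resources[uri] = body
        return 201

    def delete(self, uri):
        """Unmap the URI but keep the data under a new /deleted/ mapping."""
        if uri not in self.resources:
            return 404, None
        trash_uri = f"/deleted/{self._next_id}"
        self._next_id += 1
        self.deleted[trash_uri] = (uri, self.resources.pop(uri))
        return 200, trash_uri

    def get(self, uri):
        if uri in self.resources:
            return 200, self.resources[uri]
        if uri in self.deleted:
            original, body = self.deleted[uri]
            doc = {"original-url": original,
                   "body": base64.b64encode(body).decode("ascii")}
            return 200, json.dumps(doc)
        return 404, None


def undelete(server, trash_uri):
    """Client side: GET the trashed item, then POST its body to the original URI."""
    status, payload = server.get(trash_uri)
    if status != 200:
        return status
    doc = json.loads(payload)
    return server.post(doc["original-url"], base64.b64decode(doc["body"]))
```

In this sketch the entry stays in `/deleted/` after the restore; whether the server cleans it up itself or the client issues a follow-up DELETE on the trash URI is a separate design decision.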

Alternatively, if your resource definition allows the concept of moving (by updating a "location" property or something like that), then you can do a partial update and avoid the round trip of the entire object. Or do what most people do and implement an RPC-like operation to tell the server to move the resource! UnRESTful, yes, but it will probably work fine in most situations.

How You Decide These Things

Regarding the question of how you decide these things: you have to consider what delete means in the context of your application, and why you want it. In a lot of applications, nothing ever gets deleted, and "delete" really just means "exclude this item from all further queries/listings etc. unless I explicitly undelete it". So, it's really just a piece of metadata, or a move operation. In that case, why bother with HTTP DELETE? One reason might be if you want a 2-tiered delete - a soft or temporary version that's undoable, and a hard/permanent version that's, well...not.

Absent any specific application context, I'd be inclined to implement them like this:

I don't want to see this resource any longer, for my convenience: POST a partial update to mark the resource as "temporarily deleted"

I don't want anyone to be able to reach this resource any longer because it's embarrassing/incriminating/costs me money/etc: HTTP DELETE

The next question to consider is: should the permanent delete only unmap the URI permanently, so that no one can link to it any longer, or is it necessary to purge the underlying data too? Obviously if you keep the data, then an administrator could restore even a "permanently" deleted resource (not through any RESTful interface however). The downside of this is that if the owner of the data really wants it purged, then an admin has to do that outside the REST interface.

只涨不跌 2024-12-30 02:39:07

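I think the most RESTful way to solve this is to use HTTP PUT to mark the resource for deletion (and to undelete it), and then use HTTP DELETE to permanently delete the resource. To get a list of resources marked for deletion, I would use a parameter in the HTTP GET request, e.g. ?state=markedForDeletion.
If you request a resource that is marked for deletion without the parameter, I think you should return a "404 Not Found" status.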
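
A minimal sketch of that scheme, assuming a plain dictionary as the backing store. The `handle` function, the `state` field on each resource, and the convention that collection paths end in `/` are assumptions for the example; only the `?state=markedForDeletion` parameter comes from the answer.

```python
from urllib.parse import parse_qs, urlsplit

MARKED = "markedForDeletion"


def handle(store, method, url, body=None):
    """Dispatch one request against `store` (dict: path -> resource dict)."""
    parts = urlsplit(url)
    path = parts.path
    # Without the parameter, only active resources are visible.
    wanted_state = parse_qs(parts.query).get("state", ["active"])[0]

    if method == "GET" and path.endswith("/"):
        # Collection: list only the resources in the requested state.
        return 200, sorted(p for p, r in store.items()
                           if p.startswith(path) and r["state"] == wanted_state)

    resource = store.get(path)
    if resource is None:
        return 404, None
    if method == "GET":
        # A marked resource is hidden unless the caller asks for it explicitly.
        if resource["state"] != wanted_state:
            return 404, None
        return 200, resource
    if method == "PUT":
        # PUT replaces the representation, including the state flag,
        # so the same call both marks for deletion and undeletes.
        store[path] = {"state": "active", **body}
        return 200, store[path]
    if method == "DELETE":
        # DELETE is reserved for the permanent removal.
        del store[path]
        return 204, None
    return 405, None
```

Because the filter is applied on the server, pagination over either active or marked resources stays straightforward: each listing only ever contains one state.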

卸妝后依然美 2024-12-30 02:39:07

"Deleted" (trashed) items may also be considered a resource, right? Then we can access this resource in one of these ways (e.g. for a deleted user):

PATCH deleted_users/{id}
PATCH trash/users/{id}
PATCH deleted/users/{id}

or, as some people may think is a more RESTful way:

PATCH deleted/{id}?type=users

and the payload would be something like this:

{ "deleted_at": null }

错爱 2024-12-30 02:39:07

I'm also running into this problem and I've been looking on the Internet for what feels like the best solution. Since none of the main answers I can find seem correct to me, here are my own research results.

Others are right that the DELETE is the way to go. You could include a flag to determine whether it's immediately a permanent DELETE or a move to the trashcan (and probably only administrators can do an immediate permanent DELETE.)

DELETE /api/1/book/33
DELETE /api/1/book/33?permanent

The backend can then mark the book as deleted. Assuming you have an SQL database, it could be something such as:

UPDATE books SET status = 'deleted' WHERE book_id = 33;

As mentioned by others, once the DELETE is done, a GET of the collection does not return that item. In terms of SQL, this means you must make sure not to return an item with a status of deleted.

SELECT * FROM books WHERE status <> 'deleted';

Also, when you do a GET /api/1/book/33, you must return a 404 or 410. One problem with 410 is that it means Gone Forever (at least that's my understanding of that error code), so I would return 404 as long as the item exists but is marked as 'deleted', and 410 once it has been permanently removed.

Note: if you want to keep the data for X amount of time and then delete it permanently in a CRON job, use a "deleted_on" field holding the date on which the non-permanent delete happened. The SQL command above then changes to something like ... WHERE deleted_on IS NULL.
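
The "deleted_on" variant could look roughly like this with Python's built-in sqlite3 module. The table layout, the function names, and the use of integer Unix timestamps for deleted_on are assumptions made for this sketch; a real schema would follow whatever your database already uses.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two sample books."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE books (book_id INTEGER PRIMARY KEY, title TEXT, deleted_on INTEGER)"
    )
    conn.executemany(
        "INSERT INTO books (book_id, title) VALUES (?, ?)", [(33, "A"), (34, "B")]
    )
    return conn


def soft_delete(conn, book_id, now):
    """Non-permanent delete: record when it happened instead of removing the row."""
    conn.execute(
        "UPDATE books SET deleted_on = ? WHERE book_id = ? AND deleted_on IS NULL",
        (now, book_id),
    )


def list_active(conn):
    """What a GET of the collection would return: rows not marked as deleted."""
    rows = conn.execute(
        "SELECT book_id FROM books WHERE deleted_on IS NULL ORDER BY book_id"
    )
    return [row[0] for row in rows]


def purge(conn, now, keep_for):
    """What the CRON job would run: drop rows deleted more than `keep_for` seconds ago."""
    conn.execute(
        "DELETE FROM books WHERE deleted_on IS NOT NULL AND deleted_on < ?",
        (now - keep_for,),
    )
```

Undeleting within the retention window is then a single `UPDATE books SET deleted_on = NULL WHERE book_id = ?`.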

Now to undelete, the correct way is to PATCH. Contrary to a PUT, which is used to update an existing item, a PATCH is expected to be an operation on an existing item. From what I can see, the operation is expected to be in the payload. For that to work, the resource needs to be accessible in some way. As someone else suggested, you can provide a trashcan area where the book would appear once deleted. Something like this would work to list books that were put in the trashcan:

GET /api/1/trashcan/books

[{"path":"/api/1/trashcan/books/33"}]

So, the resulting list would now include book number 33, which you can then PATCH with an operation such as:

PATCH /api/1/trashcan/books/33

{
    "operation": "undelete"
}

If you'd like to make the operation more versatile, you could use something such as:

PATCH /api/1/trashcan/books/33

{
    "operation": "move",
    "new-path": "/api/1/books/33"
}

Then the "move" could be used for other changes of URL wherever possible in your interface. (I am working on a CMS where the path to a page is in one table called tree, and each page is in another table called page and has an identifier. I can change the path of a page by moving it between paths in my tree table! This is where a PATCH is very useful.)
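
To make the two payloads concrete, here is a small sketch of a handler that applies them to a path table (loosely modelled on the tree table mentioned above). The `patch` function, the dictionary layout, and the rule that a trashcan URL mirrors the live URL are assumptions for illustration only.

```python
def patch(paths, path, payload):
    """Apply a PATCH operation document to `paths` (dict: URL path -> book id)."""
    if path not in paths:
        return 404
    operation = payload.get("operation")
    if operation == "undelete":
        # Assumes trashcan URLs mirror live ones:
        # /api/1/trashcan/books/33 -> /api/1/books/33
        new_path = path.replace("/trashcan", "", 1)
    elif operation == "move":
        new_path = payload.get("new-path")
        if not new_path:
            return 400
    else:
        return 400  # unknown operation
    if new_path in paths:
        return 409  # something already lives at the destination
    # The move itself is a single re-keying, so it cannot end up half done.
    paths[new_path] = paths.pop(path)
    return 200
```

With this shape, "undelete" is just a "move" whose destination the server works out for itself.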

Unfortunately, the RFCs do not clearly define the PATCH payload, only that it is to be used with an operation as shown above, as opposed to a PUT, which accepts a payload representing a new version, possibly partial, of the targeted item:

PUT /api/1/books/33

{
    "title": "New Title Here"
}

Whereas the corresponding PATCH (if you were to support both) would be:

PATCH /api/1/books/33

{
    "operation": "replace",
    "field": "title",
    "value": "New Title Here"
}

I think that supporting that many PATCH operations would be crazy. But I think that a few good examples give a better idea of why PATCH is the correct solution.

You can think of it this way: PATCH is for changing a virtual field or running a complex operation such as a move, which would otherwise require a GET, a POST, and a DELETE (and that assumes the DELETE is immediate; you could still get errors and end up with a partial move). In a way, PATCH is similar to having any number of methods. An UNDELETE or MOVE method would work in a similar way, but the RFC clearly says there is a set of standardized methods, and you should certainly stick to them; PATCH gives you plenty of room so that you do not have to add your own. That said, I did not see anything in the specs saying you should not add your own methods. If you do, make sure to document them clearly.