是否可以从复杂的数据库查询创建详细的错误消息？

发布于 2024-09-30 12:15:24 字数 1189 浏览 5 评论 0原文

让我用一个简化的例子来说明这个问题。假设我正在使用 python 和 PostgreSQL 关系数据库构建一个项目。在我的数据库中，我有两个表“parent”和“child”，它们通过表“parent_child”相互关联。我希望能够以安全的方式检索有关特定父级拥有的特定子级的一些数据，以下查询可以让我做到这一点（X、Y 和 Z 是用户提供的文字）：

SELECT child.age FROM parent, parent_child, child 
WHERE child.id = parent_child.child_id AND parent_child.id = X 
AND parent_child.parent_id = parent.id AND parent.id = Y 
AND parent.password = Z;

假设有一个用户出现如果输入了错误的 X、Y 或 Z 值，查询将返回一个空集，该空集可以被检测到，并向用户传递一条存在错误的消息。当然，问题是我无法确定哪个值导致了问题，因此无法向用户提供有关他们错误输入的内容的具体信息？

最简单的解决方案是将查询分成几个部分。首先，验证parent.id是否存在。

SELECT parent.id FROM parent WHERE parent.id = Y;

其次，检查密码是否正确。

SELECT parent.id FROM parent WHERE parent.id = Y and parent.password = Z;

第三，检查孩子是否存在。

SELECT child.id FROM child WHERE child.id = X;

第四，检查孩子是否属于父母所有并返回我们需要的信息。

SELECT child.age FROM child, parent_child WHERE parent_child.child_id = child.id AND parent_child.parent_id = Y AND parent_child.child_id = X;

这四个查询将使我们能够检查有关用户提供的信息的特定内容并在发生时报告特定问题。显然，与单个查询相比，四个查询会产生很多额外开销，而且我发现四个查询的可读性比单个查询要差。那么有没有办法两全其美呢？单个查询和详细的错误消息？

原文

Let me illustrate this question with a simplified example. Assume I am building a project using python with a PostgreSQL relational database. In my database I have two tables "parent" and "child" which are N to M related through the table "parent_child". I want to be able to retrieve some data about a specific child owned by a specific parent in a secure way, which the following query lets me do (X, Y, and Z are literals supplied by the user):

SELECT child.age FROM parent, parent_child, child 
WHERE child.id = parent_child.child_id AND parent_child.id = X 
AND parent_child.parent_id = parent.id AND parent.id = Y 
AND parent.password = Z;

Say a user comes along and types in the wrong value for either X, Y, or Z, the query would return an empty set, which could be detected and a message passed on to the user that there was an error. The problem of course is that I am unable to determine which value is causing problems and hence can not supply the user with specific information about what they mis-entered?

The simplest solution to this is to break up the query into several parts. First, verifying that the parent.id exists.

SELECT parent.id FROM parent WHERE parent.id = Y;

Second, checking if the password is correct.

SELECT parent.id FROM parent WHERE parent.id = Y and parent.password = Z;

Third, checking if the child exists.

SELECT child.id FROM child WHERE child.id = X;

Fourth, checking that the child is owned by the parent and returning the information we need.

SELECT child.age FROM child, parent_child WHERE parent_child.child_id = child.id AND parent_child.parent_id = Y AND parent_child.child_id = X;

These four queries will allow us to check specific things about the user supplied information and report specific problems as they occur. Obviously there is a lot of additional overhead in four queries verses a single query and I find four queries less readable than a single one. So is there anyway to have the best of both worlds? A single query and detailed error messages?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

痴者 2024-10-07 12:15:24

SELECT  p.id, p2.z AS pw, pc.parent_id, CASE p2.z WHEN p.pw THEN c.age END AS age
FROM    (VALUES (1)) AS p1(y)
LEFT JOIN
        parent p
ON      p.id = p1.y
LEFT JOIN
        (VALUES ('pw1')) AS p2(z)
ON      p2.z = p.pw
CROSS JOIN
        (VALUES(1)) AS p3(x)
LEFT JOIN
        child c
ON      c.id = p3.x
LEFT JOIN
        parent_child pc
ON      pc.parent_id = p.id
        AND pc.child_id = c.id

适当列中的NULL 意味着适当的条件失败。

SELECT  p.id, p2.z AS pw, pc.parent_id, CASE p2.z WHEN p.pw THEN c.age END AS age
FROM    (VALUES (1)) AS p1(y)
LEFT JOIN
        parent p
ON      p.id = p1.y
LEFT JOIN
        (VALUES ('pw1')) AS p2(z)
ON      p2.z = p.pw
CROSS JOIN
        (VALUES(1)) AS p3(x)
LEFT JOIN
        child c
ON      c.id = p3.x
LEFT JOIN
        parent_child pc
ON      pc.parent_id = p.id
        AND pc.child_id = c.id

NULLs in appropriate columns would mean that the appropriate conditions failed.

回复收藏 0 原文

眼眸里的快感 2024-10-07 12:15:24

好吧，这里的问题在于查询实际上并没有出错——它每次都为您提供符合您的条件的正确信息。因此，如果不单独检查每个查询，确实没有办法知道。

您也许可以检查是否没有行，然后运行其他查询以找出原因，这会减少您的开销。

回复收藏 0 原文

南街九尾狐 2024-10-07 12:15:24

这四个查询将使我们能够检查有关用户提供的信息的具体内容，并在发生时报告具体问题。

是的，这是标准程序（并且它的存在是有原因的。假设您是更新行：您可能会用尽各种服务器资源，例如事务日志，却发现它失败了，并且在尝试下一个级别之前始终检查每个级别，这是完全可以避免的。或更新任何内容，直到您执行完整验证。除非您确定它会成功，否则不要尝试任何内容。在这种情况下，您不会更新，但标准允许您以通常的方式尽早隔离错误。，并避免浪费资源（由于早期的失败，在后面的级别）

显然，与单个查询相比，四个查询有很多额外的开销

我不明白你的算术。如果一个表不在数据缓存中，则 PK 花费 50 个资源单位，如果在数据缓存中，则花费 2 个资源单位。假设 PostgreSQL 有数据缓存和多线程引擎，并且您的代码段是连续序列（是否存储过程）：

第一个语句 = 50
第二个语句（因为页面在缓存中） = 2
第三个语句 = 50
第四个语句（因为父级和子级都在缓存中）= 2 + 2 + 50
等于 156 个单位
更重要的是，在发生错误的情况下，成本（取决于错误发生的位置）位于）是 50 或 52 或 102 个单位
而独立的第四条语句成本为 150单元

我发现四个查询的可读性比单个查询的可读性差。

如果需要提高可读性，请在中间添加一些空格和注释。（无论如何，你的代码对其他人来说很难阅读；我会对其进行格式化。）

单个查询和详细的错误消息？

好吧，你会收到详细的错误，不多也不少；您要求的是将错误隔离到代码中的特定点（或用户请求）。如果您正在编写一个通用的存储过程，并返回一个错误代码，则需要我确定的序列。

任何其他方法（我确信有复杂且狡猾的方法）都会（a）甚至更多的开销和（b）将不必要的复杂性引入到简单的行人需求中，因此难以维护。

These four queries will allow us to check specific things about the user supplied information and report specific problems as they occur.

Yes, that is the standard procedure (and it is there for a reason. Let's say you were updating rows: you would have used up all sorts of server resources, eg the transaction log, only to find that it failed, and roll the whole thing back. Quite avoidable. Always check each level, before attempting the next level. Never lock up or update anything until you have performed full verification. Never attempt anything unless you are sure that it will succeed. In this case, you are not updating, but the standard allows you to isolate the error, in the usual way, at the earliest moment, and avoid wasting resources (at later levels due to the earlier failure).

Obviously there is a lot of additional overhead in four queries verses a single query

I do not understand your arithmetic. Let's say each query to a table by PK costs 50 resource units if it is not in the data cache, 2 units if it is. Assuming PostgreSQL has a data cache, and a multi-threaded engine, and your code segment is a contiguous sequence (stored proc or not):

first statement = 50
second statement (since the page is in cache) = 2
third statement = 50
fourth statement (since parent and child are in cache) = 2 + 2 + 50
equals 156 units
more important, in the case of error, the cost (depending on where the error is located) is 50 or 52 or 102 units
whereas the standalone fourth statement costs 150 units

I find four queries less readable than a single one.

Put some white space and commentary in-between if you need to improve readability. (Your code is hard for others to read anyway; I would format it.)

A single query and detailed error messages?

Well, you are getting detailed errors, no more no less; what you are asking for is isolation of the error to a particular point in your code (or the user request). If you were writing a stored proc for generic use, and returned an error code, the sequence I have identified would be demanded.

Any other method (and I am sure there are complex and devious methods) would be (a) even more overhead and (b) introduce unnecessary complexity into a simple pedestrian requirement, and therefore be hard to maintain.

回复收藏 0 原文

~没有更多了~