SELECT DISTINCT 最多返回一行

发布于 2025-01-13 18:08:12 字数 1115 浏览 2 评论 0原文

给定以下数据库结构:

Regions

idname
1EU
2US
3SEA

Customers:

idnameregion
1peter1
2henry1
3john2

还有一个 PL/pgSQL函数到位,定义为 sendShipment(),它接受(除其他外)发送者和接收者客户 ID。

这存在一个业务限制,要求我们验证发送方和接收方是否位于同一区域 - 并且我们需要将其作为 sendShipment() 的一部分来执行。因此,在此函数中,我们需要查询客户表中的发送者和接收者 ID,并验证它们的区域 ID 是否相同。我们还需要识别自身以便进一步处理。

所以也许是这样的: SELECT DISTINCT Region FROM Customers WHERE id IN (?, ?)

这样做的问题是,结果要么是一个数组(如果客户不在同一区域内),要么是单个值。

有没有更优雅的方法来解决这个约束?我正在考虑SELECT INTO并使用临时表,或者我可以SELECT COUNT(DISTINCTregion)然后为实际的数据执行另一个SELECT如果计数小于 2,则为值,但我希望尽可能避免性能受到影响。

Given the following db structure:

Regions

idname
1EU
2US
3SEA

Customers:

idnameregion
1peter1
2henry1
3john2

There is also a PL/pgSQL function in place, defined as sendShipment() which takes (among other things) a sender and a receiver customer ID.

There is a business constraint around this which requires us to verify that both sender and receiver sit in the same region - and we need to do this as part of sendShipment(). So from within this function, we need to query the customer table for both the sender and receiver ID and verify that both their region ID is identical. We will also need to ID itself for further processing down the line.

So maybe something like this:
SELECT DISTINCT region FROM customers WHERE id IN (?, ?)

The problem with this is that the result will be either an array (if the customers are not within the same region) or a single value.

Is there are more elegant way of solving this constraint? I was thinking of SELECT INTO and use a temporary table, or I could SELECT COUNT(DISTINCT region) and then do another SELECT for the actual value if the count is less than 2, but I'd like to avoid the performance hit if possible.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

孤凫 2025-01-20 18:08:12

还有一个 PL/pgSQL 函数,定义为 sendShipment(),它接受(除其他外)发送者和接收者客户 ID。

这方面有一个业务限制,要求我们验证发送方和接收方是否位于同一区域 - 并且我们需要将其作为 sendShipment() 的一部分来执行。因此,在此函数中,我们需要查询客户表中的发送者和接收者 ID,并验证它们的区域 ID 是否相同。我们还需要识别自己的身份以便进一步处理。

此查询应该有效:

WITH q AS (
    SELECT
        COUNT( * ) AS CountCustomers,
        COUNT( DISTINCT c.Region ) AS CountDistinctRegions,
--      MIN( c.Region ) AS MinRegion
        FIRST_VALUE( c.Region ) OVER ( ORDER BY c.Region ) AS MinRegion
    FROM
        Customers AS c
    WHERE
        c.CustomerId = $senderCustomerId
        OR
        c.CustomerId = $receiverCustomerId
)
SELECT
    CASE WHEN q.CountCustomers = 2 AND q.CountDistinctRegions = 2 THEN 'OK' ELSE 'BAD' END AS "Status",
    CASE WHEN q.CountDistinctRegions = 2 THEN q.MinRegion END AS SingleRegion
FROM
    q
  • 上述查询将始终返回包含 2 列的单行:StatusSingleRegion

  • SQL 没有“SINGLE( col )”聚合函数(即函数为 NULL,除非聚合组只有一行) ,但我们可以在 CTE 或派生表中将 MIN (或 MAX)与 CASE WHEN COUNT() 滥用作为等效操作< /del>.

  • 由于您的 Region 列是 UUID,因此您不能将其与 MIN 一起使用,但我知道它应该与 FIRST_VALUE( c.Region ) OVER ( ORDER BY c.Region ) AS MinRegion.

  • 对于列:

    • 根据您提到的业务限制,Status 列可以是 'OK''BAD'。不过,您可能希望将其更改为列,而不是文本列。
    • 如果 CountDistinctRegions = 2,无论 CountCustomers 为何,SingleRegion 列将为 NOT NULL(具有有效区域) >,但请随意更改,以防万一您仍然需要该信息。

There is also a PL/pgSQL function in place, defined as sendShipment() which takes (among other things) a sender and a receiver customer ID.

There is a business constraint around this which requires us to verify that both sender and receiver sit in the same region - and we need to do this as part of sendShipment(). So from within this function, we need to query the customer table for both the sender and receiver ID and verify that both their region ID is identical. We will also need to ID itself for further processing down the line.

This query should work:

WITH q AS (
    SELECT
        COUNT( * ) AS CountCustomers,
        COUNT( DISTINCT c.Region ) AS CountDistinctRegions,
--      MIN( c.Region ) AS MinRegion
        FIRST_VALUE( c.Region ) OVER ( ORDER BY c.Region ) AS MinRegion
    FROM
        Customers AS c
    WHERE
        c.CustomerId = $senderCustomerId
        OR
        c.CustomerId = $receiverCustomerId
)
SELECT
    CASE WHEN q.CountCustomers = 2 AND q.CountDistinctRegions = 2 THEN 'OK' ELSE 'BAD' END AS "Status",
    CASE WHEN q.CountDistinctRegions = 2 THEN q.MinRegion END AS SingleRegion
FROM
    q
  • The above query will always return a single row with 2 columns: Status and SingleRegion.

  • SQL doesn't have a "SINGLE( col )" aggregate function (i.e. a function that is NULL unless the aggregation group has a single row), but we can abuse MIN (or MAX) with a CASE WHEN COUNT() in a CTE or derived-table as an equivalent operation.

  • As your Region column is UUID you cannot use it with MIN, but I understand it should work with FIRST_VALUE( c.Region ) OVER ( ORDER BY c.Region ) AS MinRegion.

  • As for the columns:

    • The Status column is either 'OK' or 'BAD' based on those business-constraints you mentioned. You might want to change it to a bit column instead of a textual one, though.
    • The SingleRegion column will be NOT NULL (with a valid region) if CountDistinctRegions = 2 regardless of CountCustomers, but feel free to change that, just-in-case you still want that info.
花心好男孩 2025-01-20 18:08:12

对于对简单解决方案感兴趣的其他人,我最终想出了一种(明显的)方法:

SELECT
    r.region
FROM
    customers s
INNER JOIN customers r ON 
    s.region = r.region 
WHERE s.id = 'sender_id' and r.id = 'receiver_id';

SELECT DISTINCT 最多返回一行 他在这方面帮了我很多忙,而且发布了可行的解决方案。

For anybody else who's interested in a simple solution, I finally came up with the (kind of obvious) way to do it:

SELECT
    r.region
FROM
    customers s
INNER JOIN customers r ON 
    s.region = r.region 
WHERE s.id = 'sender_id' and r.id = 'receiver_id';

Huge credit to SELECT DISTINCT to return at most one row who helped me out a lot on this and also posted a viable solution.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文