Matching only one specific row in a JOIN where many rows exist

Posted 2024-08-06 06:58:20

(Advantage Database Server) I have a table of service providers that, for auditing purposes, are never deleted. They have a start date and end date; in the case of changes like name or address, the existing row is end dated, a new row is created, and a new start date is assigned for the changed data.

During processing of payments to those providers, I need a summary page that lists the provider name, address, identifier (ProvID), and total amount being paid. This is done in a fairly straightforward query with a SUM() and GROUP BY.

The problem appears when there are two or more rows for a specified provider identifier. I end up with duplicate rows (which could result in multiple payments to that provider if not caught).

My first thought was to use something (ugly, but performs reasonably quickly) like a subselect:

SELECT ... FROM service s
INNER JOIN provider p ON p.ProvID = s.ProvID
AND (p.EndDate IS NULL or p.EndDate = (SELECT Max(EndDate) FROM
   provider lu WHERE lu.ProvID = s.ProvID))

Unfortunately, this still ended up finding two rows; one row for the NULL EndDate and one for the MAX(EndDate).
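
The duplication can be reproduced in a small sandbox. This is only a sketch: it uses SQLite through Python's `sqlite3` module rather than Advantage Database Server, and the `Name` and `Amount` columns and the sample rows are invented for illustration. `MAX(EndDate)` skips NULLs, so the closed row and the open row both satisfy the join condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical minimal schema; only ProvID, StartDate, EndDate come from the question.
    CREATE TABLE provider (ProvID INTEGER, Name TEXT, StartDate TEXT, EndDate TEXT);
    CREATE TABLE service  (ProvID INTEGER, Amount REAL);
    -- Provider 1 changed its name once: one closed row, one current row.
    INSERT INTO provider VALUES (1, 'Acme',     '2023-01-01', '2023-12-31');
    INSERT INTO provider VALUES (1, 'Acme LLC', '2024-01-01', NULL);
    INSERT INTO service VALUES (1, 100.0);
    INSERT INTO service VALUES (1, 50.0);
""")

# Same join condition as the first attempt above.
rows = conn.execute("""
    SELECT p.ProvID, p.Name, SUM(s.Amount)
    FROM service s
    INNER JOIN provider p ON p.ProvID = s.ProvID
     AND (p.EndDate IS NULL OR p.EndDate = (SELECT MAX(EndDate) FROM provider lu
                                            WHERE lu.ProvID = s.ProvID))
    GROUP BY p.ProvID, p.Name
    ORDER BY p.Name
""").fetchall()

# Both provider rows match, so the full total is reported twice.
print(rows)
```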

I handle this in other cases (e.g., locating the proper ProvID for a service provided on a specific date) using

p.EndDate is null or (s.ServiceDate BETWEEN p.StartDate AND p.EndDate)

Unfortunately, since the problem query is a GROUP BY with an aggregate, the service date isn't available.

Any suggestions?

EDIT: What I'm looking for is either the row with the NULL EndDate if it exists, OR the row with the Max(EndDate) if the NULL row doesn't exist. This covers the case, for instance, where a supplier was terminated yesterday, but did work last week, and we'll be paying them next week.


Comments (5)

伴我心暖 2024-08-13 06:58:20

So I guess if there is a row with NULL end date, you want that one, otherwise you want the one with the largest end date?

I'm not sure about ADS, but the following would work on SQL Server:

SELECT ... FROM service s
INNER JOIN provider p ON p.ProvID = s.ProvID
AND (COALESCE(p.EndDate, '2037-01-01') = (
   SELECT Max(COALESCE(EndDate, '2037-01-01')) FROM
   provider lu WHERE lu.ProvID = s.ProvID)
)

The COALESCE operator returns the first non-null parameter, so this is basically just setting the nulls to a time far in the future, so that SELECT MAX will give you the one with the NULL end date if there is one.
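
Here is a rough check of that idea, run against SQLite through Python's `sqlite3` module (not SQL Server or ADS, so treat it as a sketch). The `Name` and `Amount` columns and the sample data are made up. Dates are stored as ISO text, so string comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE provider (ProvID INTEGER, Name TEXT, StartDate TEXT, EndDate TEXT);
    CREATE TABLE service  (ProvID INTEGER, Amount REAL);
    -- Provider 1: one closed row plus a current (NULL EndDate) row.
    INSERT INTO provider VALUES (1, 'Acme',     '2023-01-01', '2023-12-31');
    INSERT INTO provider VALUES (1, 'Acme LLC', '2024-01-01', NULL);
    -- Provider 2: terminated, so every row has an EndDate.
    INSERT INTO provider VALUES (2, 'Bolt',     '2022-01-01', '2022-12-31');
    INSERT INTO provider VALUES (2, 'Bolt Inc', '2023-01-01', '2024-08-05');
    INSERT INTO service VALUES (1, 100.0), (1, 50.0), (2, 75.0);
""")

# NULL end dates are mapped to a far-future sentinel on both sides of the comparison.
rows = conn.execute("""
    SELECT p.ProvID, p.Name, SUM(s.Amount)
    FROM service s
    INNER JOIN provider p ON p.ProvID = s.ProvID
     AND COALESCE(p.EndDate, '2037-01-01') = (
           SELECT MAX(COALESCE(lu.EndDate, '2037-01-01'))
           FROM provider lu WHERE lu.ProvID = s.ProvID)
    GROUP BY p.ProvID, p.Name
    ORDER BY p.ProvID
""").fetchall()

# One row per provider: the open row for provider 1, the latest closed row for provider 2.
print(rows)
```

The sentinel has to be later than any real EndDate in the table, otherwise a genuine end date could beat the NULL row.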

メ斷腸人バ 2024-08-13 06:58:20

In the second condition, you have to take the max only if there is no row with a NULL EndDate:

SELECT ... FROM service s
INNER JOIN provider p ON p.ProvID = s.ProvID
AND (   p.EndDate IS NULL 
     or (p.EndDate = (SELECT Max(EndDate) 
                        FROM provider lu 
                       WHERE lu.ProvID = s.ProvID)
         AND NOT EXISTS (SELECT NULL 
                           FROM provider lu 
                          WHERE lu.ProvID = s.ProvID 
                            AND lu.EndDate IS NULL)
        )
    )
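
A quick sandbox run of this condition, sketched with SQLite via Python's `sqlite3` module (an assumption, since the question is about ADS). The `Name` and `Amount` columns and the two sample providers are hypothetical. Unlike a sentinel date, this version needs no made-up far-future value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE provider (ProvID INTEGER, Name TEXT, StartDate TEXT, EndDate TEXT);
    CREATE TABLE service  (ProvID INTEGER, Amount REAL);
    -- Provider 1 has a current row; provider 2 has only closed rows.
    INSERT INTO provider VALUES (1, 'Acme',     '2023-01-01', '2023-12-31');
    INSERT INTO provider VALUES (1, 'Acme LLC', '2024-01-01', NULL);
    INSERT INTO provider VALUES (2, 'Bolt',     '2022-01-01', '2022-12-31');
    INSERT INTO provider VALUES (2, 'Bolt Inc', '2023-01-01', '2024-08-05');
    INSERT INTO service VALUES (1, 100.0), (1, 50.0), (2, 75.0);
""")

# The MAX(EndDate) branch only applies when the provider has no open row.
rows = conn.execute("""
    SELECT p.ProvID, p.Name, SUM(s.Amount)
    FROM service s
    INNER JOIN provider p ON p.ProvID = s.ProvID
     AND (   p.EndDate IS NULL
          OR (p.EndDate = (SELECT MAX(EndDate) FROM provider lu
                           WHERE lu.ProvID = s.ProvID)
              AND NOT EXISTS (SELECT NULL FROM provider lu
                              WHERE lu.ProvID = s.ProvID
                                AND lu.EndDate IS NULL)))
    GROUP BY p.ProvID, p.Name
    ORDER BY p.ProvID
""").fetchall()

print(rows)
```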

嘿哥们儿 2024-08-13 06:58:20

Perhaps use a subquery in place of the second table:

SELECT ... FROM service s
INNER JOIN (SELECT ..., Max(EndDate) FROM
   provider lu WHERE lu.ProvID = s.ProvID GROUP BY ...) p ON p.ProvID = s.ProvID

This is assuming you'd get NULL back if there is no max enddate.
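
Two caveats seem worth checking before relying on this. In most engines a derived table cannot reference the outer `s` alias, and `MAX` skips NULLs, so it returns NULL only when every EndDate for the provider is NULL. One possible variant, sketched below with SQLite through Python's `sqlite3` module, groups by ProvID inside the derived table and maps NULL to a sentinel. The `latest` alias, the `LastEnd` column, the sentinel `'9999-12-31'`, and the sample data are all assumptions made for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE provider (ProvID INTEGER, Name TEXT, StartDate TEXT, EndDate TEXT);
    CREATE TABLE service  (ProvID INTEGER, Amount REAL);
    INSERT INTO provider VALUES (1, 'Acme',     '2023-01-01', '2023-12-31');
    INSERT INTO provider VALUES (1, 'Acme LLC', '2024-01-01', NULL);
    INSERT INTO provider VALUES (2, 'Bolt',     '2022-01-01', '2022-12-31');
    INSERT INTO provider VALUES (2, 'Bolt Inc', '2023-01-01', '2024-08-05');
    INSERT INTO service VALUES (1, 100.0), (1, 50.0), (2, 75.0);
""")

# The derived table picks one end-date key per provider (NULL counts as latest);
# joining back to provider on that key yields exactly one provider row each.
rows = conn.execute("""
    SELECT p.ProvID, p.Name, SUM(s.Amount)
    FROM service s
    INNER JOIN (SELECT ProvID, MAX(COALESCE(EndDate, '9999-12-31')) AS LastEnd
                FROM provider
                GROUP BY ProvID) latest ON latest.ProvID = s.ProvID
    INNER JOIN provider p ON p.ProvID = s.ProvID
           AND COALESCE(p.EndDate, '9999-12-31') = latest.LastEnd
    GROUP BY p.ProvID, p.Name
    ORDER BY p.ProvID
""").fetchall()

print(rows)
```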

岁月苍老的讽刺 2024-08-13 06:58:20

What you are referring to is a Type-2 Dimension of a Data Warehouse.

You have to join by the ID and by the StartDate and EndDate to get the proper data.

Off the top of my head (OTTOMH), the code would look like this:

SELECT TransactionId, TransactionType
FROM TransactionList Tx
    INNER JOIN TransactionType TxType
        ON Tx.TransactionTypeId = TxType.TxTypeId
        AND Tx.TransactionDate Between TxType.StartDate and TxType.EndDate
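
One thing to watch with this join: if the current row stores a NULL EndDate, as in the question, `BETWEEN` evaluates to NULL for that row and it never matches. A sketch of a NULL-tolerant version follows, run on SQLite through Python's `sqlite3` module. The table layouts are guessed from the query above and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical Type-2 dimension: the same TxTypeId has a closed and an open version.
    CREATE TABLE TransactionType (TxTypeId INTEGER, TransactionType TEXT,
                                  StartDate TEXT, EndDate TEXT);
    CREATE TABLE TransactionList (TransactionId INTEGER, TransactionTypeId INTEGER,
                                  TransactionDate TEXT);
    INSERT INTO TransactionType VALUES (7, 'Refund (old)', '2023-01-01', '2023-12-31');
    INSERT INTO TransactionType VALUES (7, 'Refund',       '2024-01-01', NULL);
    INSERT INTO TransactionList VALUES (1, 7, '2023-06-15');
    INSERT INTO TransactionList VALUES (2, 7, '2024-03-01');
""")

# Each transaction picks the dimension row whose validity range contains its date;
# a NULL EndDate is treated as "still valid".
rows = conn.execute("""
    SELECT Tx.TransactionId, TxType.TransactionType
    FROM TransactionList Tx
    INNER JOIN TransactionType TxType
        ON Tx.TransactionTypeId = TxType.TxTypeId
       AND Tx.TransactionDate >= TxType.StartDate
       AND (TxType.EndDate IS NULL OR Tx.TransactionDate <= TxType.EndDate)
    ORDER BY Tx.TransactionId
""").fetchall()

print(rows)
```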

我要还你自由 2024-08-13 06:58:20

What in your provider table denotes the current row? EndDate=NULL, EndDate=Max(EndDate) or EndDate='9999-01-01'? All three are valid choices, but this should really be unambiguous; if it's not, you're going to end up with duplicate rows in queries all the time, no matter how cleverly you craft this particular query. So I suggest fixing that in the provider table, and then something like this should work:

select p.name, p.address, p.id, sum(s.amount)
  from provider p
  join service s on p.id=s.provider_id
where p.endDate is NULL
group by p.name, p.address, p.id
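
A small sandbox run of this query, sketched with SQLite via Python's `sqlite3` module and invented sample data, shows both sides of the trade-off. It returns exactly one row per provider that has a current row, but a provider whose rows are all end dated (the terminated supplier from the question's EDIT) drops out of the summary even though services are still owed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Column names follow the query above; the rows are hypothetical.
    CREATE TABLE provider (id INTEGER, name TEXT, address TEXT, endDate TEXT);
    CREATE TABLE service  (provider_id INTEGER, amount REAL);
    INSERT INTO provider VALUES (1, 'Acme',     '1 Main St', '2023-12-31');
    INSERT INTO provider VALUES (1, 'Acme LLC', '2 Main St', NULL);
    -- Provider 2 was terminated, so it has no row with a NULL endDate.
    INSERT INTO provider VALUES (2, 'Bolt Inc', '9 Oak Ave', '2024-08-05');
    INSERT INTO service VALUES (1, 100.0), (1, 50.0), (2, 75.0);
""")

rows = conn.execute("""
    SELECT p.name, p.address, p.id, SUM(s.amount)
    FROM provider p
    JOIN service s ON p.id = s.provider_id
    WHERE p.endDate IS NULL
    GROUP BY p.name, p.address, p.id
""").fetchall()

# Providers with services on file that the summary did not return.
paid_ids = {r[2] for r in rows}
billed_ids = {r[0] for r in conn.execute("SELECT DISTINCT provider_id FROM service")}
unpaid = billed_ids - paid_ids

print(rows, unpaid)
```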