带条件的多个键的左连接

发布于 2025-01-13 04:26:55 字数 1324 浏览 5 评论 0原文

如何为 Table1 (dps) 中出现的每一行返回一行，连接 Table2(dsd) 中的列（如果它们存在）？

可以使用 x, y , id 完成连接（dsd.id 在 Table2 上可以为 NULL）。对于每一对x,y 可以有多个id。我需要按时间范围进行过滤，time 列仅出现在 Table2 (dsd) 中。

我在 WHERE 子句中添加了 s.time IS NULL，因为 Table2 (dsd) 中的每一行都没有 Table1 中的行 (dps）。但我想删除 Table2 中 dsd.time IS NULL 的行。

SELECT
      s.time
      , dps.x
      , dps.y
      , dps.id AS metric
      , dps.anzahl AS produced
      , s.read_tags AS read_tags
    FROM
      dps
     LEFT JOIN  
        (SELECT MAX(dsd.time) AS time, dsd.x, dsd.y, dsd.id, COUNT(DISTINCT dsd.tagid) AS read_tags 
          FROM dsd
          WHERE (time IS NOT NULL)
          GROUP BY dsd.x, dsd.y, dsd.id
        ) AS s
      ON ( s.x = dps.x AND s.y = dps.y AND (s.id = dps.id OR s.id IS NULL) )           
    WHERE s.time BETWEEN valueA AND valueB
        OR s.time IS NULL
    ORDER BY  s.time

也许不同类型的连接更好？

编辑：现在我只需要按 s 查询的 sx 和 sy 的最小值和最大值过滤结果，所以我只得到时间范围内的结果（对于 s.time 为 NULL 的情况）。因此，dps.x 应介于 MIN(sx) 和 MAX(sx) 之间，并且 dps.y 也应如此>

原文

How can I return one row for every row appearing in Table1 (dps) joining columns from Table2(dsd) in case they exist?

The join can be done with x, y , id (dsd.id can be NULL on Table2).
For every x,y pair can be more than one id.
I need to filter by a time range with time column only appearing in Table2 (dsd).

I added s.time IS NULL in the WHERE clause because there are not rows in Table2 (dsd) for every row in Table1 (dps). But I would like to remove rows in Table2 where the dsd.time IS NULL.

SELECT
      s.time
      , dps.x
      , dps.y
      , dps.id AS metric
      , dps.anzahl AS produced
      , s.read_tags AS read_tags
    FROM
      dps
     LEFT JOIN  
        (SELECT MAX(dsd.time) AS time, dsd.x, dsd.y, dsd.id, COUNT(DISTINCT dsd.tagid) AS read_tags 
          FROM dsd
          WHERE (time IS NOT NULL)
          GROUP BY dsd.x, dsd.y, dsd.id
        ) AS s
      ON ( s.x = dps.x AND s.y = dps.y AND (s.id = dps.id OR s.id IS NULL) )           
    WHERE s.time BETWEEN valueA AND valueB
        OR s.time IS NULL
    ORDER BY  s.time

Maybe a different type of join is better?

EDIT: now I just need to filter the results by min&max of s.x and s.y of the s query, so I only get results within the time range (for those cases where s.time is NULL). So dps.x should be between MIN(s.x) and MAX(s.x) and the same for dps.y

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

雨夜星沙 2025-01-20 04:26:55

除了 ID 匹配或 NULL 的额外 LEFT JOIN 条件之外，您最初的尝试看起来非常接近。

为了澄清，JOIN 子句说明了表如何相关。在您的例子中，您试图匹配 3 个可能的唯一关键部分：X、Y 和 ID。然后，通过拥有 LEFT JOIN 是正确的意图和上下文，您想要第一个（左）表中的所有内容，无论右侧是否匹配（子查询的别名）。 WHERE 子句是您只需要 NULL 检查的地方，并且您通过检查 null 时间来完成此操作。如果 X、Y 或 ID 作为一个集合没有匹配项，则 Time 列（或 s 别名中的任何列）将为空。您不需要只检查 ID 列是否为空。

但是，如果存在匹配，那么您将有一个时间，从而根据设置的 A 和 B 参数值范围进行检查。

SELECT
        s.time, 
        dps.x, 
        dps.y, 
        dps.id metric, 
        dps.anzahl produced, 
        s.read_tags 
    FROM
        dps
            LEFT JOIN  
            ( SELECT 
                    dsd.x, 
                    dsd.y, 
                    dsd.id, 
                    MAX(dsd.time) AS time, 
                    COUNT(DISTINCT dsd.tagid) AS read_tags 
                FROM 
                    dsd
                WHERE 
                    dsd.time IS NOT NULL
                GROUP BY 
                    dsd.x, 
                    dsd.y, 
                    dsd.id ) s
                    ON  dps.x = s.x  
                    AND dps.y = s.y 
                    AND dps.id = s.id
    WHERE
            s.time IS NULL
        OR s.time BETWEEN valueA AND valueB
    ORDER BY
        s.time

最后，按顺序。您可以选择如何优化。就像现在一样，由于您按时间排序，所以行几乎可以按任何顺序返回，除了所有空值可能首先浮到顶部，因为它们缺少值，而发现的所有其他行都有值。

如果您希望首先所有匹配项，您可能希望通过

ORDER BY
   case when s.time is null then 2 else 1 end,
   s.time

这种方式排序，它强制所有 NULL 记录到结果集的第二部分（通过 then 2），否则，首先列出所有其他具有匹配项的值（通过否则1）。

我不太明白你所说的“X”值的最小值/最大值是什么意思。然而，这应该只是在你的外部 WHERE 子句中，比如

WHERE
       (existing criteria)
   AND dps.x >= SomeValue
   AND dps.x <= SomeOtherLimitValue

我不知道你会如何考虑 MIN(sx) 或 MAX(sx) 。除了限制你想要的输出之外，这的基础是什么专注于特定范围。

Your original attempt appeared very close except for the extra the LEFT JOIN condition of the ID match or NULL.

To clarify, the JOIN clause is stating how the tables are related. In your case, you were trying to match up on the 3 possible unique key parts, the X, Y and ID. Then, by having the LEFT JOIN is proper intent and context that you wanted everything from the first (left) table regardless of a match in the right (alias s of your subquery). The WHERE clause is where you just needed the NULL check, and you did that with your check on the null time. If there was no match on X, Y or ID as a set, the Time column (or any column from the s alias) would have been null. You did not need to just check on the ID column being null.

However, if there WAS a match, then you WOULD have a time and thus being checked against your A and B parameter value range as set.

SELECT
        s.time, 
        dps.x, 
        dps.y, 
        dps.id metric, 
        dps.anzahl produced, 
        s.read_tags 
    FROM
        dps
            LEFT JOIN  
            ( SELECT 
                    dsd.x, 
                    dsd.y, 
                    dsd.id, 
                    MAX(dsd.time) AS time, 
                    COUNT(DISTINCT dsd.tagid) AS read_tags 
                FROM 
                    dsd
                WHERE 
                    dsd.time IS NOT NULL
                GROUP BY 
                    dsd.x, 
                    dsd.y, 
                    dsd.id ) s
                    ON  dps.x = s.x  
                    AND dps.y = s.y 
                    AND dps.id = s.id
    WHERE
            s.time IS NULL
        OR s.time BETWEEN valueA AND valueB
    ORDER BY
        s.time

Finally, the order by. Your choice on how to optimize that. As it is now, since you order by the time, the rows could come back in almost any order except that all the null might float to the top first because they lack a value vs all others that ARE found have a value.

If you want all matches first, you might want to order by

ORDER BY
   case when s.time is null then 2 else 1 end,
   s.time

This way, it forces all NULL records to the SECOND part of the result set (via then 2), otherwise, all other values that DO have a match are listed first (via else 1).

I dont quite get what you mean about a min/max of "X" value. However, that should just be at your outer WHERE clause, such as

WHERE
       (existing criteria)
   AND dps.x >= SomeValue
   AND dps.x <= SomeOtherLimitValue

I don't know how you would be considering a MIN(s.x) or MAX(s.x).. What is the basis of this other than to limit your output of wanting to focus in on a specific range.

回复收藏 0 原文

~没有更多了~