MySQL join where not exists

Posted 2024-07-17 06:29:45

I have a MySQL query that joins two tables

  • Voters
  • Households

They join on voter.House_ID and household.id.

Now I need to modify it so that the voter table is also joined to a third table called elimination, on voter.id and elimination.voter_id. The catch is that I want to exclude any records in the voter table that have a corresponding record in the elimination table.

How do I craft a query to do this?

This is my current query:

SELECT `voter`.`ID`, `voter`.`Last_Name`, `voter`.`First_Name`,
       `voter`.`Middle_Name`, `voter`.`Age`, `voter`.`Sex`,
       `voter`.`Party`, `voter`.`Demo`, `voter`.`PV`,
       `household`.`Address`, `household`.`City`, `household`.`Zip`
FROM (`voter`)
JOIN `household` ON `voter`.`House_ID`=`household`.`id`
WHERE `CT` = '5'
AND `Precnum` = 'CTY3'
AND  `Last_Name`  LIKE '%Cumbee%'
AND  `First_Name`  LIKE '%John%'
ORDER BY `Last_Name` ASC
LIMIT 30 


Comments (4)

述情 2024-07-24 06:29:45

I'd probably use a LEFT JOIN, which will return rows even if there's no match, and then you can select only the rows with no match by checking for NULLs.

So, something like:

SELECT V.*
FROM voter V LEFT JOIN elimination E ON V.id = E.voter_id
WHERE E.voter_id IS NULL

Whether that's more or less efficient than using a subquery depends on optimization, indexes, whether it's possible to have more than one elimination per voter, etc.
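As a rough sketch of how the LEFT JOIN / IS NULL pattern behaves when a voter has more than one elimination row, here is a small runnable check. The sample rows and the trimmed-down column list are made up for illustration, and SQLite (via Python's sqlite3 module) stands in for MySQL, so it says nothing about MySQL performance.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE voter (id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE elimination (id INTEGER PRIMARY KEY, voter_id INTEGER);
    INSERT INTO voter VALUES (1, 'Cumbee'), (2, 'Smith'), (3, 'Jones');
    -- voter 2 is eliminated twice, voter 3 once
    INSERT INTO elimination (voter_id) VALUES (2), (2), (3);
""")

# Plain LEFT JOIN: one row per match, plus one NULL-padded row per unmatched voter.
joined_count = conn.execute("""
    SELECT COUNT(*)
    FROM voter V LEFT JOIN elimination E ON V.id = E.voter_id
""").fetchone()[0]

# Keeping only the NULL-padded rows leaves the voters with no elimination.
anti_join_rows = conn.execute("""
    SELECT V.id, V.last_name
    FROM voter V LEFT JOIN elimination E ON V.id = E.voter_id
    WHERE E.voter_id IS NULL
    ORDER BY V.id
""").fetchall()

print(joined_count)    # 4
print(anti_join_rows)  # [(1, 'Cumbee')]
```

The duplicate eliminations inflate the intermediate join, but they cannot leak into the filtered result, because only unmatched voters produce a NULL `E.voter_id`.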

默嘫て 2024-07-24 06:29:45

I'd use a 'where not exists' -- exactly as you suggest in your title:

SELECT `voter`.`ID`, `voter`.`Last_Name`, `voter`.`First_Name`,
       `voter`.`Middle_Name`, `voter`.`Age`, `voter`.`Sex`,
       `voter`.`Party`, `voter`.`Demo`, `voter`.`PV`,
       `household`.`Address`, `household`.`City`, `household`.`Zip`
FROM (`voter`)
JOIN `household` ON `voter`.`House_ID`=`household`.`id`
WHERE `CT` = '5'
AND `Precnum` = 'CTY3'
AND  `Last_Name`  LIKE '%Cumbee%'
AND  `First_Name`  LIKE '%John%'

AND NOT EXISTS (
  SELECT * FROM `elimination`
   WHERE `elimination`.`voter_id` = `voter`.`ID`
)

ORDER BY `Last_Name` ASC
LIMIT 30

That may be marginally faster than doing a left join (depending, of course, on your indexes, the cardinality of your tables, etc.), and is almost certainly much faster than using IN.
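To see the NOT EXISTS filter working alongside the existing household join, here is a minimal sketch. The tables are cut down to a few columns, the rows are invented, and SQLite (through Python's sqlite3 module) is assumed as a stand-in for MySQL, so it checks the result only, not the speed.

```python
import sqlite3

# Hypothetical, cut-down schema and rows; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE household (id INTEGER PRIMARY KEY, Address TEXT);
    CREATE TABLE voter (ID INTEGER PRIMARY KEY, Last_Name TEXT,
                        First_Name TEXT, House_ID INTEGER);
    CREATE TABLE elimination (voter_id INTEGER);
    INSERT INTO household VALUES (10, '1 Main St'), (11, '2 Oak Ave');
    INSERT INTO voter VALUES (1, 'Cumbee', 'John', 10),
                             (2, 'Cumbee', 'Johnny', 11);
    INSERT INTO elimination VALUES (2);
""")

# The inner join to household is untouched; NOT EXISTS only removes
# voters that have at least one row in elimination.
not_exists_rows = conn.execute("""
    SELECT voter.ID, voter.First_Name, household.Address
    FROM voter
    JOIN household ON voter.House_ID = household.id
    WHERE voter.Last_Name LIKE '%Cumbee%'
      AND NOT EXISTS (
        SELECT 1 FROM elimination
        WHERE elimination.voter_id = voter.ID
      )
    ORDER BY voter.ID
""").fetchall()

print(not_exists_rows)  # [(1, 'John', '1 Main St')]
```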

源来凯始玺欢你 2024-07-24 06:29:45

There are three possible ways to do that.

  1. Option

    SELECT  lt.*
    FROM    table_left lt
    LEFT JOIN
        table_right rt
    ON      rt.value = lt.value
    WHERE   rt.value IS NULL
    
  2. Option

    SELECT  lt.*
    FROM    table_left lt
    WHERE   lt.value NOT IN
    (
    SELECT  value
    FROM    table_right rt
    )
    
  3. Option

    SELECT  lt.*
    FROM    table_left lt
    WHERE   NOT EXISTS
    (
    SELECT  NULL
    FROM    table_right rt
    WHERE   rt.value = lt.value
    )
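
The three options return the same rows as long as `table_right.value` contains no NULLs. Once it does, standard SQL three-valued logic makes NOT IN return nothing, while the other two are unaffected. A small sketch of that, with invented sample values and SQLite (via Python's sqlite3 module) assumed as a stand-in for MySQL:

```python
import sqlite3

# Hypothetical sample values; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_left (value INTEGER);
    CREATE TABLE table_right (value INTEGER);
    INSERT INTO table_left VALUES (1), (2), (3);
    INSERT INTO table_right VALUES (2);
""")

QUERIES = {
    "left_join": """
        SELECT lt.value FROM table_left lt
        LEFT JOIN table_right rt ON rt.value = lt.value
        WHERE rt.value IS NULL
        ORDER BY lt.value""",
    "not_in": """
        SELECT lt.value FROM table_left lt
        WHERE lt.value NOT IN (SELECT rt.value FROM table_right rt)
        ORDER BY lt.value""",
    "not_exists": """
        SELECT lt.value FROM table_left lt
        WHERE NOT EXISTS (SELECT NULL FROM table_right rt
                          WHERE rt.value = lt.value)
        ORDER BY lt.value""",
}

def run_all():
    """Run each anti-join variant and return its values as a list."""
    return {name: [row[0] for row in conn.execute(sql)]
            for name, sql in QUERIES.items()}

before = run_all()   # no NULLs on the right: all three agree
conn.execute("INSERT INTO table_right VALUES (NULL)")
after = run_all()    # a NULL on the right: NOT IN matches nothing

print(before)
print(after)
```

If the right-hand column is nullable, adding `WHERE rt.value IS NOT NULL` inside the NOT IN subquery brings option 2 back in line with the other two.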
    
各自安好 2024-07-24 06:29:45

Be wary of LEFT JOINs: a LEFT JOIN is an OUTER JOIN, and different RDBMS query parsers and optimizers may handle outer joins very differently. Take, for instance, how MySQL's query optimizer simplifies LEFT (OUTER) JOINs, and how the resulting execution plans can differ:

https://dev.mysql.com/doc/refman/8.0/en/outer-join-simplification.html

IMO the execution plan a LEFT JOIN ends up with is harder to predict than an inner join's, so I avoid them in production code where I can.

I prefer to write JOIN-type statements in a more "old school" way first, leaving out any specific JOIN declarations. Let the RDBMS query parser do what it's designed to do: analyze your statement and translate it into the best execution plan it can find, based on its evaluation of your index stats and data model design. That said, the built-in query parsers and optimizers can get it wrong; trust me, I've seen it happen many times. In general, I find that taking this approach first provides enough baseline information to make informed tuning decisions in most cases.

To illustrate - using the question query from this thread:

SELECT `voter`.`ID`, `voter`.`Last_Name`, `voter`.`First_Name`,
       `voter`.`Middle_Name`, `voter`.`Age`, `voter`.`Sex`,
       `voter`.`Party`, `voter`.`Demo`, `voter`.`PV`,
       `household`.`Address`, `household`.`City`, `household`.`Zip`
FROM (`voter`)
JOIN `household` ON `voter`.`House_ID`=`household`.`id`
WHERE `CT` = '5'
AND `Precnum` = 'CTY3'
AND  `Last_Name`  LIKE '%Cumbee%'
AND  `First_Name`  LIKE '%John%'

AND NOT EXISTS (
  SELECT * FROM `elimination`
   WHERE `elimination`.`voter_id` = `voter`.`ID`
)

ORDER BY `Last_Name` ASC
LIMIT 30

Consider it rewritten without the explicit JOIN and NOT EXISTS statements above (assuming the unqualified columns in the WHERE clause belong to the voter table):

SELECT v.`ID`, v.`Last_Name`, v.`First_Name`,
       v.`Middle_Name`, v.`Age`, v.`Sex`,
       v.`Party`, v.`Demo`, v.`PV`,
       h.`Address`, h.`City`, h.`Zip`
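FROM `voter` v, `household` h
WHERE v.`House_ID` = h.`id`
-- exclusion kept as a subquery: comma-joining `elimination` on v.`ID` != e.`voter_id` would multiply rows instead of excluding voters
AND v.`ID` NOT IN (SELECT e.`voter_id` FROM `elimination` e WHERE e.`voter_id` IS NOT NULL)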
AND v.`CT` = '5'
AND v.`Precnum` = 'CTY3'
AND  v.`Last_Name`  LIKE '%Cumbee%'
AND  v.`First_Name`  LIKE '%John%'
ORDER BY v.`Last_Name` ASC
LIMIT 30;
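
One thing worth checking when comparing results in this style: the exclusion has to stay a subquery. Comma-joining `elimination` with a `v.ID != e.voter_id` condition pairs each voter with every non-matching elimination row rather than excluding anyone, and it returns nothing at all when `elimination` is empty. A minimal sketch, with invented rows, a cut-down voter table, and SQLite (via Python's sqlite3 module) assumed as a stand-in for MySQL:

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE voter (ID INTEGER PRIMARY KEY, Last_Name TEXT);
    CREATE TABLE elimination (voter_id INTEGER);
    INSERT INTO voter VALUES (1, 'Cumbee'), (2, 'Smith'), (3, 'Jones');
    INSERT INTO elimination VALUES (2), (3);
""")

# Comma join with an inequality: a filtered cross product, not an anti-join.
COMMA_JOIN_NE = """
    SELECT v.ID FROM voter v, elimination e
    WHERE v.ID != e.voter_id
    ORDER BY v.ID
"""

# Exclusion as a subquery, still without JOIN or NOT EXISTS keywords.
NOT_IN = """
    SELECT v.ID FROM voter v
    WHERE v.ID NOT IN (SELECT e.voter_id FROM elimination e
                       WHERE e.voter_id IS NOT NULL)
    ORDER BY v.ID
"""

def ids(sql):
    """Return the voter IDs produced by a query."""
    return [row[0] for row in conn.execute(sql)]

ne_rows = ids(COMMA_JOIN_NE)   # eliminated voters 2 and 3 still appear
not_in_rows = ids(NOT_IN)      # only the non-eliminated voter

conn.execute("DELETE FROM elimination")
ne_rows_empty = ids(COMMA_JOIN_NE)   # cross product with an empty table
not_in_rows_empty = ids(NOT_IN)      # nobody is excluded

print(ne_rows, not_in_rows)              # [1, 1, 2, 3] [1]
print(ne_rows_empty, not_in_rows_empty)  # [] [1, 2, 3]
```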

Try writing some of your future SQL queries both ways, compare their results, and see what you think. Writing SQL in the style suggested above also has the added benefit of being more RDBMS-agnostic.

Cheers!
