How do I calculate a SUM correctly across joins?

Posted on 2024-12-02 03:00:44

So I'm trying to count the number of parts, number of tasks, the quantity in each job and the time that it took to manufacture each job but I'm getting some funky results. If I run this:

SELECT
  j.id,
  mf.special_instructions,
  count(distinct p.id) as number_of_different_parts,
  count(distinct t.id) as number_of_tasks,
  SUM(distinct j.quantity) as number_of_assemblies,
  SUM(l.time_elapsed) as time_elapsed

FROM
  sugarcrm2.mf_job mf
INNER JOIN ramses.jobs j on
  mf.id = j.mf_job_id
INNER JOIN ramses.parts p on
  j.id = p.job_id
INNER JOIN ramses.tasks t on
  p.id = t.part_id
INNER JOIN ramses.batch_log l on
  t.batch_id = l.batch_id

WHERE
  mf.job_description        LIKE "%BACKBLAZE%" OR
  mf.customer_name          LIKE "%BACKBLAZE%" OR
  mf.customer_ref           LIKE "%BACKBLAZE%" OR
  mf.technical_company_name LIKE "%BACKBLAZE%" OR
  mf.description            LIKE "%BACKBLAZE%" OR
  mf.name                   LIKE "%BACKBLAZE%" OR
  mf.enclosure_style        LIKE "%BACKBLAZE%" OR
  mf.special_instructions   LIKE "%BACKBLAZE%"
GROUP BY j.id

and I now get accurate parts and tasks numbers but the time_elapsed sum isn't correct. What could the problem be?

When I try it with distinct I get a veeeeery low number (like something between 1 and 30 when I'm looking for something closer to 10,000.)

UPDATE: here is the create code:

http://pastebin.com/nbhU9rYh

http://pastebin.com/tdmAkNr4

http://pastebin.com/0TFCUaeQ

http://pastebin.com/fugr8C9U

http://pastebin.com/Zq0bKG2L

http://pastebin.com/k5rESUrq

The relationships are like this:

mf_job info is linked to a job
jobs have parts
parts have tasks
tasks are in batches
batch_log is a table with all of the starts and stops for the batches of tasks; it has a start_time, a stop_time and a time_elapsed.

I am trying to get all of the time_elapsed from the batch_log for each mf_job with the word backblaze in one of its fields, along with the number of parts, tasks and assemblies. This all needs to be grouped by job.id or mf_job.id.

攒眉千度 2024-12-09 03:00:44

Try rewriting the query as:

SELECT
  j.id, 
  mf.special_instructions,
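  count(distinct p.id) as number_of_different_parts,  -- distinct: each part row repeats once per task
  count(distinct t.id) as number_of_tasks,
  SUM(distinct j.quantity) as number_of_assemblies,   -- distinct: the job row repeats once per task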
  SEC_TO_TIME(SUM(l.seconds_elapsed)) as time_elapsed

FROM
  sugarcrm2.mf_job mf
INNER JOIN ramses.jobs j on
  mf.id = j.mf_job_id
INNER JOIN ramses.parts p on
  j.id = p.job_id
INNER JOIN ramses.tasks t on
  p.id = t.part_id
INNER JOIN (
            SELECT rl.batch_id
                  , SUM(TIME_TO_SEC(rl.time_elapsed)) as seconds_elapsed
            FROM ramses.batch_log rl 
            GROUP BY rl.batch_id
            ) l ON (t.batch_id = l.batch_id)

WHERE 
  mf.job_description                LIKE "%BACKBLAZE%" OR
  mf.customer_name                  LIKE "%BACKBLAZE%" OR
  mf.customer_ref                   LIKE "%BACKBLAZE%" OR
  mf.technical_company_name         LIKE "%BACKBLAZE%" OR
  mf.description                    LIKE "%BACKBLAZE%" OR
  mf.name                           LIKE "%BACKBLAZE%" OR
  mf.enclosure_style                LIKE "%BACKBLAZE%" OR 
  mf.special_instructions           LIKE "%BACKBLAZE%"
GROUP BY j.id WITH ROLLUP
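
To see what the derived table in the query above buys you, here is a small sketch using Python's built-in sqlite3 module rather than MySQL. The two-table schema and the integer seconds_elapsed column are simplifications assumed for the demo, not the real ramses schema. It shows that joining directly to batch_log repeats a task once per log row, while joining to a per-batch aggregate keeps one row per task and leaves the time total unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, assumed schema: elapsed time stored as integer seconds.
    CREATE TABLE tasks (id INTEGER PRIMARY KEY, batch_id INTEGER);
    CREATE TABLE batch_log (batch_id INTEGER, seconds_elapsed INTEGER);
    INSERT INTO tasks VALUES (1, 10), (2, 20);
    -- Batch 10 was started and stopped twice, batch 20 once.
    INSERT INTO batch_log VALUES (10, 100), (10, 50), (20, 300);
""")

# Direct join: task 1 appears twice because batch 10 has two log rows.
direct = conn.execute("""
    SELECT COUNT(t.id), SUM(l.seconds_elapsed)
    FROM tasks t
    JOIN batch_log l ON l.batch_id = t.batch_id
""").fetchone()

# Derived table: collapse the log to one row per batch before joining.
pre_aggregated = conn.execute("""
    SELECT COUNT(t.id), SUM(l.seconds_elapsed)
    FROM tasks t
    JOIN (SELECT batch_id, SUM(seconds_elapsed) AS seconds_elapsed
          FROM batch_log
          GROUP BY batch_id) l ON l.batch_id = t.batch_id
""").fetchone()

print(direct)          # (3, 450)
print(pre_aggregated)  # (2, 450)
```

Note that this only removes the repetition caused by multiple log rows per batch. If several tasks share one batch, that batch's total is still added once per task.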

梦一生花开无言 2024-12-09 03:00:44

You need to change the query to:

SELECT
  ...
  SEC_TO_TIME(SUM(TIME_TO_SEC(l.time_elapsed))) as time_elapsed

Also, the LIKE '%...' conditions will make the query very slow, because a pattern with a leading wildcard cannot use an index.

If you are able to use MyISAM (or InnoDB on MySQL 5.6 and later), you can put a fulltext index on those columns and use code like:

WHERE MATCH(mf.job_description, mf.customer_name, mf.customer_ref, ...) 
      AGAINST ('BACKBLAZE' IN NATURAL LANGUAGE MODE)

See:
http://dev.mysql.com/doc/refman/5.5/en/fulltext-search.html
http://www.petefreitag.com/item/477.cfm
http://dev.mysql.com/doc/refman/5.0/en/date-and-time-functions.html#function_time-to-sec
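
On the TIME_TO_SEC point in this answer: as far as I know, when MySQL uses a TIME value in a numeric context it reads the digits as an HHMMSS number, so '00:01:30' becomes 130 rather than 90 seconds, and a plain SUM over a TIME column adds those numbers. The sketch below mimics that arithmetic in plain Python. The helper functions are hypothetical re-implementations named after the MySQL functions, and the three log values are made up.

```python
def time_to_sec(hms: str) -> int:
    """Convert 'HH:MM:SS' to a number of seconds."""
    h, m, s = (int(x) for x in hms.split(":"))
    return h * 3600 + m * 60 + s


def sec_to_time(total: int) -> str:
    """Convert a number of seconds back to 'HH:MM:SS'."""
    h, rem = divmod(total, 3600)
    m, s = divmod(rem, 60)
    return f"{h:02d}:{m:02d}:{s:02d}"


def as_hhmmss_number(hms: str) -> int:
    """Read the digits of 'HH:MM:SS' as one number (assumed coercion)."""
    return int(hms.replace(":", ""))


log = ["00:00:45", "00:00:45", "00:01:30"]  # made-up time_elapsed values

naive = sum(as_hhmmss_number(t) for t in log)  # 45 + 45 + 130
correct = sum(time_to_sec(t) for t in log)     # 45 + 45 + 90

print(naive)                 # 220, which looks like 2 min 20 s
print(sec_to_time(correct))  # 00:03:00
```

The gap grows every time a value crosses a minute or hour boundary, which is why the conversion to seconds has to happen inside the SUM and the conversion back outside it.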

↙厌世 2024-12-09 03:00:44

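It sounds like the problem is that multiple tasks can be in the same batch, and/or multiple parts can be in the same task. Say, for example, your job has 3 parts, each with one task, and all 3 tasks are in the same batch. You would be adding that batch's time three times. But distinct doesn't work either, because if you had 5 different batches that all took 300 seconds, they would not be counted as distinct.

In a case like this you can usually use a subquery. Instead of joining directly to batch_log, join to a subquery that selects distinct j.id (or p.job_id), l.batch_id and l.time_elapsed (the first is for joining, the second makes the distinct count come out right, and the third is the actual value to use). Then you can sum l.time_elapsed from there. That way each batch is counted exactly once.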
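
The scenario described above can be reproduced with a small sketch using Python's built-in sqlite3 module. The schema is a cut-down, assumed version of the real tables, with elapsed time stored as integer seconds. One job has three parts and four tasks; three tasks share batch 10 and one is in batch 20, and each batch took 300 seconds, so the true total is 600. The last query is a variant of the suggested subquery: it takes the distinct (job, batch) pairs first and joins batch_log afterwards, so that two log rows with the same elapsed time are not merged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, assumed schema for the demo.
    CREATE TABLE parts (id INTEGER PRIMARY KEY, job_id INTEGER);
    CREATE TABLE tasks (id INTEGER PRIMARY KEY, part_id INTEGER, batch_id INTEGER);
    CREATE TABLE batch_log (batch_id INTEGER, seconds_elapsed INTEGER);
    INSERT INTO parts VALUES (1, 1), (2, 1), (3, 1);
    INSERT INTO tasks VALUES (1, 1, 10), (2, 2, 10), (3, 3, 10), (4, 3, 20);
    INSERT INTO batch_log VALUES (10, 300), (20, 300);
""")

joined = """
    FROM parts p
    JOIN tasks t ON t.part_id = p.id
    JOIN batch_log l ON l.batch_id = t.batch_id
    GROUP BY p.job_id
"""

# Batch 10 is added once per task that belongs to it.
naive = conn.execute("SELECT SUM(l.seconds_elapsed)" + joined).fetchone()[0]

# DISTINCT collapses the two batches because both took 300 seconds.
distinct = conn.execute(
    "SELECT SUM(DISTINCT l.seconds_elapsed)" + joined
).fetchone()[0]

# Reduce to one row per (job, batch) first, then add up the log rows.
once_per_batch = conn.execute("""
    SELECT jb.job_id, SUM(l.seconds_elapsed)
    FROM (SELECT DISTINCT p.job_id, t.batch_id
          FROM parts p
          JOIN tasks t ON t.part_id = p.id) jb
    JOIN batch_log l ON l.batch_id = jb.batch_id
    GROUP BY jb.job_id
""").fetchone()[1]

print(naive)           # 1200
print(distinct)        # 300
print(once_per_batch)  # 600
```

The part and task counts would then come from their own aggregates, or from COUNT(DISTINCT ...) in the outer query, rather than from this time subquery.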