MySQL query problem, I can't get it
I hope someone can help me out. I have a table that logs our import jobs. I need a query that will produce a matrix with the names of tables on the vertical axis, the import dates on the horizontal axis, and the total number of records imported for that table on that date in the matrix cell. I don't care if we have to create a temporary table, but the whole thing must be done in MySQL.
Below is a simplified sample of our event log table. Not only does it have many more fields, but we import many more tables. Therefore, the solution should account for querying the table names. You will notice that data can be imported into a table more than once per day, as in records 5 and 6.
id | table_name | import_date         | num_recs
---+------------+---------------------+---------
 0 | customer   | 2010-06-20 00:00:00 |       10
 1 | order      | 2010-06-20 00:00:00 |       15
 2 | customer   | 2010-06-21 00:00:00 |        5
 3 | order      | 2010-06-21 00:00:00 |        6
 4 | customer   | 2010-06-22 00:00:00 |        1
 5 | order      | 2010-06-22 00:00:00 |        6
 6 | order      | 2010-06-22 00:00:00 |        1
We are looking for a result something like this. It does not have to be exact.
table_name | 06-20 | 06-21 | 06-22
-----------+-------+-------+------
customer   |    10 |     5 |     1
order      |    15 |     6 |     7
Comments (3)
What about output of the form:
This way you can do it with a simple GROUP BY. Otherwise, your query is going to be really nasty.
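A minimal sketch of what such a long-format query could look like against the sample table in the question: one row per table and day, rather than one column per day. The table name import_log is an assumption (the question never names the table); the column names are taken from the sample.

```sql
-- Hypothetical table name: import_log (the question does not name the table).
-- Columns table_name, import_date and num_recs come from the question's sample.
SELECT table_name,
       DATE(import_date) AS import_day,   -- drop the time part so each day forms one group
       SUM(num_recs)     AS total_recs    -- adds up multiple imports on the same day
FROM import_log
GROUP BY table_name, DATE(import_date)
ORDER BY table_name, import_day;
```

On the seven sample rows this would return six rows, with the two imports into order on 2010-06-22 summed to 7.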
MySQL cannot do pivot queries directly, but you can do it in two queries, using the result of the first query as the SQL for the next query:
Then execute the output of that query to get your final results, e.g. for your example you would get:
You can either write a stored procedure to concatenate, prepare, and then execute the results of the first query, or, if this is all run from a shell script, you can capture the results of the first query and then feed them back into mysql.
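One way the two-query idea could be sketched entirely inside MySQL is with GROUP_CONCAT and a prepared statement. This is an illustration under assumptions, not necessarily the queries the answer had in mind: the table name import_log and the variable names @cols and @sql are hypothetical, and the columns are those from the question's sample.

```sql
-- Hypothetical table name: import_log. @cols and @sql are illustrative names.
-- GROUP_CONCAT output is cut off at group_concat_max_len (default 1024),
-- so raise it for the session before building a long column list.
SET SESSION group_concat_max_len = 100000;

-- Query 1: generate one SUM(CASE ...) column per distinct import day.
SELECT GROUP_CONCAT(
         CONCAT('SUM(CASE WHEN DATE(import_date) = ''', d.import_day,
                ''' THEN num_recs ELSE 0 END) AS `',
                DATE_FORMAT(d.import_day, '%m-%d'), '`')
         ORDER BY d.import_day
         SEPARATOR ', ')
INTO @cols
FROM (SELECT DISTINCT DATE(import_date) AS import_day FROM import_log) AS d;

-- Query 2: wrap the generated column list in the pivot query and run it.
SET @sql = CONCAT('SELECT table_name, ', @cols,
                  ' FROM import_log GROUP BY table_name ORDER BY table_name');
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
```

If the table is empty, @cols is NULL and the PREPARE step fails, so a stored procedure wrapping this would want to check for that first. The '%m-%d' labels match the layout asked for in the question, but they repeat once the data spans more than one year; '%Y-%m-%d' would avoid that.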
I think Ben S is on the right track. I wanted to offer what I could here in case it helps anyone, who knows. Original source
Here is a method to take two arbitrary dates, split the span between them into blocks of time, and then perform some aggregation function on other data in each block. In your case, that block should probably be a single day, the start date would likely be 30 days prior to the current day, and the end date would likely be the current day. Each block can be returned with some aggregate metric of interest. In your case, this will likely be SUM(`imports`).
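-- Replace the <...> placeholders with your own values.
-- day_range is used instead of RANGE, which is a reserved word in MySQL 8.0.
SELECT t1.table_name AS table_name, t1.day_range AS day_range, t1.imports AS imports
FROM (SELECT table_name,
             SUM(`imports`) AS imports,
             -- 86400 = seconds in one day, so day_range counts whole days back from <now>
             CEIL((UNIX_TIMESTAMP('<now>') - UNIX_TIMESTAMP(`import_date`)) / 86400) AS day_range
      FROM `<your table>`
      WHERE `import_date` BETWEEN '<now minus 30 days>' AND '<now>'
      GROUP BY table_name, day_range
      ORDER BY day_range DESC) AS t1;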
This might not help at all, but if it does then goody. It's easily modified to return the starting day for each range as a date column. To be clear, this does the exact same thing that Ben S's solution offers, but it will work if all of your dates are not 00:00:00, whereas that would cause his GROUP BY on the date column to fail. To see what the return would look like, see Ben S's answer and mentally remove the date column. As I said however, that column could easily be added back into this query. FWIW, I have used this method on tables with upwards of 4 million rows and it still runs in < 1 second, which was good enough for my purposes.
Hamy
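As a rough sketch of the modification mentioned above (returning the start of each range as a date column), the bucketed query could be written with concrete values as follows. The table name import_log is hypothetical, num_recs is the count column from the question's sample rather than the imports column used in this answer, and the aliases b, days_ago, and range_start are illustrative.

```sql
-- Hypothetical table name: import_log. num_recs is the question's count column.
-- NOW() is evaluated once per statement, so every row is bucketed against the same instant.
SELECT b.table_name,
       NOW() - INTERVAL b.days_ago DAY AS range_start,  -- start of each 24-hour block
       b.total_recs
FROM (
    SELECT table_name,
           -- whole 24-hour blocks (86400 seconds) counted back from now
           CEIL((UNIX_TIMESTAMP(NOW()) - UNIX_TIMESTAMP(import_date)) / 86400) AS days_ago,
           SUM(num_recs) AS total_recs
    FROM import_log
    WHERE import_date BETWEEN NOW() - INTERVAL 30 DAY AND NOW()
    GROUP BY table_name, days_ago
) AS b
ORDER BY b.table_name, b.days_ago DESC;
```

The blocks here are 24-hour windows measured back from the moment the query runs, not calendar days, so they only line up with midnight boundaries if the query is run at midnight.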