Very complex Group By / Unique / Limit by SQL command
I actually don't even know what to call this :P, but...
I have one table, let's call it "uploads"
id owner date
-----------------------------
0 foo 20100101120000
1 bar 20100101120300
2 foo 20100101120400
3 bar 20100101120600
.. .. ..
6 foo 20100101120800
Now, when I do something like:
SELECT id, owner, date FROM uploads ORDER BY date DESC
This would result in:
id owner date
-----------------------------
6 foo 20100101120800
.. .. ..
3 bar 20100101120600
2 foo 20100101120400
1 bar 20100101120300
0 foo 20100101120000
Question: Nice, but I want to go even further. When you build a timeline from this (and I did :P), you are 'spammed' by messages saying foo and bar uploaded something. I'd like to group them and return only the first result within a time limit of '500' on the date field.
What kind of SQL-command do I need that would result in:
id owner date
-----------------------------
6 foo 20100101120800
3 bar 20100101120600
0 foo 20100101120000
Then, after that, I can run a query for each record to get the associated records within a timeframe of 5 minutes (this is an example for id=6):
SELECT id FROM uploads WHERE date>=20100101120800-500 ORDER BY date DESC
Does anyone know how I should do the first step? (i.e. limiting/grouping the results)
(By the way, I know that if I want to use this, I should convert every date (YmdHis=60) to Unix-time (=100), but I don't need the 5 minutes to be exactly 5 minutes; they may be a minute less sometimes...)
Comments (3)
I'm not quite clear on the result you are trying to get, even with your examples. Perhaps something with rounding and group by.
You may want to use FLOOR or CEILING instead of ROUND, depending on what you want.
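The query this answer refers to is not shown in the thread, so what follows is only a minimal sketch of the rounding-plus-GROUP BY idea, not the answerer's actual SQL. It runs the query through Python's built-in sqlite3 module on the sample rows from the question (the elided `..` rows are left out). Grouping by both owner and `date / 500` is an assumption; on integer columns SQLite's `/` truncates, which behaves like FLOOR for these positive values.

```python
import sqlite3

# Sample rows from the question; the elided ".." rows are not filled in.
ROWS = [
    (0, "foo", 20100101120000),
    (1, "bar", 20100101120300),
    (2, "foo", 20100101120400),
    (3, "bar", 20100101120600),
    (6, "foo", 20100101120800),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE uploads (id INTEGER PRIMARY KEY, owner TEXT, date INTEGER)")
conn.executemany("INSERT INTO uploads VALUES (?, ?, ?)", ROWS)

# Assumption: one timeline entry per owner per fixed 500-unit bucket.
# Integer division truncates, so date / 500 acts as a bucket number.
BUCKETED_SQL = """
SELECT owner,
       MAX(date) AS latest,
       COUNT(*)  AS n
FROM uploads
GROUP BY owner, date / 500
ORDER BY latest DESC
"""

timeline = conn.execute(BUCKETED_SQL).fetchall()
for owner, latest, n in timeline:
    print(owner, latest, n)
```

Note that fixed buckets start at multiples of 500, not at each upload, so two uploads only 200 apart can still land in different buckets. On this sample it yields four entries rather than the three rows asked for in the question.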
Standard SQL doesn't deal with intervals very well.
You are going to need to do a self-join of the table to compare dates of different tuples.
That way, you can easily find all pairs of tuples of which the dates are no more than 500 apart.
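A minimal sketch of such a self-join, assuming SQLite via Python's sqlite3 module and the sample rows from the question (elided rows omitted). The aliases `a` and `b` and the `a.id < b.id` condition, which keeps each pair only once, are illustrative choices. The difference is taken on the raw YmdHis numbers, as in the question, so it is only approximately a time difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE uploads (id INTEGER PRIMARY KEY, owner TEXT, date INTEGER)")
conn.executemany(
    "INSERT INTO uploads VALUES (?, ?, ?)",
    [
        (0, "foo", 20100101120000),
        (1, "bar", 20100101120300),
        (2, "foo", 20100101120400),
        (3, "bar", 20100101120600),
        (6, "foo", 20100101120800),
    ],
)

# Join the table to itself and keep pairs whose dates differ by at most 500.
PAIRS_SQL = """
SELECT a.id, b.id
FROM uploads AS a
JOIN uploads AS b
  ON a.id < b.id
 AND ABS(a.date - b.date) <= 500
ORDER BY a.id, b.id
"""

pairs = conn.execute(PAIRS_SQL).fetchall()
print(pairs)
```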
However, you really want to cluster the dates in sets no more than 500 apart - and that can't be expressed in SQL at all, as far as I know.
What you can do is something quite similar: split the total time interval into fixed 500-unit ranges, and then cluster all tuples in the table based on the interval they're in. For that, you first need a table or query result with the start times of the intervals; this can be created using a SQL query on your table and a function that either "rounds off" a timestamp to the starting time in its interval, or computes its interval sequence number. Then as a second step you can join the table with that result to group its timestamps according to their corresponding start time. I can't give the SQL because it's DBMS-dependent, and I certainly can't tell you if this is the best way of accomplishing what you want in your situation.
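The two steps described above could look roughly like this in SQLite, run here through Python's sqlite3 module on the question's sample rows (elided rows omitted). This is a sketch under assumptions: `bucket_start` is a made-up column name, and `(date / 500) * 500` relies on SQLite's truncating integer division to "round off" a timestamp to the start of its interval. Other DBMSs would need their own FLOOR or integer-division syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE uploads (id INTEGER PRIMARY KEY, owner TEXT, date INTEGER)")
conn.executemany(
    "INSERT INTO uploads VALUES (?, ?, ?)",
    [
        (0, "foo", 20100101120000),
        (1, "bar", 20100101120300),
        (2, "foo", 20100101120400),
        (3, "bar", 20100101120600),
        (6, "foo", 20100101120800),
    ],
)

# Step 1 (inner query): the distinct interval start times present in the table.
# Step 2 (join): attach every upload to the interval that contains its date.
CLUSTER_SQL = """
SELECT s.bucket_start, u.id, u.owner, u.date
FROM (SELECT DISTINCT (date / 500) * 500 AS bucket_start FROM uploads) AS s
JOIN uploads AS u
  ON u.date >= s.bucket_start
 AND u.date <  s.bucket_start + 500
ORDER BY s.bucket_start DESC, u.date DESC
"""

clusters = conn.execute(CLUSTER_SQL).fetchall()
for row in clusters:
    print(row)
```

On the sample data this gives two intervals: one starting at 20100101120500 (ids 6 and 3) and one starting at 20100101120000 (ids 2, 1 and 0).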
Use an inline view? e.g. something like
should return all the posts made within 5 minutes of 'foo' making a post.
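The SQL for this answer is missing from the thread, so the following is a guess at what an inline view might look like here, not the answerer's query. It assumes SQLite via Python's sqlite3 module, the question's sample rows (elided rows omitted), and the same "500 before the upload" window as the question's own follow-up query. The alias `f` and the columns `foo_id` and `foo_date` are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE uploads (id INTEGER PRIMARY KEY, owner TEXT, date INTEGER)")
conn.executemany(
    "INSERT INTO uploads VALUES (?, ?, ?)",
    [
        (0, "foo", 20100101120000),
        (1, "bar", 20100101120300),
        (2, "foo", 20100101120400),
        (3, "bar", 20100101120600),
        (6, "foo", 20100101120800),
    ],
)

# The inline view (subquery in FROM) selects foo's uploads; the outer query
# joins every upload whose date falls in the 500 units leading up to each one.
INLINE_VIEW_SQL = """
SELECT f.foo_id, u.id
FROM (SELECT id AS foo_id, date AS foo_date
      FROM uploads
      WHERE owner = 'foo') AS f
JOIN uploads AS u
  ON u.date BETWEEN f.foo_date - 500 AND f.foo_date
ORDER BY f.foo_date DESC, u.date DESC
"""

near_foo = conn.execute(INLINE_VIEW_SQL).fetchall()
print(near_foo)
```

Each result row pairs one of foo's uploads with an upload inside its window, so the per-record follow-up queries from the question collapse into a single statement.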