Very complex Group By / Unique / Limit by SQL command

Posted on 2024-09-28 18:41:53

I actually don't even know how to call this :P, but...

I have one table, let's call it "uploads"

id    owner    date
-----------------------------
0     foo      20100101120000
1     bar      20100101120300
2     foo      20100101120400
3     bar      20100101120600
..    ..       ..
6     foo      20100101120800

Now, when I run something like:

SELECT id, owner, date FROM uploads ORDER BY date DESC

This would result in:

id    owner    date
-----------------------------
6     foo      20100101120800
..    ..       ..
3     bar      20100101120600
2     foo      20100101120400
1     bar      20100101120300
0     foo      20100101120000

Question: Nice, but I want to go even further. When you build a timeline from this (and I did :P), you are 'spammed' by messages saying that foo and bar uploaded something. I'd like to group them and return the first result of each group, using a time limit of '500' on the date field.

What kind of SQL-command do I need that would result in:

id    owner    date
-----------------------------
6     foo      20100101120800
3     bar      20100101120600
0     foo      20100101120000

Then, after that, I can run a query for each record to get the associated records within a timeframe of 5 minutes (this is an example for id=6):

SELECT id FROM uploads WHERE date>=20100101120800-500 ORDER BY date DESC

Does anyone know how I should do the first step? (so limiting/grouping the results)

(By the way, I know that to use this properly I should convert every date from YmdHis (=60) to Unix time (=100), but I don't need the 5 minutes to be exactly 5 minutes; they may be a minute less sometimes...)
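To make the caveat about YmdHis arithmetic concrete, here is a small sketch comparing plain integer subtraction of 500 with a true five-minute subtraction. The helper names `naive_window_start` and `exact_window_start` are made up for this illustration and do not come from the question.

```python
from datetime import datetime, timedelta

FMT = "%Y%m%d%H%M%S"  # same layout as the YmdHis values in the table


def naive_window_start(stamp: int) -> int:
    """Lower bound as used in the question: plain integer subtraction."""
    return stamp - 500


def exact_window_start(stamp: int) -> int:
    """Lower bound exactly five minutes earlier, computed on real datetimes."""
    moment = datetime.strptime(str(stamp), FMT) - timedelta(minutes=5)
    return int(moment.strftime(FMT))


# Away from an hour boundary both approaches agree.
print(naive_window_start(20100101120800), exact_window_start(20100101120800))

# Just after the hour they diverge: the naive bound is not a valid time.
print(naive_window_start(20100101120200), exact_window_start(20100101120200))
```

For 12:02:00 the naive bound is 20100101119700, which sorts after every timestamp from the 11 o'clock hour, so the window effectively covers only the two minutes since 12:00:00. Near the top of an hour the window can therefore shrink by more than a minute.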


Comments (3)

守不住的情 2024-10-05 18:41:53

I'm not quite clear on the result you are trying to get, even with your examples. Perhaps something with rounding and group by.

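-- Bucket uploads into 500-unit intervals; keep the newest id and date per owner and bucket.
SELECT MAX(id) AS max_id, owner, ROUND(date / 500) * 500 AS date_interval, MAX(date) AS last_date
FROM uploads
GROUP BY date_interval, owner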

You may want to use FLOOR or CEILING instead of ROUND, depending on what you want.
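As a rough way to try this out, the sketch below runs the bucketing query against an in-memory SQLite database through Python's `sqlite3` module. It assumes SQLite rather than the asker's actual DBMS, loads only the five rows shown in the question (the elided rows are left out), and uses integer division as a FLOOR-style variant of the rounding.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE uploads (id INTEGER PRIMARY KEY, owner TEXT, date INTEGER)")
conn.executemany(
    "INSERT INTO uploads VALUES (?, ?, ?)",
    [
        (0, "foo", 20100101120000),
        (1, "bar", 20100101120300),
        (2, "foo", 20100101120400),
        (3, "bar", 20100101120600),
        (6, "foo", 20100101120800),
    ],
)

# Integer division truncates, so (date / 500) * 500 is the start of the bucket.
rows = conn.execute(
    """
    SELECT MAX(id) AS max_id,
           owner,
           (date / 500) * 500 AS date_interval,
           MAX(date) AS last_date
    FROM uploads
    GROUP BY date_interval, owner
    ORDER BY last_date DESC
    """
).fetchall()

for row in rows:
    print(row)
```

On this sample the query returns four rows (ids 6, 3, 2 and 1) rather than the three rows in the question, because fixed buckets split foo's uploads at the 20100101120500 boundary.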

冬天旳寂寞 2024-10-05 18:41:53

Standard SQL doesn't deal with intervals very well.
You are going to need to do a self-join of the table to compare dates of different tuples.
That way, you can easily find all pairs of tuples of which the dates are no more than 500 apart.
However, you really want to cluster the dates in sets no more than 500 apart - and that can't be expressed in SQL at all, as far as I know.
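The self-join idea can be sketched as follows, again using an in-memory SQLite database with only the five rows shown in the question. Restricting pairs to the same owner is an assumption made for this illustration. The second query is one possible follow-up, not something the answer proposes: it keeps uploads that have no earlier upload by the same owner within 500 units.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE uploads (id INTEGER PRIMARY KEY, owner TEXT, date INTEGER)")
conn.executemany(
    "INSERT INTO uploads VALUES (?, ?, ?)",
    [
        (0, "foo", 20100101120000),
        (1, "bar", 20100101120300),
        (2, "foo", 20100101120400),
        (3, "bar", 20100101120600),
        (6, "foo", 20100101120800),
    ],
)

# Self-join: every pair of uploads by the same owner at most 500 units apart.
pairs = conn.execute(
    """
    SELECT a.id, b.id
    FROM uploads AS a
    JOIN uploads AS b
      ON a.owner = b.owner
     AND a.date < b.date
     AND b.date - a.date <= 500
    ORDER BY a.id, b.id
    """
).fetchall()

# Uploads with no earlier upload by the same owner within 500 units.
starts = [
    row[0]
    for row in conn.execute(
        """
        SELECT u.id
        FROM uploads AS u
        WHERE NOT EXISTS (
            SELECT 1
            FROM uploads AS p
            WHERE p.owner = u.owner
              AND p.date < u.date
              AND u.date - p.date <= 500
        )
        ORDER BY u.date DESC
        """
    )
]

print(pairs)
print(starts)
```

Note that chained uploads merge here: id 6 follows id 2 by 400 units, and id 2 follows id 0 by 400 units, so all three foo uploads count as a single run even though the first and last are 800 units apart.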

What you can do is something quite similar: split the total time interval into fixed 500-unit ranges, and then cluster all tuples in the table based on the interval they're in. For that, you first need a table or query result with the start times of the intervals; this can be created using a SQL query on your table and a function that either "rounds off" a timestamp to the starting time in its interval, or computes its interval sequence number. Then as a second step you can join the table with that result to group its timestamps according to their corresponding start time. I can't give the SQL because it's DBMS-dependent, and I certainly can't tell you if this is the best way of accomplishing what you want in your situation.

我还不会笑 2024-10-05 18:41:53

Use an inline view? e.g. something like

SELECT u1.* 
FROM uploads u1,
(SELECT date 
    FROM uploads u2
    WHERE u2.owner='foo') datum_points
WHERE u1.date BETWEEN datum_points.date
    AND DATE_ADD(datum_points.date, INTERVAL 5 MINUTE)

should return all the posts made within 5 minutes of 'foo' making a post.
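A runnable approximation of the inline view, assuming SQLite and the five rows shown in the question. Since SQLite has no `DATE_ADD`, this sketch swaps in the question's own integer arithmetic (`date + 500`) for the five-minute interval.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE uploads (id INTEGER PRIMARY KEY, owner TEXT, date INTEGER)")
conn.executemany(
    "INSERT INTO uploads VALUES (?, ?, ?)",
    [
        (0, "foo", 20100101120000),
        (1, "bar", 20100101120300),
        (2, "foo", 20100101120400),
        (3, "bar", 20100101120600),
        (6, "foo", 20100101120800),
    ],
)

QUERY = """
    SELECT {columns}
    FROM uploads AS u1
    JOIN (SELECT date FROM uploads WHERE owner = 'foo') AS datum_points
      ON u1.date BETWEEN datum_points.date AND datum_points.date + 500
    ORDER BY u1.id
"""

# Plain join: an upload appears once per 'foo' upload whose window contains it.
ids = [row[0] for row in conn.execute(QUERY.format(columns="u1.id"))]

# DISTINCT collapses the repeats.
unique_ids = [row[0] for row in conn.execute(QUERY.format(columns="DISTINCT u1.id"))]

print(ids)
print(unique_ids)
```

On this sample the plain join lists ids 2 and 6 twice because the windows of consecutive foo uploads overlap, so `DISTINCT` (or a further grouping step) is probably needed before the result can feed a timeline.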
