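Group by date in MongoDB
I am working on a project in which I am tracking the number of clicks on a topic.
I am using MongoDB, and I have to group the number of clicks by date (I want to group the data for 15 days).
I have data stored in MongoDB in the following format: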
{
"_id" : ObjectId("4d663451d1e7242c4b68e000"),
"date" : "Mon Dec 27 2010 18:51:22 GMT+0000 (UTC)",
"topic" : "abc",
"time" : "18:51:22"
}
{
"_id" : ObjectId("4d6634514cb5cb2c4b69e000"),
"date" : "Mon Dec 27 2010 18:51:23 GMT+0000 (UTC)",
"topic" : "bce",
"time" : "18:51:23"
}
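I want to group the number of clicks on topic abc by day (for 15 days). I know how to group, but how do I group by the date stored in my database?
I am looking for a result in the following format: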
[
{
"date" : "date in log",
"click" : 9
},
{
"date" : "date in log",
"click" : 19
},
]
I have written code, but it works only if the date is a string (the code is here: http:// astebin.com/2wm1n1ix). Please guide me on how to group it.
There are already many answers to this question, but I wasn't happy with any of them. MongoDB has improved over the years, and there are now easier ways to do it. The answer by Jonas Tomanga gets it right, but is a bit too complex.
If you are using MongoDB 3.0 or later, here's how you can group by date. I start with the `$match` aggregation stage because the author also asked how to limit the results.
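The code that went with this answer is not preserved here, so the following is only a sketch of the approach it describes: a `$match` stage to limit the input, then a `$group` keyed on the day. It assumes `date` is stored as a BSON Date (not a string, as in the question) and uses `$dateToString`, which is available from MongoDB 3.0. The collection name `clicks` and the helper `buildDailyClickPipeline` are illustrative names, not from the original.

```javascript
// Assumption: `date` is a BSON Date and the collection is named `clicks`.
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: builds the pipeline for one topic and a window of `days`.
function buildDailyClickPipeline(topic, now = new Date(), days = 15) {
  const since = new Date(now.getTime() - days * DAY_MS);
  return [
    // Limit the input to one topic and to the last `days` days.
    { $match: { topic: topic, date: { $gte: since } } },
    // One group per calendar day (UTC), keyed as YYYY-MM-DD.
    { $group: {
        _id: { $dateToString: { format: "%Y-%m-%d", date: "$date" } },
        click: { $sum: 1 }
    } },
    { $sort: { _id: 1 } }
  ];
}

const pipeline = buildDailyClickPipeline("abc", new Date("2011-01-05T00:00:00Z"));
console.log(JSON.stringify(pipeline, null, 2));
// In the mongo shell this would be run as: db.clicks.aggregate(pipeline)
```

The key here is a string, which is convenient for display; if you need a real date in the output, the date-math approach in a later answer may fit better.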
To fetch data grouped by date in MongoDB
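The snippet for this answer is missing, so this is just one possible shape for it, not the original code: group on the year, month and day of month extracted from the date. It assumes `date` is a BSON Date; `clicks` is an illustrative collection name.

```javascript
// Assumption: `date` is a BSON Date; `clicks` is an illustrative collection name.
const groupByCalendarDay = [
  { $group: {
      _id: {
        year: { $year: "$date" },
        month: { $month: "$date" },
        day: { $dayOfMonth: "$date" }
      },
      click: { $sum: 1 }
  } },
  { $sort: { "_id.year": 1, "_id.month": 1, "_id.day": 1 } }
];

console.log(JSON.stringify(groupByCalendarDay, null, 2));
// mongo shell usage: db.clicks.aggregate(groupByCalendarDay)
```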
Late answer, but for the record (for anyone else who comes to this page): you'll need to use the `keyf` argument instead of `key`, since your key is actually going to be a function of the date on the event (i.e. the "day" extracted from the date) and not the date itself. This should do what you're looking for:
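The original snippet is not preserved, so here is a sketch of what a `keyf`-based grouping could look like. Note that `db.collection.group()` was removed in MongoDB 4.2, so this only applies to older servers. To keep the example runnable on its own, `groupLocally` is a hypothetical stand-in that mimics how `group()` applies `keyf` and `reduce`; it is not a MongoDB API, and its `cond` is a plain function where `group()` would take a query document.

```javascript
// keyf: derive the grouping key (the UTC day) from each document.
function keyf(doc) {
  const d = new Date(doc.date); // parses the string format shown in the question
  return { day: d.toISOString().slice(0, 10) };
}

// reduce: called once per document with the running aggregate for its key.
function reduce(doc, out) {
  out.click += 1;
}

// Hypothetical local stand-in for group(), for illustration only.
function groupLocally(docs, spec) {
  const groups = new Map();
  for (const doc of docs) {
    if (spec.cond && !spec.cond(doc)) continue;
    const key = spec.keyf(doc);
    const id = JSON.stringify(key);
    if (!groups.has(id)) groups.set(id, Object.assign({}, key, spec.initial));
    spec.reduce(doc, groups.get(id));
  }
  return Array.from(groups.values());
}

// Sample documents shaped like the ones in the question (values are made up).
const docs = [
  { topic: "abc", date: "Mon Dec 27 2010 18:51:22 GMT+0000 (UTC)" },
  { topic: "abc", date: "Mon Dec 27 2010 19:02:10 GMT+0000 (UTC)" },
  { topic: "bce", date: "Mon Dec 27 2010 18:51:23 GMT+0000 (UTC)" },
  { topic: "abc", date: "Tue Dec 28 2010 08:00:00 GMT+0000 (UTC)" }
];

const result = groupLocally(docs, {
  keyf: keyf,
  cond: (doc) => doc.topic === "abc",
  initial: { click: 0 },
  reduce: reduce
});
console.log(result);
// On a legacy server the equivalent shell call would be:
// db.clicks.group({ keyf: keyf, cond: { topic: "abc" }, initial: { click: 0 }, reduce: reduce })
```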
For more information, take a look at MongoDB's doc page on aggregation and group: http://www.mongodb.org/display/DOCS/Aggregation#Aggregation-Group
This can help
I haven't worked that much with MongoDB yet, so I am not completely sure. But aren't you able to use full JavaScript?
So you could parse your date with the JavaScript `Date` class, create the date for the day from it, and use that as a key into an "out" property. Always add one if the key already exists; otherwise create it with value = 1 (first click). Below is your code with an adapted reduce function (untested code!):
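Neither the asker's code nor the adapted version survives on this page, so the following is only a sketch of the reduce function as described: parse the date, derive a key for the day, and increment or create the counter under an "out" property. The property name `clicks` and the sample documents are made up for illustration.

```javascript
// Reduce function in the style of group(): `out` is the running aggregate.
function reduce(doc, out) {
  const parsed = new Date(doc.date);
  // Key for the day in UTC, e.g. "2010-12-27".
  const day = parsed.toISOString().slice(0, 10);
  if (out.clicks[day]) {
    out.clicks[day] += 1; // key already exists: add one
  } else {
    out.clicks[day] = 1;  // first click on that day
  }
}

// Made-up sample documents in the question's format.
const sampleDocs = [
  { topic: "abc", date: "Mon Dec 27 2010 18:51:22 GMT+0000 (UTC)" },
  { topic: "abc", date: "Mon Dec 27 2010 21:15:00 GMT+0000 (UTC)" },
  { topic: "abc", date: "Tue Dec 28 2010 09:30:00 GMT+0000 (UTC)" }
];

const out = { clicks: {} };
sampleDocs.forEach((doc) => reduce(doc, out));
console.log(out.clicks);
```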
Thanks to @mindthief, your answer helped solve my problem today. The function below makes grouping by day a little easier; I hope it can help others.
Another late answer, but still. If you want to do it in only one iteration and get the number of clicks grouped by date and topic, you can use the following code:
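The code for this answer is missing too. As a rough sketch of the idea (one pass, counting per date and topic), here is a plain JavaScript version. The function name `countByDayAndTopic` and the sample data are illustrative, and the integer day key follows the `valueOf()` hint from the same answer.

```javascript
// One pass over the documents, counting clicks per (day, topic) pair.
function countByDayAndTopic(docs) {
  const counts = {};
  for (const doc of docs) {
    const d = new Date(doc.date);
    // Integer key for the day: milliseconds at 00:00 UTC, obtained via valueOf().
    const dayKey = new Date(
      Date.UTC(d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDate())
    ).valueOf();
    const key = dayKey + "|" + doc.topic;
    if (!counts[key]) {
      counts[key] = { date: dayKey, topic: doc.topic, click: 0 };
    }
    counts[key].click += 1;
  }
  return Object.values(counts);
}

// Made-up sample documents in the question's format.
const clickDocs = [
  { topic: "abc", date: "Mon Dec 27 2010 18:51:22 GMT+0000 (UTC)" },
  { topic: "bce", date: "Mon Dec 27 2010 18:51:23 GMT+0000 (UTC)" },
  { topic: "abc", date: "Mon Dec 27 2010 20:00:00 GMT+0000 (UTC)" },
  { topic: "abc", date: "Tue Dec 28 2010 07:45:00 GMT+0000 (UTC)" }
];

const byDayAndTopic = countByDayAndTopic(clickDocs);
console.log(byDayAndTopic);
```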
Also, if you would like to optimize the query as suggested, you can use an integer value for the date (hint: use `valueOf()`) as the key instead of the string, though in my examples the speed was the same.
Furthermore, it's always wise to check the MongoDB docs regularly, because they keep adding new features. For example, with the new aggregation framework, which will be released in version 2.2, you can achieve the same results much more easily: http://docs.mongodb.org/manual/applications/aggregation/
If you want a Date object returned directly
Then, instead of applying the Date Aggregation Operators (https://docs.mongodb.org/manual/reference/operator/aggregation-date/), apply "date math" to round the date object. This is often desirable because all drivers represent a BSON Date in a form that is commonly used for date manipulation in every language where that is possible:
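The pipeline itself was lost in extraction. The sketch below rebuilds it from the description in this answer (subtract the epoch, take the remainder with `$mod`, subtract it, then `$add` the epoch back), so treat it as a reconstruction rather than the author's exact code. It assumes `date` is a BSON Date; `clicks` is an illustrative collection name.

```javascript
const ONE_DAY_MS = 1000 * 60 * 60 * 24;
const epoch = new Date(0);

// Milliseconds since the epoch, as a number: Date minus Date.
const msSinceEpoch = { $subtract: ["$date", epoch] };

const roundToDayPipeline = [
  { $group: {
      _id: {
        // number + Date -> Date, so the key comes back as a Date at 00:00 UTC
        $add: [
          { $subtract: [msSinceEpoch, { $mod: [msSinceEpoch, ONE_DAY_MS] }] },
          epoch
        ]
      },
      click: { $sum: 1 }
  } }
];

console.log(JSON.stringify(roundToDayPipeline, null, 2));
// mongo shell usage: db.clicks.aggregate(roundToDayPipeline)
```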
Or if, as is implied in the question, the grouping interval required is "buckets" of 15 days, then simply apply that to the numeric value in `$mod`.
The basic math applied is that when you `$subtract` two `Date` objects, the result returned is the numeric difference in milliseconds. So the epoch is represented by `Date(0)` as the base for conversion, in whatever language constructor you have.
With a numeric value, the "modulo" (`$mod`) is applied to round the date (subtract the remainder of the division) to the required interval, being either `1000 * 60 * 60 * 24` milliseconds for one day or `1000 * 60 * 60 * 24 * 15` for 15 days.
So it's flexible to whatever interval you require.
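To show the "any interval" point, here is a sketch that builds the same rounding expression for a given interval and uses it for 15-day buckets. The helper name `roundDateExpr` and the collection name `clicks` are illustrative, and `date` is assumed to be a BSON Date.

```javascript
const MS_PER_DAY = 1000 * 60 * 60 * 24;

// Hypothetical helper: builds the rounding expression for any interval in milliseconds.
function roundDateExpr(field, intervalMs) {
  const ms = { $subtract: [field, new Date(0)] };
  return {
    $add: [{ $subtract: [ms, { $mod: [ms, intervalMs] }] }, new Date(0)]
  };
}

const fifteenDayPipeline = [
  { $group: {
      _id: roundDateExpr("$date", 15 * MS_PER_DAY),
      click: { $sum: 1 }
  } },
  { $sort: { _id: 1 } }
];

console.log(JSON.stringify(fifteenDayPipeline, null, 2));
// mongo shell usage: db.clicks.aggregate(fifteenDayPipeline)
```

Passing `MS_PER_DAY` instead of `15 * MS_PER_DAY` gives the one-day rounding again.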
By the same token, an `$add` operation between a numeric value and a `Date` object returns a `Date` object equivalent to the milliseconds value of both combined (the epoch is 0, therefore 0 plus the difference is the converted date). This is easy to represent and reproduce in a short listing.
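The original listing is not preserved. As a stand-in, this sketch reproduces the same three steps on plain JavaScript dates so the arithmetic can be checked without a server. The function name `roundDate` and the sample dates are made up.

```javascript
const DAY_IN_MS = 1000 * 60 * 60 * 24;
const EPOCH = new Date(0);

// Same steps as $subtract / $mod / $add, on plain numbers and Dates.
function roundDate(date, intervalMs) {
  const ms = date.getTime() - EPOCH.getTime(); // Date - Date -> milliseconds
  const rounded = ms - (ms % intervalMs);      // subtract the remainder
  return new Date(rounded + EPOCH.getTime());  // number + epoch -> Date
}

// Made-up sample dates.
const samples = [
  new Date("2010-12-27T18:51:22Z"),
  new Date("2011-01-05T10:00:00Z"),
  new Date("2011-01-11T00:00:00Z")
];

for (const d of samples) {
  console.log(
    d.toISOString(),
    "day:", roundDate(d, DAY_IN_MS).toISOString(),
    "15-day bucket:", roundDate(d, 15 * DAY_IN_MS).toISOString()
  );
}
```

The first two sample dates fall in the same 15-day bucket and the third starts the next one, because the buckets are counted from the epoch rather than from the first click.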
Running the second example with 15-day intervals gives a similar distribution, which depends on the current date when the listing is run; the 15-day intervals themselves stay consistent because they are counted from the epoch date.
Using the "math" method is a bit easier to tune, especially if you want to adjust time periods for different timezones in aggregation output, where you can similarly adjust numerically by adding or subtracting the numeric difference from UTC.
Of course, that is a good solution. Aside from that, you can group dates by day as strings (as that answer proposes), or you can get the beginning of each day by projecting the date field (in aggregation) like this:
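The projection itself is missing from this page. One way to write it, offered as a sketch and not necessarily the author's version, is to subtract the time-of-day part (hours, minutes, seconds, milliseconds) from the date. It assumes `date` is a BSON Date; `startOfDayUTC` is a hypothetical plain-JavaScript equivalent included only so the idea can be checked locally.

```javascript
// Aggregation stage: date minus its time-of-day in milliseconds = start of that day (UTC).
const startOfDayProjection = {
  $project: {
    topic: 1,
    date: {
      $subtract: [
        "$date",
        { $add: [
            { $multiply: [{ $hour: "$date" }, 60 * 60 * 1000] },
            { $multiply: [{ $minute: "$date" }, 60 * 1000] },
            { $multiply: [{ $second: "$date" }, 1000] },
            { $millisecond: "$date" }
        ] }
      ]
    }
  }
};

// Hypothetical local equivalent of the same subtraction.
function startOfDayUTC(d) {
  const sinceMidnight =
    d.getUTCHours() * 60 * 60 * 1000 +
    d.getUTCMinutes() * 60 * 1000 +
    d.getUTCSeconds() * 1000 +
    d.getUTCMilliseconds();
  return new Date(d.getTime() - sinceMidnight);
}

console.log(JSON.stringify(startOfDayProjection, null, 2));
console.log(startOfDayUTC(new Date("2010-12-27T18:51:22Z")).toISOString());
```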
It gives you each date truncated to the beginning of its day. It has some pluses: you can work with your days as a date type (not a number or string), you can use all of the date aggregation operators in the following aggregation operations, and you get a date type in the output.
New answer using Mongo aggregation framework
After this question was asked and answered, 10gen released MongoDB version 2.2 with an aggregation framework, which is now the better way to do this sort of query. This query is a little challenging because you want to group by date and the values stored are timestamps, so you have to do something to convert the timestamps to dates that match. For the purposes of example, I will just write a query that gets the right counts.
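The query is not preserved here. Going by the mention of "day of year" later in this answer, it probably grouped on `$dayOfYear`; the sketch below does that, assuming `date` is a BSON Date and using `clicks` as an illustrative collection name. `dayOfYearUTC` is a hypothetical helper that shows which key a given date would land under.

```javascript
// Group by day of the year (1..366) and count the documents in each group.
const clicksPerDayOfYear = [
  { $group: { _id: { $dayOfYear: "$date" }, click: { $sum: 1 } } }
];

// Hypothetical local helper: the day-of-year value for a date, in UTC.
function dayOfYearUTC(d) {
  const startOfYear = Date.UTC(d.getUTCFullYear(), 0, 1);
  const startOfDay = Date.UTC(d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDate());
  return Math.floor((startOfDay - startOfYear) / (24 * 60 * 60 * 1000)) + 1;
}

console.log(JSON.stringify(clicksPerDayOfYear));
console.log(dayOfYearUTC(new Date("2010-12-27T18:51:22Z")));
// mongo shell usage: db.clicks.aggregate(clicksPerDayOfYear)
```

Grouping on day of year alone merges the same day from different years, so it only gives the right counts when the matched range stays within one year.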
This returns one result document per day of the year, each with the number of clicks on that day.
You need to use `$match` to limit the query to the date range you are interested in and `$project` to rename `_id` to `date`. How you convert the day of year back to a date is left as an exercise for the reader. :-)
10gen has a handy SQL to Mongo Aggregation conversion chart worth bookmarking. There is also a specific article on date aggregation operators.
Getting a little fancier, you can use a query that gets you the latest 15 days and returns some datetime within each day in the `date` field.
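The fancier query and its sample output are both missing, so this is a guess at its shape based on the description: keep one datetime per day with `$first`, sort newest first, and keep 15 groups. Grouping on year as well as day of year is my own addition to avoid merging days across years; `clicks` is an illustrative collection name and `date` is assumed to be a BSON Date.

```javascript
const latest15Days = [
  { $match: { topic: "abc" } },
  { $group: {
      _id: { year: { $year: "$date" }, dayOfYear: { $dayOfYear: "$date" } },
      // Some datetime that falls within the day (whichever document is seen first).
      date: { $first: "$date" },
      click: { $sum: 1 }
  } },
  { $sort: { date: -1 } },  // newest day first
  { $limit: 15 },           // the latest 15 days that have clicks
  { $project: { _id: 0, date: 1, click: 1 } }
];

console.log(JSON.stringify(latest15Days, null, 2));
// mongo shell usage: db.clicks.aggregate(latest15Days)
```

Note that `$limit: 15` keeps the 15 most recent days that have at least one click, which is not the same as the last 15 calendar days when some days have no clicks.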