CouchDB modelling - filtering by time and grouping data

Posted 2025-01-04 05:42:12

I'm trying to build up my understanding of CouchDB and how to model data for some real-world scenarios. I've done as many 'get me blog posts by date' examples as I can for now ;)

Given documents like so:

{
    "_id": "couch1",
    "_rev": "2-338d0a592ad1e5570000002b00000000",
    "eventType": "event1",
    "date": 1328805860000
}

{
    "_id": "couch2",
    "_rev": "1-1e0315c2e1ca7f5f0000002b00000000",
    "eventType": "event1",
    "date": 1328133600000
}

{
    "_id": "couch3",
    "_rev": "1-154cd416b78cb2ef0000002b00000000",
    "eventType": "event2",
    "date": 1325434920000
}

Where the date is an epoch timestamp, would it be possible to ask Couch to make a view where you ask for all "events" that happened between two timestamps and then group that data by "eventType"?

So using the above and assuming the timestamps passed in encompass those documents - we'd want to see output:

"event1": 2
"event2": 1

Further info I have obtained

I'm aware that Couch will sort by key so if I wanted a 'top 10' then that would be a second phase but I can handle that.

So the core problem here is that you are filtering by one column but then grouping by another?

If we use the following map function:

function (doc) {
  emit([doc.date, doc.eventType], doc.eventType);
}

with a count reduce function, we see that because the timestamps are essentially unique, Couch cannot group the rows and every key has the value 1.

So you can change the map function to the following:

function (doc) {
  emit([doc.eventType, doc.date], doc.eventType);
}

And then change the group level to 1, which will group correctly by event type. But your data cannot then be sliced by time, because the primary ordering is by the event name, meaning that time ordering is now broken?
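
To make the trade-off concrete, here is a minimal local sketch in plain JavaScript (it does not talk to CouchDB) that imitates how a count reduce groups rows by key prefix. The helper `groupCount` and the `docs` array are assumptions made for illustration; `groupLevel` only mimics the effect of the group_level query parameter.

```javascript
// Sample documents from the question (only the fields the view uses).
var docs = [
  { _id: "couch1", eventType: "event1", date: 1328805860000 },
  { _id: "couch2", eventType: "event1", date: 1328133600000 },
  { _id: "couch3", eventType: "event2", date: 1325434920000 }
];

// Hypothetical helper: count the keys that share their first `groupLevel` elements.
function groupCount(keys, groupLevel) {
  var counts = {};
  keys.forEach(function (key) {
    var prefix = JSON.stringify(key.slice(0, groupLevel));
    counts[prefix] = (counts[prefix] || 0) + 1;
  });
  return counts;
}

var byDateFirst = docs.map(function (d) { return [d.date, d.eventType]; });
var byTypeFirst = docs.map(function (d) { return [d.eventType, d.date]; });

// Full-key grouping on [date, eventType]: every key is unique, so every count is 1.
console.log(groupCount(byDateFirst, 2));
// Level-1 grouping on [eventType, date]: one count per event type, over all time.
console.log(groupCount(byTypeFirst, 1));
```

The second call gives the desired totals only because no time range is applied; a startkey/endkey range on the [eventType, date] key would have to be issued separately for each event type.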

Do people have any war stories on this? Does this need to be done with re-reduce?

Many thanks in advance to anyone taking time to read this

Eggsy


Comments (3)

别把无礼当个性 2025-01-11 05:42:12

I would suggest a view/list Combo:

View:

"eventByDate": 
{
  "map": "function(doc) { emit(doc.date, doc.eventType);}"
}

List:

"test": "function(head, req) {
  /* Count rows per event type; row.value is the eventType emitted by the view. */
  var counts = {};
  var row;
  while ((row = getRow())) {
    counts[row.value] = (counts[row.value] || 0) + 1;
  }
  /* Collect the counts into an array and send it as valid JSON. */
  var result = [];
  for (var curEvent in counts) {
    result.push({ event: curEvent, count: counts[curEvent] });
  }
  send(JSON.stringify(result));
}"

Result:

[
{"event":"event2","count":1},
{"event":"event1","count":2}
]

But you have to order by count manually (I did not implement it), either here in the list function or in your backend.
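
The manual ordering step could look roughly like the sketch below, which sorts rows shaped like the list output by count, highest first. The helper name `sortByCount` is hypothetical, and the sample `rows` simply mirror the result shown above; the same logic could run inside the list function before sending, or in the backend.

```javascript
// Hypothetical helper: order [{event, count}] rows by count, highest first.
function sortByCount(rows) {
  // slice() copies the array so the caller's input is left untouched.
  return rows.slice().sort(function (a, b) {
    return b.count - a.count;
  });
}

// Rows in the shape produced by the list function.
var rows = [
  { event: "event2", count: 1 },
  { event: "event1", count: 2 }
];

// Keep at most the ten most frequent event types.
var top10 = sortByCount(rows).slice(0, 10);
console.log(top10);
```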

ぽ尐不点ル 2025-01-11 05:42:12

Do you have a fixed number of event types? Is it a small, relatively static list?

If not, skip the rest of my answer.

If so, read on for a quick and dirty option.

You could change your map.js function to call emit() with a different value depending on the event type.

if (doc.eventType == 'event1') { emit(doc.date, {'eventType1': 1}); }

Repeat this for each event type. Or, if you can change your documents to have eventType1, eventType2, etc. as fields with 1 as the value, you can skip all the fancy if...then nonsense and just:

emit(doc.date, doc);

Then have your reduce.js function loop through the rows and add them to an object that would eventually look something like:

{eventType1: 25, eventType2: 2, ...}

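function (keys, values, rereduce) {
  var totals = { eventType1: 0, eventType2: 0 };
  for (var i = 0; i < values.length; i++) {
    // Adding the stored number (not a flat 1) keeps rereduce passes correct.
    totals.eventType1 += values[i].eventType1 || 0;
    totals.eventType2 += values[i].eventType2 || 0;
    ...
  }
  return totals;
}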

Query that view without the group parameter (or with group=false) and you should get a single record with a null key containing your event types and their counts.

I have this working on a similar type of request. But my "eventType" list never changes.

一枫情书 2025-01-11 05:42:12

You can do as @user791770 shows, but avoid hard-coding the list of event types by changing the code slightly.

Map:

function(doc) {
  // One-entry object such as {"event1": 1}, keyed by the document's timestamp.
  var data = {};
  data[doc.eventType] = 1;
  emit(doc.date, data);
}

Reduce:

function(keys, values, rereduce) {
  // Merge the per-type counts; rereduce inputs have the same shape as map values.
  var data = {};
  for (var i = 0; i < values.length; i++) {
    for (var field in values[i]) {
      if (typeof data[field] == 'undefined') data[field] = 0;
      data[field] += values[i][field];
    }
  }
  return data;
}
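
As a rough way to check this approach without a running server, the sketch below replays the map and reduce steps locally over the three documents from the question. It assumes the question's field names (eventType and date); `mapDoc`, `reduceCounts` and `queryRange` are hypothetical stand-ins, with `queryRange` imitating a view query that passes startkey and endkey and leaves grouping off.

```javascript
// Sample documents from the question (only the fields the view uses).
var docs = [
  { _id: "couch1", eventType: "event1", date: 1328805860000 },
  { _id: "couch2", eventType: "event1", date: 1328133600000 },
  { _id: "couch3", eventType: "event2", date: 1325434920000 }
];

// Map step: one row per document, keyed by date, with a one-entry count object.
function mapDoc(doc) {
  var data = {};
  data[doc.eventType] = 1;
  return { key: doc.date, value: data };
}

// Reduce step: merge count objects. The same code serves rereduce because
// earlier reduce outputs have the same shape as the map values.
function reduceCounts(keys, values, rereduce) {
  var data = {};
  for (var i = 0; i < values.length; i++) {
    for (var field in values[i]) {
      data[field] = (data[field] || 0) + values[i][field];
    }
  }
  return data;
}

// Hypothetical stand-in for querying the view with startkey/endkey, ungrouped.
function queryRange(startkey, endkey) {
  var values = docs.map(mapDoc)
    .filter(function (row) { return row.key >= startkey && row.key <= endkey; })
    .map(function (row) { return row.value; });
  return reduceCounts(null, values, false);
}

console.log(queryRange(1325376000000, 1328900000000)); // range covering all three documents
console.log(queryRange(1328000000000, 1328900000000)); // range covering only the two event1 documents
```

Against a real database the equivalent request would be along the lines of GET /db/_design/events/_view/countsByDate?startkey=1325376000000&endkey=1328900000000, where db, events and countsByDate are placeholder names; the time filter comes from the key range, and the grouping by event type happens inside the reduce.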