Don't know how to create a specific MapReduce in CouchDB
I've got 3 types of documents in my db:
{
param: "a",
timestamp: "t"
} (Type 1)
{
param: "b",
partof: "a"
} (Type 2)
{
param: "b",
timestamp: "x"
} (Type 3)
(I can't alter the layout...;-( )
Type 1 defines a start timestamp, it's like the start event. A Type 1 is connected to several Type 3 docs by Type 2 documents.
I want to get the latest Type 3 (highest timestamp) and the corresponding type 1 document.
How may I organize my Map/Reduce?
Comments (2)
Easy. For highly relational data, use a relational database.
As user jhs stated before me, your data is relational, and if you can't change it, then you might want to reconsider using CouchDB.
By relational we mean that each "type 1" or "type 3" document in your data "knows" only about itself, and "type 2" documents hold the knowledge about the relations between documents of the other types. With CouchDB, you can only index by fields in the documents themselves, and go one level deeper when querying with include_docs=true. Thus, what you asked for cannot be achieved with a single CouchDB query, because some of the desired data is two levels away from the requested document.
Here is a two-query solution:
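The view definitions this answer relies on are not shown here, so the following is only a minimal sketch of what they might look like. The function names, the small emit stand-in, and the use of comparable (for example numeric) timestamps are assumptions for illustration. In CouchDB itself the map and reduce bodies would live in a design document and emit is provided by the server. The sketch emits every document that has a timestamp, because the layout in the question shows no field that tells Type 1 from Type 3.

```javascript
// Stand-in for CouchDB's built-in emit(), so the sketch runs on its own.
const rows = [];
function emit(key, value) {
  rows.push({ key, value });
}

// View "param-by-timestamp" (map): one row per document with a timestamp.
// The value carries both fields so the reduce can return them together.
function mapParamByTimestamp(doc) {
  if (doc.timestamp !== undefined && doc.param !== undefined) {
    emit(doc.timestamp, [doc.timestamp, doc.param]);
  }
}

// View "param-by-timestamp" (reduce): keep the [timestamp, param] pair
// with the highest timestamp. Reduce output has the same shape as the
// mapped values, so the same logic also covers the rereduce step.
function reduceLatest(keys, values, rereduce) {
  return values.reduce((best, v) => (v[0] > best[0] ? v : best));
}

// View "partof-by-param" (map): index the linking (Type 2) documents
// by their param, with the referenced partof as the value.
function mapPartofByParam(doc) {
  if (doc.partof !== undefined) {
    emit(doc.param, doc.partof);
  }
}
```

With views shaped like this, the reduced result holds the latest timestamp in value[0] and its param in value[1], which matches the two-step lookup described in this answer.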
You query it first with param-by-timestamp?reduce=true to get the latest timestamp in value[0] and its corresponding param in value[1], and then query again with partof-by-param?key="<what you got in previous query>". If you need to fetch the full documents together with the timestamp and param, then you will have to play with include_docs=true and provide the correct _doc values.
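The two queries could be driven from a client roughly as follows. This is a sketch under stated assumptions: the database name mydb, the design document name events, and the fetchJson helper passed in by the caller are all hypothetical, and the views are assumed to return the [timestamp, param] pair and the partof value as described above.

```javascript
// Hypothetical location of the views; adjust database and design doc names.
const BASE = "http://localhost:5984/mydb/_design/events/_view";

// Build a view URL. CouchDB expects query values such as key to be
// JSON-encoded, so each value is stringified and then URL-escaped.
function viewUrl(view, params = {}) {
  const qs = Object.entries(params)
    .map(([k, v]) => `${k}=${encodeURIComponent(JSON.stringify(v))}`)
    .join("&");
  return `${BASE}/${view}${qs ? "?" + qs : ""}`;
}

// fetchJson is supplied by the caller: (url) => Promise of parsed JSON.
async function latestWithParent(fetchJson) {
  // Query 1: the reduced view yields a single row with [timestamp, param].
  const first = await fetchJson(viewUrl("param-by-timestamp", { reduce: true }));
  if (first.rows.length === 0) return null; // no documents with a timestamp
  const [timestamp, param] = first.rows[0].value;

  // Query 2: look up the linking documents for that param.
  const second = await fetchJson(viewUrl("partof-by-param", { key: param }));
  return { timestamp, param, partof: second.rows.map((r) => r.value) };
}
```

In an environment with a global fetch, a caller might pass (url) => fetch(url).then((r) => r.json()) as fetchJson. Fetching the full Type 1 document would still need a further lookup or the include_docs=true approach mentioned above.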