如何建立“标签” 支持使用CouchDB吗？

发布于 2024-07-07 02:28:44 字数 411 浏览 16 评论 0原文

我使用以下视图函数来迭代数据库中的所有项目（以便找到标签），但我认为如果数据集很大，性能会很差。还有其他方法吗？

def by_tag(tag):
return  '''
        function(doc) {
            if (doc.tags.length > 0) {
                for (var tag in doc.tags) {
                    if (doc.tags[tag] == "%s") {
                        emit(doc.published, doc)
                    }
                }
            }
        };
        ''' % tag

原文

I'm using the following view function to iterate over all items in the database (in order to find a tag), but I think the performance is very poor if the dataset is large.
Any other approach?

def by_tag(tag):
return  '''
        function(doc) {
            if (doc.tags.length > 0) {
                for (var tag in doc.tags) {
                    if (doc.tags[tag] == "%s") {
                        emit(doc.published, doc)
                    }
                }
            }
        };
        ''' % tag

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

总以为 2024-07-14 02:28:44

免责声明：我没有对此进行测试，也不知道它是否可以表现得更好。

创建单个 perm 视图：

function(doc) {
  for (var tag in doc.tags) {
    emit([tag, doc.published], doc)
  }
};

并查询
_view/your_view/all?startkey=['your_tag_here']&endkey=['your_tag_here', {}]

生成的 JSON 结构会略有不同，但您仍然会得到发布日期排序。

Disclaimer: I didn't test this and don't know if it can perform better.

Create a single perm view:

function(doc) {
  for (var tag in doc.tags) {
    emit([tag, doc.published], doc)
  }
};

And query with
_view/your_view/all?startkey=['your_tag_here']&endkey=['your_tag_here', {}]

Resulting JSON structure will be slightly different but you will still get the publish date sorting.

回复收藏 0 原文

七月上 2024-07-14 02:28:44

正如巴哈迪尔建议的那样，您可以定义一个永久视图。但是，在进行此类索引时，不要输出每个键的文档。相反，emit([tag, doc.published], null)。在当前的发行版本中，您必须对每个文档进行单独的查找，但是 SVN trunk 现在支持在查询字符串中指定“include_docs=True”，并且 CouchDB 会自动将文档合并到您的视图中，而无需空间开销。

回复收藏 0 原文

风苍溪 2024-07-14 02:28:44

您的观点非常正确。不过，有一个想法列表：

视图生成是增量的。如果您的读取流量大于写入流量，那么您的视图根本不会引起问题。对此感到担忧的人通常不应该如此。参考框架，如果您将数百条记录转储到视图中而不进行更新，您应该担心。

发出整个文档会减慢速度。您应该只发出使用视图所需的内容。

不确定 val == "%s" 的性能会如何，但你不应该想太多。如果有标签数组，您应该发出标签。当然，如果您期望标签数组包含非字符串，则忽略它。

回复收藏 0 原文

猫九 2024-07-14 02:28:44

# Works on CouchDB 0.8.0
from couchdb import Server # http://code.google.com/p/couchdb-python/

byTag = """
function(doc) {
if (doc.type == 'post' && doc.tags) {
    doc.tags.forEach(function(tag) {
        emit(tag, doc);
    });
}
}
"""

def findPostsByTag(self, tag):
    server = Server("http://localhost:1234")
    db = server['my_table']
    return [row for row in db.query(byTag, key = tag)]

byTag 映射函数返回“key”中每个唯一标签的数据，然后返回 value 中带有该标签的每个帖子，因此当您获取 key =“mytag”时，它将检索带有该标签的所有帖子标签“mytag”。

我已经针对大约 10 个条目对其进行了测试，每个查询似乎需要大约 0.0025 秒，不确定它对于大型数据集的效率如何。

# Works on CouchDB 0.8.0
from couchdb import Server # http://code.google.com/p/couchdb-python/

byTag = """
function(doc) {
if (doc.type == 'post' && doc.tags) {
    doc.tags.forEach(function(tag) {
        emit(tag, doc);
    });
}
}
"""

def findPostsByTag(self, tag):
    server = Server("http://localhost:1234")
    db = server['my_table']
    return [row for row in db.query(byTag, key = tag)]

The byTag map function returns the data with each unique tag in the "key", then each post with that tag in value, so when you grab key = "mytag", it will retrieve all posts with the tag "mytag".

I've tested it against about 10 entries and it seems to take about 0.0025 seconds per query, not sure how efficient it is with large data sets..

回复收藏 0 原文

~没有更多了~