CouchDB for chat history persistence and user statistics

Posted 2024-11-15 13:44:10

Is CouchDB or Couchbase suitable as a NoSQL-based persistence solution for storing users' chat history and statistics? Since chat history would probably see more writes than reads, what should the document structure be for a single user's history with some statistics: a single entity representing the user, with embedded or separate documents for the history data (lots of small docs) and some stats (a small number of docs)?


Comments (1)

单挑你×的.吻 2024-11-22 13:44:10

Yes, either CouchDB or Couchbase is suitable.

Since chat history requires many writes, I am thinking of something that makes writing easy: just drop a document and let CouchDB worry about aggregating it. In one quick POST you could describe the chat message, who sent it, timestamp, which chat room, etc.
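As a rough sketch, such a per-message document might look like the following (the field names and the database name are illustrative assumptions, not something the answer prescribes):

```javascript
// A minimal per-message document: one POST per chat message.
// All field names below are illustrative assumptions.
const chatMessage = {
  type: "chat_message",   // lets different doc types share one database
  username: "somebody",
  room: "lobby",
  text: "hello there",
  // Storing the timestamp as separate parts makes the view key
  // in the map function below trivial to emit.
  year: 2011, month: 3, day: 1, hour: 9, minute: 5
};

// POST this JSON to the database and let CouchDB assign the _id, e.g.:
//   POST /chat  with body JSON.stringify(chatMessage)
console.log(JSON.stringify(chatMessage));
```

Because each message is its own document, the client never has to read-modify-write anything; it only appends.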

CouchDB view collation will give you the single entity representing a user together with their historical data. For example, if you want to know a user's message volume, your map function will emit a key like this:

emit([doc.username, doc.year, doc.month, doc.day, doc.hour, doc.minute], 1);

And the reduce function adds up all the values. Now you can query a user's annual volume,

group_level=3&startkey=["somebody",2011,null]&endkey=["somebody",2011,{}]

or (by increasing the group level) monthly volume, daily volume, hourly volume, etc.
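The whole pipeline can be simulated locally: the sketch below replaces CouchDB's host-provided `emit()` and built-in `_sum` reduce with plain JavaScript, and applies `group_level=3` by truncating each key to `[username, year, month]` before reducing (the sample documents are assumptions for illustration):

```javascript
// A few message documents (illustrative data).
const docs = [
  { username: "somebody", year: 2011, month: 3, day: 1, hour: 9,  minute: 5  },
  { username: "somebody", year: 2011, month: 3, day: 2, hour: 10, minute: 0  },
  { username: "somebody", year: 2011, month: 4, day: 7, hour: 18, minute: 30 }
];

const rows = [];
function emit(key, value) { rows.push({ key: key, value: value }); }

// The map function from the answer: one row per message, value 1.
function map(doc) {
  emit([doc.username, doc.year, doc.month, doc.day, doc.hour, doc.minute], 1);
}

// The reduce function: add up all the values (what CouchDB's _sum does).
function reduce(keys, values) { return values.reduce((a, b) => a + b, 0); }

docs.forEach(map);

// group_level=N groups rows by the first N key parts, then reduces each group.
function groupLevel(rows, level) {
  const groups = new Map();
  for (const row of rows) {
    const k = JSON.stringify(row.key.slice(0, level));
    groups.set(k, (groups.get(k) || []).concat(row.value));
  }
  const out = {};
  for (const [k, values] of groups) out[k] = reduce(null, values);
  return out;
}

console.log(groupLevel(rows, 3));
// → { '["somebody",2011,3]': 2, '["somebody",2011,4]': 1 }
```

Raising the level from 3 toward 6 moves the same view from monthly to daily, hourly, and per-minute counts without re-emitting anything.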

Considerations

This technique has costs and benefits. The basic trade-off is that updates should be easy and reports should be reasonable. In your example of 10,000 updates per day, I get nervous thinking about 409 Conflict rejections, or maintaining conflict-resolution code, or making the client gracefully recover from an error while more messages pile up!

The suggested technique helps. Each update is isolated from the others, updates can occur out-of-order, error recovery is not too bad. Just retry a few times in the background. (Note, I am personally an advocate that updates should be easy—maybe I am biased.)

The cost is "wasting" disk space, and retrieving data is (relatively) more work. CouchDB is slow and wasteful like lorries are slow and wasteful. In reality, lorries are common in wealthy places and uncommon in poor places because they are a better long-term deal. Emotionally, we see lorries lumber about and vomit black smoke, but rationally, we know they are more efficient.

Most stats can be direct map/reduce views. However, you can also maintain "summary" documents with aggregated or independent results, or whatever else you need. Frequent updates are not a problem (at this scale: 86,400 updates per day is still only 1/sec). But you might want a dedicated "updater" client for those documents. With only one client updating the special documents, you won't get 409 Conflicts, since nobody else is fighting to update the same document.
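A dedicated updater might look something like this sketch, where `fetchDoc` and `saveDoc` are hypothetical stand-ins for a GET and a PUT against the summary document's URL (the retry loop is the background-retry advice from above):

```javascript
// Hedged sketch of a single "updater" client maintaining a summary document.
// fetchDoc/saveDoc are hypothetical stand-ins for GET /db/docid and PUT /db/docid.
async function bumpSummary(fetchDoc, saveDoc, retries = 3) {
  for (let attempt = 0; attempt < retries; attempt++) {
    const doc = await fetchDoc("summary:somebody"); // current body, incl. _rev
    doc.messageCount = (doc.messageCount || 0) + 1;
    try {
      return await saveDoc(doc); // PUT with _rev; CouchDB answers 409 on a lost race
    } catch (err) {
      if (err.status !== 409) throw err; // only a conflict is worth retrying
    }
  }
  throw new Error("gave up after repeated 409 Conflicts");
}
```

With a single updater process this loop should normally succeed on the first pass; the retries only matter if a second writer sneaks in.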
