Building a (simple) Twitter clone with CouchDB
I'm trying to build a (simple) Twitter clone which uses CouchDB as its database backend.
Because of its reduced feature set, I'm almost finished with coding, but there's one thing left I can't solve with CouchDB: the per-user timeline.
As with Twitter, the per-user timeline should show the tweets of all the people I'm following, in chronological order. With SQL it's quite a simple SELECT statement, but I don't know how to reproduce this with CouchDB's Map/Reduce.
Here's the SQL statement I would use with an RDBMS:
SELECT * FROM tweets WHERE user_id IN (1,5,20,33,...) ORDER BY created_at DESC;
CouchDB schema details
user-schema:
{
"_id":"xxxxxxx",
"_rev":"yyyyyy",
"type":"user",
"user_id":1,
"username":"john",
...
}
tweet-schema:
{
"_id":"xxxx",
"_rev":"yyyy",
"type":"tweet",
"text":"Sample Text",
"user_id":1,
...
"created_at":"2011-10-17 10:21:36 +000"
}
With view collations it's quite simple to query CouchDB for a list of "all tweets with user_id = 1, ordered chronologically".
But how do I retrieve a list of "all tweets which belong to the users with the IDs 1, 2, 3, ..., ordered chronologically"? Do I need another schema for my application?
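For reference, the single-user case could look roughly like the sketch below. It assumes a view whose map function emits [user_id, created_at] as the key, queried with startkey=[1] and endkey=[1,{}]. The emit stand-in, the sort, and the sample documents are only there so the snippet runs under plain Node.js; inside CouchDB only the map function would be needed. Sorting on the created_at string works here only because every value uses the same format and offset.

```javascript
// Stand-in for CouchDB's view engine so the sketch runs under plain Node.js.
const rows = [];
function emit(key, value) {
  rows.push({ key, value });
}

// Map function in the shape CouchDB expects; the compound key is [user_id, created_at].
function map(doc) {
  if (doc.type === "tweet") {
    emit([doc.user_id, doc.created_at], doc.text);
  }
}

// Made-up sample documents following the schema above.
const docs = [
  { type: "tweet", user_id: 1, text: "second", created_at: "2011-10-17 10:25:00 +000" },
  { type: "tweet", user_id: 2, text: "other user", created_at: "2011-10-17 10:22:00 +000" },
  { type: "tweet", user_id: 1, text: "first", created_at: "2011-10-17 10:21:36 +000" },
  { type: "user", user_id: 1, username: "john" },
];
docs.forEach((doc) => map(doc));

// Simplified collation: compare user_id first, then created_at.
rows.sort(
  (a, b) =>
    a.key[0] - b.key[0] ||
    (a.key[1] < b.key[1] ? -1 : a.key[1] > b.key[1] ? 1 : 0)
);

// Local equivalent of querying the view with startkey=[1] and endkey=[1,{}].
const timelineForUser1 = rows
  .filter((row) => row.key[0] === 1)
  .map((row) => row.value);

console.log(timelineForUser1); // [ 'first', 'second' ]
```

This covers exactly one user per range query, which is why the multi-user timeline is the hard part.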
Comments (2)
The best way of doing this would be to save created_at as a timestamp, then create a view that maps all tweets to their user_id. Then query the view with the followed user IDs as keys, and sort the results in your application however you want (most languages have a sort method for arrays).
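A minimal sketch of that idea, not necessarily the code this answer originally contained: the map function keys each tweet by user_id, the view is queried with several keys at once (in CouchDB, by POSTing a body like {"keys": [1, 5, 20, 33]} to the view), and the application sorts the rows. The emit stand-in, the queryByKeys helper and the sample data are assumptions made so the snippet runs under Node.js, and created_at is assumed to be a numeric Unix timestamp.

```javascript
// Stand-in for CouchDB's emit so this runs under plain Node.js.
const index = [];
function emit(key, value) {
  index.push({ key, value });
}

// Map function: key every tweet by its author's user_id.
function map(doc) {
  if (doc.type === "tweet") {
    emit(doc.user_id, { text: doc.text, created_at: doc.created_at });
  }
}

// Made-up sample tweets; created_at is assumed to be a Unix timestamp.
const docs = [
  { type: "tweet", user_id: 1, text: "a", created_at: 1318846896 },
  { type: "tweet", user_id: 5, text: "b", created_at: 1318847000 },
  { type: "tweet", user_id: 7, text: "not followed", created_at: 1318847100 },
  { type: "tweet", user_id: 1, text: "c", created_at: 1318847200 },
];
docs.forEach((doc) => map(doc));

// Hypothetical helper: local equivalent of a multi-key view query.
function queryByKeys(keys) {
  return index.filter((row) => keys.includes(row.key));
}

// Sort in the application, newest first.
const timeline = queryByKeys([1, 5, 20, 33])
  .map((row) => row.value)
  .sort((x, y) => y.created_at - x.created_at);

console.log(timeline.map((tweet) => tweet.text)); // [ 'c', 'b', 'a' ]
```

The trade-off is that every matching row travels to the application before it is sorted, so this works best when timelines are short or the query is limited in some other way.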
Edited one last time: I was trying to do it all in CouchDB... see revisions :)
Is that a CouchDB-only app, or do you use something in between for additional business logic? In the latter case, you could achieve this by running multiple queries.
This might include merging different views. Another approach would be to add a list of "private readers" to each tweet. It allows user-specific (partial) views, but also introduces the complexity of adding the list of readers to each new tweet, and even of updating the lists when someone follows or unfollows.
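For the multi-query route mentioned above, the merging step in the middle layer could be sketched as follows. It assumes one view query per followed user, each returning that user's tweets newest first, and a numeric created_at. The function name mergeTimelines and the sample data are made up for illustration.

```javascript
// Each inner array stands for the result of one per-user view query,
// already sorted newest first. The data is made up; created_at is
// assumed to be a numeric timestamp.
const perUserResults = [
  [
    { user_id: 1, text: "c", created_at: 30 },
    { user_id: 1, text: "a", created_at: 10 },
  ],
  [{ user_id: 5, text: "b", created_at: 20 }],
  [], // a followed user without tweets
];

// Merge several newest-first lists into one newest-first list
// containing at most `limit` tweets.
function mergeTimelines(lists, limit) {
  const positions = lists.map(() => 0); // read position within each list
  const merged = [];
  while (merged.length < limit) {
    let best = -1;
    for (let i = 0; i < lists.length; i++) {
      const candidate = lists[i][positions[i]];
      if (candidate === undefined) continue; // this list is exhausted
      if (
        best === -1 ||
        candidate.created_at > lists[best][positions[best]].created_at
      ) {
        best = i;
      }
    }
    if (best === -1) break; // every list is exhausted
    merged.push(lists[best][positions[best]]);
    positions[best] += 1;
  }
  return merged;
}

const firstPage = mergeTimelines(perUserResults, 2);
console.log(firstPage.map((tweet) => tweet.text)); // [ 'c', 'b' ]
```

Because each input list is already sorted, only the first few rows per user have to be fetched to fill one page of the timeline.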
It's important to think about the possible operations and their frequencies. If you're mostly generating lists of tweets, it's better to shift the complexity into how you integrate the reader information into your documents (i.e. putting the readers into your tweet doc), so that you can then easily build efficient view indices.
If you have many changes to your data, it's better to design your database not to update too many existing documents at the same time. Instead, try to add data by adding new documents and aggregate via complex views.
But you have shown an edge case where the simple (1-dimensional) list-based index is not enough. You'd actually need secondary indices to filter by time and user IDs (given that you also need partial ranges for both). But this is not possible in CouchDB, so you need to work around it by shifting "query" data into your docs and using it when building the view.
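One way to picture that workaround, as a sketch under assumptions rather than a finished design: each tweet document carries a readers array (a hypothetical field name) listing the user IDs that follow the author, and the map function emits one row per reader. A single range query such as startkey=[2,{}], endkey=[2] with descending=true would then return reader 2's timeline newest first. The emit stand-in, the timelineFor helper and the sample documents exist only so the snippet runs under Node.js, and created_at is assumed to be numeric.

```javascript
// Stand-in for CouchDB's emit so this runs under plain Node.js.
const index = [];
function emit(key, value) {
  index.push({ key, value });
}

// Map function: one row per (reader, tweet) pair.
// `readers` is a hypothetical field holding the followers' user IDs.
function map(doc) {
  if (doc.type === "tweet" && Array.isArray(doc.readers)) {
    doc.readers.forEach(function (readerId) {
      emit([readerId, doc.created_at], doc.text);
    });
  }
}

// Made-up sample tweets; created_at is assumed to be a numeric timestamp.
const docs = [
  { type: "tweet", user_id: 1, text: "a", created_at: 10, readers: [2, 3] },
  { type: "tweet", user_id: 5, text: "b", created_at: 20, readers: [2] },
  { type: "tweet", user_id: 7, text: "c", created_at: 30, readers: [3] },
];
docs.forEach((doc) => map(doc));

// Hypothetical helper: local equivalent of the descending range query
// startkey=[readerId,{}], endkey=[readerId], descending=true.
function timelineFor(readerId) {
  return index
    .filter((row) => row.key[0] === readerId)
    .sort((x, y) => y.key[1] - x.key[1])
    .map((row) => row.value);
}

console.log(timelineFor(2)); // [ 'b', 'a' ]
console.log(timelineFor(3)); // [ 'c', 'a' ]
```

Reads become a single range query, while the cost moves to writes: every new tweet needs its readers list filled in, and follow or unfollow operations may require updating existing tweet documents.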