Data modelling advice for a forum application on Google App Engine
I'm writing a simple forum-like application on Google App Engine and trying to avoid scalability issues. I'm new to this non-RDBMS approach, and I'd like to avoid pitfalls from the beginning.
The forum design is pretty simple: posts and replies will be the only concepts. What is the best approach to the problem if the forum has millions of posts?
The model so far (stripped of nonessential properties):
from google.appengine.ext import db

class Message(db.Model):
    user = db.StringProperty()  # will be a Google account user_id
    text = db.TextProperty()  # the text of the message
    # None for a top-level post; otherwise the message being replied to
    # (this also allows reply-to-reply)
    reply_to = db.SelfReferenceProperty()
Splitting the model, which I think is faster because it will query fewer entities when retrieving "all posts":
class Post(db.Model):
    user = db.StringProperty()  # will be a Google account user_id
    text = db.TextProperty()  # the text of the message

class Reply(db.Model):
    user = db.StringProperty()  # will be a Google account user_id
    text = db.TextProperty()  # the text of the message
    reply_to = db.ReferenceProperty(Post)
This is a many-to-one relation in the RDBMS world. Should a ListProperty be used instead? If so, how?
Edit:
Jaiku uses something like this:
class StreamEntry(DeletedMarkerModel):
    ...
    entry = models.StringProperty()  # ref - the parent of this, should it be a comment
    ...
Comments (2)
Firstly, why don't you use user = db.UserProperty() instead of user = db.StringProperty()?
Secondly, I'm quite sure you should use whatever works and is more readable, and test the performance later, for three reasons:
So start optimizing only once you are ready to measure.
I'm not saying this because I know nothing about RDBMS, NoSQL DBMS or Google Datastore performance optimizations, but because I usually get all my knowledge about them from testing, which contradicts my prior assumptions more often than I expected.
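As a rough illustration of "measure first", here is a minimal timing sketch in plain Python. The helper `measure` and the two `fetch_posts_*` functions are hypothetical stand-ins invented for this example; in a real application they would wrap the actual Datastore queries for each candidate model.

```python
import statistics
import time
from typing import Callable, List


def measure(fn: Callable[[], object], repeat: int = 5) -> float:
    """Return the median wall-clock time, in seconds, of calling fn `repeat` times."""
    samples: List[float] = []
    for _ in range(repeat):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


# Hypothetical stand-ins for the two designs; replace with the real queries.
def fetch_posts_single_model() -> list:
    # Single model: scan everything and keep only top-level messages.
    return [i for i in range(10_000) if i % 10 == 0]


def fetch_posts_split_model() -> list:
    # Split model: only posts are stored in this collection.
    return list(range(0, 10_000, 10))


if __name__ == "__main__":
    print("single model:", measure(fetch_posts_single_model))
    print("split model: ", measure(fetch_posts_split_model))
```

The median is used rather than the mean so that one slow outlier run does not dominate the comparison.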
You might want to take a look at a good tutorial on creating a PHP forum from scratch. Sure, that one is about PHP, but it also covers the general overview of forum design.
Basically, don't split posts and replies or threads and posts. It will lead to some really awkward queries later on. A thread is simply a post that isn't replying to anything.
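The single-model idea can be sketched in plain Python as follows. This is an in-memory illustration only, not Datastore code: the `Forum` class and its methods are hypothetical names made up for the example, and in the question's `Message` model the same two lookups would presumably become queries filtering on `reply_to`.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Message:
    """Plain-Python stand-in for the single Message model (not a Datastore entity)."""
    id: int
    user: str
    text: str
    reply_to: Optional[int] = None  # None means this message starts a thread


class Forum:
    """Hypothetical in-memory store, used only to illustrate the two lookups."""

    def __init__(self) -> None:
        self._messages: Dict[int, Message] = {}
        self._next_id = 1

    def add(self, user: str, text: str, reply_to: Optional[int] = None) -> Message:
        # Reject replies to messages that do not exist.
        if reply_to is not None and reply_to not in self._messages:
            raise KeyError(f"unknown parent message {reply_to}")
        message = Message(self._next_id, user, text, reply_to)
        self._messages[message.id] = message
        self._next_id += 1
        return message

    def threads(self) -> List[Message]:
        """Top-level posts: messages that reply to nothing."""
        return [m for m in self._messages.values() if m.reply_to is None]

    def replies(self, parent_id: int) -> List[Message]:
        """Direct replies to one message (works for reply-to-reply as well)."""
        return [m for m in self._messages.values() if m.reply_to == parent_id]
```

Because posts and replies share one type, the same `replies` lookup serves both a thread's first-level replies and nested replies, which is the awkwardness the split design would reintroduce.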