I'm creating a web application with a dynamic survey creation & submission component. I'm using MongoDB to store the schema of the form and the form submissions.
I can imagine organizing this in several different ways:
- Having all form submissions and form schemas as documents in a single collection.
- Have separate collections for all form schemas and all form submissions.
- Have a separate collection for all form schemas, and create a new collection for all submissions of a form for each schema.
I'm still researching this. I come from the RDBMS world and I'm a noob to NoSQL databases. Does anyone have any advice?
EDIT 1
Forgot to address embedding the responses as a property within the form schema document.
Comments (2)
You will want to avoid this one (#1). The simple reason here is that the form submission has a different role than the form schema. Mixing these in the same collection will make it more difficult to query.
To clarify, it sounds like you're suggesting two collections: `schema` and `submission`. This is a logical way to proceed. You will have one small `schema` collection and one large `submission` collection.
The key limitation will be the queries you make against that `submission` collection. Are you planning to query "across types"? Or are major queries centered on "submission type"?
If you end up including "submission type" on every query, then it makes sense to...
The reason for this is simply the indexes. If you have one collection, you will need an index on "type". So by making separate collections, you can save an index. However, if you ever end up needing the sharding features, this can make for lots of collections to manage.
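As a rough sketch of that trade-off, the two layouts differ mainly in where the "type" lives: in a field that every query must filter on (and that therefore wants an index), or in the collection name itself. The field name `schema_id` and the `submission_<id>` naming scheme below are assumptions made up for illustration, not something prescribed by MongoDB.

```python
def target_shared(schema_id, extra_filter=None):
    """Option #2: one shared collection; every query carries the type field."""
    query = {"schema_id": schema_id}  # hypothetical field, needs its own index
    query.update(extra_filter or {})
    return "submission", query


def target_per_schema(schema_id, extra_filter=None):
    """Option #3: one collection per schema; the collection name carries the type."""
    # Hypothetical naming scheme: submission_<schema_id>
    return f"submission_{schema_id}", dict(extra_filter or {})


shared = target_shared("s1", {"user": "alice"})
split = target_per_schema("s1", {"user": "alice"})
```

Each function returns a (collection name, filter) pair; in option #3 the filter no longer mentions the type at all, which is where the saved index comes from.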
Of course, you can work around this "extra index" by being creative with the `_id`. MongoDB has an auto-generated `ObjectId` that it will use by default, kind of like an auto-increment ID. However, you can override this and create a smarter `_id`, something like `submissionid_userid`.
My preference is honestly for the last option. But really #2 & #3 are both good options; it's just an issue of trade-offs in terms of code complexity and management complexity.
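A minimal sketch of what such a "smarter" `_id` could look like when building a document to insert. The helper name `build_submission` and the `answers` field are hypothetical; only the `submissionid_userid` shape comes from the suggestion above.

```python
def build_submission(submission_id, user_id, answers):
    """Build a submission document whose _id encodes both identifiers."""
    return {
        # Supplying _id explicitly means MongoDB will not generate an ObjectId.
        "_id": f"{submission_id}_{user_id}",
        "answers": answers,  # hypothetical payload field
    }


doc = build_submission("s42", "u7", {"q1": "yes"})
```

Since MongoDB enforces uniqueness on `_id`, a key built this way also limits the collection to one document per identifier pair, which may or may not be what you want.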
I'd go for two collections: form and submissions.
This approach scales well horizontally, as you only have 2 collections to worry about.
I agree with @Gates VP about providing a custom `_id` rather than the default `ObjectId`, as you are spared the need for an extra index.
On the `submissions` collection, if you set the `_id` format to `formID_userID`, then to get all the submissions for a form all you'd need to do is query `_id` with a regex anchored on the form ID prefix. The bonus here is that the anchored regex will use the `_id_` index, so it's efficient.
For general reference and others stumbling upon this: there are some good presentations about schema design that are worth checking out:
http://www.10gen.com/presentations/mongodb-tokyo-2012/basic-application-and-schema-design
http://www.10gen.com/presentations/mongosv-2011/schema-design-principles-and-practice
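For the anchored-regex lookup on `formID_userID` ids mentioned above, a sketch of the filter could look like this. The helper name and the sample ids are made up for illustration, and the pattern is checked locally with Python's `re` module rather than against a real database.

```python
import re


def submissions_for_form(form_id):
    """Filter matching every _id that starts with '<form_id>_'."""
    # re.escape keeps characters in form_id from being read as regex syntax.
    # The leading ^ anchors the pattern to the start of the _id.
    return {"_id": {"$regex": "^" + re.escape(form_id + "_")}}


# Local check of the pattern against sample ids (no database needed).
sample_ids = ["f1_alice", "f1_bob", "f10_carol", "f2_dave"]
pattern = re.compile(submissions_for_form("f1")["_id"]["$regex"])
matching = [i for i in sample_ids if pattern.match(i)]
```

With a driver such as PyMongo, the same dictionary would be passed to `find` on the `submissions` collection. Note that the trailing `_` in the prefix matters: without it, a lookup for `f1` would also pick up `f10_carol`.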