使用 MongoDB 有效确定层次结构中记录的所有者

发布于 2024-12-17 07:27:47 字数 2819 浏览 5 评论 0原文

我正在努力实现以下目标:

选择我拥有的所有记录,其中所有权是我创建的对象或我管理的用户创建的对象,其中用户管理可以在管理用户的用户层次结构中进行

所有权显然很简单,可以通过对应的简单 id 来处理给业主。用户管理的层次结构让我有点难以执行,而无需通过大量 ID 列表进行繁重的工作(显然,您可以找到受管理的每个用户,并使用 IN 子句或类似的子句列出这些用户中的任何一个创建的每个对象)。

理想情况下,这一切都发生在单个查询中,以便可以发生正常的分页和条件。

我在想可能需要一些数学来完成它——拥有可以以某种方式进行散列的 ID,以确定它们是否由指挥链中的任何人拥有。

对于此类事情有什么参考吗?

我错过了一些明显的东西吗?

如果能有所作为的话,可以使用 MongoDB,但也很乐意考虑其他数据库来获取灵感。

更新: 创建了一个包含 1,000,000 条记录的 MongoDB 集合,以获取一些可靠的数据,准确了解查询中 IN 子句的可管理参数数量的构成。当我有一些具体信息时会报告。

分析:

使用 ruby​​-mongo-driver 和 ruby​​ 基准库。

具有 1039944 条记录的 MongoDB 集合

记录定义为:

{
    first_name: String,
    last_name: String,
    email: String,
    phone: String,
    company: String,
    owner: BSON::ObjectId
 }

所有字段的值都是随机生成的。

所有者字段有一个索引。

使用以下条件运行查询:

conditions = {"owner" => { "$in" => id_list }}
opts = {skip: rand, limit: 100}

结果:

    # 10201 ids
    #              user     system      total        real
    # 0:       0.240000   0.000000   0.240000 (  0.265148)
    # 1:       0.240000   0.010000   0.250000 (  0.265757)
    # 2:       0.240000   0.000000   0.240000 (  0.267149)
    # 3:       0.240000   0.000000   0.240000 (  0.269981)
    # 4:       0.240000   0.000000   0.240000 (  0.270436)
    # Find:    0.240000   0.000000   0.240000 (  0.266709)


    # 5201 ids
    #              user     system      total        real
    # 0:       0.120000   0.000000   0.120000 (  0.133824)
    # 1:       0.120000   0.000000   0.120000 (  0.134787)
    # 2:       0.110000   0.000000   0.110000 (  0.133262)
    # 3:       0.110000   0.000000   0.110000 (  0.136046)
    # 4:       0.120000   0.000000   0.120000 (  0.141220)
    # Find:    0.130000   0.000000   0.130000 (  0.139110)

    # 201 ids
    #              user     system      total        real
    # 0:       0.010000   0.000000   0.010000 (  0.006044)
    # 1:       0.000000   0.000000   0.000000 (  0.004681)
    # 2:       0.010000   0.000000   0.010000 (  0.004578)
    # 3:       0.000000   0.000000   0.000000 (  0.007048)
    # 4:       0.010000   0.000000   0.010000 (  0.008487)
    # Find:    0.000000   0.000000   0.000000 (  0.005990)

    # 1 id (NOT using IN)
    #              user     system      total        real
    # 0:       0.000000   0.000000   0.000000 (  0.002868)
    # 1:       0.000000   0.000000   0.000000 (  0.004937)
    # 2:       0.010000   0.000000   0.010000 (  0.003151)
    # 3:       0.000000   0.000000   0.000000 (  0.002983)
    # 4:       0.000000   0.000000   0.000000 (  0.003313)
    # Find:    0.000000   0.000000   0.000000 (  0.002742)

即使查询中包含 10k id 列表,性能也相当快。

I am trying to achieve the following:

Select all the records that I own, where ownership is objects I have created OR objects a user I manage has created, where user management can be in a hierarchy of users managing users

Ownership is clearly straight forward and could be handled by a simple id corresponding to the owner. The hierarchy of user management has me a little stumped to perform without heavy lifting through large lists of IDs (you can obviously just find every user that is managed and list every object created by any of those users using an IN clause or similar).

Ideally this all happens in a single query so normal paging and conditions can occur.

I was thinking that there might have been some maths to get it done - having IDs that can somehow be hashed to determine if they are owned by anyone in the chain of command.

Any references for this sort of thing?

Am I missing something obvious?

Using MongoDB if that makes a difference, but happy to think about other databases for inspiration.

UPDATE:
have created a MongoDB collection with 1,000,000 records to get some solid data on exactly what constitutes a manageable number of parameters for an IN clause on a query. Will report back when I have some concrete information.

ANALYSIS:

Using ruby-mongo-driver and the ruby benchmark lib.

MongoDB Collection with 1039944 records

Records are defined as:

{
    first_name: String,
    last_name: String,
    email: String,
    phone: String,
    company: String,
    owner: BSON::ObjectId
 }

With randomly generated values for all fields.

The Owner field has an index.

Running queries with the following conditions:

conditions = {"owner" => { "$in" => id_list }}
opts = {skip: rand, limit: 100}

Results:

    # 10201 ids
    #              user     system      total        real
    # 0:       0.240000   0.000000   0.240000 (  0.265148)
    # 1:       0.240000   0.010000   0.250000 (  0.265757)
    # 2:       0.240000   0.000000   0.240000 (  0.267149)
    # 3:       0.240000   0.000000   0.240000 (  0.269981)
    # 4:       0.240000   0.000000   0.240000 (  0.270436)
    # Find:    0.240000   0.000000   0.240000 (  0.266709)


    # 5201 ids
    #              user     system      total        real
    # 0:       0.120000   0.000000   0.120000 (  0.133824)
    # 1:       0.120000   0.000000   0.120000 (  0.134787)
    # 2:       0.110000   0.000000   0.110000 (  0.133262)
    # 3:       0.110000   0.000000   0.110000 (  0.136046)
    # 4:       0.120000   0.000000   0.120000 (  0.141220)
    # Find:    0.130000   0.000000   0.130000 (  0.139110)

    # 201 ids
    #              user     system      total        real
    # 0:       0.010000   0.000000   0.010000 (  0.006044)
    # 1:       0.000000   0.000000   0.000000 (  0.004681)
    # 2:       0.010000   0.000000   0.010000 (  0.004578)
    # 3:       0.000000   0.000000   0.000000 (  0.007048)
    # 4:       0.010000   0.000000   0.010000 (  0.008487)
    # Find:    0.000000   0.000000   0.000000 (  0.005990)

    # 1 id (NOT using IN)
    #              user     system      total        real
    # 0:       0.000000   0.000000   0.000000 (  0.002868)
    # 1:       0.000000   0.000000   0.000000 (  0.004937)
    # 2:       0.010000   0.000000   0.010000 (  0.003151)
    # 3:       0.000000   0.000000   0.000000 (  0.002983)
    # 4:       0.000000   0.000000   0.000000 (  0.003313)
    # Find:    0.000000   0.000000   0.000000 (  0.002742)

Even with a list of 10k ids in the query, performance is pretty snappy.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

燕归巢 2024-12-24 07:27:47

如果您尝试根据“列”从 MongoDB 中“选择”记录,该“列”的值来自一组可能的值,您需要针对用户管理表进行联接才能确定该值,那么 NoSQL 就会对您不利......

如果用户 ID 列表仍然可以管理,您可以执行 whereownerId in (?,?,?,?,?...) 类型的查询(在首先确定列表之后)

db.documents.find({owner:{$in: [1234, 2345, 4444, 77777, 99999]}})

: NoSQL 方式可能是将事物非规范化,例如通过包含不仅仅是文档中的所有者 ID,还有管理层次结构的完整路径:

{  _id: 'the document A',
   owner : 1234,
   managers: [ 2345, 4444, 77777, 99999 ]
}

当然,当用户层次结构发生变化时,需要更新它。

If you are trying to "select" records from MongoDB based on a "column" having a value from a set of possible values that you'd need a join against a user management table to determine, then NoSQL is working against you...

If the list of user IDs is still manageable you can do a where ownerId in (?,?,?,?,?...) type of query (after having first determined the list):

db.documents.find({owner:{$in: [1234, 2345, 4444, 77777, 99999]}})

The NoSQL way is probably to denormalize things, for example by including not just the ownerId in the document, but the complete path up the management hierarchy:

{  _id: 'the document A',
   owner : 1234,
   managers: [ 2345, 4444, 77777, 99999 ]
}

Of course, that will need to be updated when the user hierarchy gets shifted around.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文