Mongo sharding fails to split a large collection between shards

Posted on 2024-09-18 16:15:35

I'm having problems with what seems to be a simple sharding setup in mongo.

I have two shards, a single mongos instance, and a single config server set up like this:

Machine A - 10.0.44.16 - config server, mongos
Machine B - 10.0.44.10 - shard 1
Machine C - 10.0.44.11 - shard 2

I have a collection called 'Seeds' that is sharded on the key 'SeedType', a field that is present on every document in the collection and contains one of four values (take a look at the sharding status below). Two of the values have significantly more entries than the other two: two of them have 784,000 records each, and the other two have about 5,000 each.

The behavior I'm expecting to see is that records in the 'Seeds' collection with SeedType PBI.Retail.InventoryPOS will end up on one shard, and the ones with PBI.Retail.InventoryOnHand will end up on the other.

However, it seems that all records for both of the larger shard key values end up on the primary shard (shard0001).
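
To make the expected placement concrete, here is a minimal Python sketch of the command document that would ask for the InventoryPOS chunk to be moved to the other shard by hand. The helper name `build_move_chunk_command` is hypothetical; the fields (`moveChunk`, `find`, `to`) are those of MongoDB's standard `moveChunk` admin command, and the namespace and shard name are taken from the sharding status shown further down. It only builds and prints the document; actually sending it through mongos is left as a commented assumption.

```python
def build_move_chunk_command(namespace, shard_key_value, target_shard):
    """Build a moveChunk admin command document (hypothetical helper)."""
    return {
        "moveChunk": namespace,                 # "<database>.<collection>"
        "find": {"SeedType": shard_key_value},  # selects the chunk holding this key value
        "to": target_shard,                     # _id of the destination shard
    }


command = build_move_chunk_command(
    "TimMulti.Seeds", "PBI.Retail.InventoryPOS", "shard0000"
)
print(command)

# Assumption: with a driver such as PyMongo and a client connected to mongos,
# the document would be sent as: client.admin.command(command)
```

This is only an illustration of the layout I expected the balancer to reach on its own, not a claim that a manual move is the right fix.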

Here's my sharding status text (other collections removed for clarity):

--- Sharding Status ---
  sharding version: { "_id" : 1, "version" : 3 }
  shards:
      { "_id" : "shard0000", "host" : "10.44.0.11:27019" }
      { "_id" : "shard0001", "host" : "10.44.0.10:27017" }
  databases:
        { "_id" : "admin", "partitioned" : false, "primary" : "config" }
        { "_id" : "TimMulti", "partitioned" : true, "primary" : "shard0001" }
                TimMulti.Seeds chunks:
                        { "SeedType" : { $minKey : 1 } } -->> { "SeedType" : "PBI.AnalyticsServer.KPI" } on : shard0000 { "t" : 2000, "i" : 0 }
                        { "SeedType" : "PBI.AnalyticsServer.KPI" } -->> { "SeedType" : "PBI.Retail.InventoryOnHand" } on : shard0001 { "t" : 2000, "i" : 7 }
                        { "SeedType" : "PBI.Retail.InventoryOnHand" } -->> { "SeedType" : "PBI.Retail.InventoryPOS" } on : shard0001 { "t" : 2000, "i" : 8 }
                        { "SeedType" : "PBI.Retail.InventoryPOS" } -->> { "SeedType" : "PBI.Retail.SKU" } on : shard0001 { "t" : 2000, "i" : 9 }
                        { "SeedType" : "PBI.Retail.SKU" } -->> { "SeedType" : { $maxKey : 1 } } on : shard0001 { "t" : 2000, "i" : 10 }
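
As a rough cross-check of the status above, the following Python sketch maps each SeedType value to the chunk that contains it and tallies documents per shard. It assumes chunk ranges include their lower bound and exclude their upper bound, that the key strings compare in plain byte order, and that the two small values are the KPI and SKU ones (the counts are the approximate figures quoted earlier).

```python
from bisect import bisect_right
from collections import Counter

# Split points and chunk owners copied from the sharding status above.
split_points = [
    "PBI.AnalyticsServer.KPI",
    "PBI.Retail.InventoryOnHand",
    "PBI.Retail.InventoryPOS",
    "PBI.Retail.SKU",
]
chunk_owners = ["shard0000", "shard0001", "shard0001", "shard0001", "shard0001"]

# Approximate document counts per SeedType value.
# Assumption: the two small values are the KPI and SKU ones.
doc_counts = {
    "PBI.AnalyticsServer.KPI": 5_000,
    "PBI.Retail.InventoryOnHand": 784_000,
    "PBI.Retail.InventoryPOS": 784_000,
    "PBI.Retail.SKU": 5_000,
}


def owner_of(seed_type):
    """Return the shard owning the chunk [lower, upper) that contains seed_type."""
    return chunk_owners[bisect_right(split_points, seed_type)]


docs_per_shard = Counter({shard: 0 for shard in chunk_owners})
for seed_type, count in doc_counts.items():
    docs_per_shard[owner_of(seed_type)] += count

for shard, total in sorted(docs_per_shard.items()):
    print(shard, total)
```

Under those assumptions every one of the four values falls into a chunk on shard0001, and the single chunk on shard0000 (everything below "PBI.AnalyticsServer.KPI") holds no documents at all.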

Am I doing anything wrong?

Semi-unrelated question:

What is the best way to atomically transfer an object from one collection to another without blocking the entire mongo service?

Thanks in advance,
-Tim
