MySQL: Split one table into multiple tables (same columns) to improve performance?

Posted on 2024-08-14 04:44:49


This question is about performance, not about possible solutions.

My system holds many items of different categories. Each category has its own table since each table has many rows AND the fields are different.

ItemA - id, fld1, fld2
ItemB - id, fld1, fld3, fld4
ItemC - id, fld1, fld3, fld5
....

Now there's a need to manage user inventory, meaning whether a given user has a given item or not. One option is to use a single table:

Inventory - category_id, item_id, user_id

category_id is different for ItemA, ItemB, ... rows and that's how we differentiate.
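
To make the single-table option concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in for MySQL/InnoDB, so it illustrates only the shape of the schema and the lookup, not its performance. The composite primary key, its column order, and the helper names give_item and has_item are assumptions for illustration and are not part of the original system.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL/InnoDB (assumption).
conn = sqlite3.connect(":memory:")

# Single inventory table shared by all categories.
# The composite primary key and its column order are assumed, not from the post.
conn.execute(
    """
    CREATE TABLE Inventory (
        category_id INTEGER NOT NULL,
        item_id     INTEGER NOT NULL,
        user_id     INTEGER NOT NULL,
        PRIMARY KEY (user_id, category_id, item_id)
    )
    """
)


def give_item(user_id, category_id, item_id):
    """Record that a user owns an item (hypothetical helper)."""
    # INSERT OR IGNORE is SQLite syntax; MySQL spells it INSERT IGNORE.
    conn.execute(
        "INSERT OR IGNORE INTO Inventory (category_id, item_id, user_id) "
        "VALUES (?, ?, ?)",
        (category_id, item_id, user_id),
    )


def has_item(user_id, category_id, item_id):
    """Return True if the user owns the item in the given category."""
    row = conn.execute(
        "SELECT 1 FROM Inventory "
        "WHERE user_id = ? AND category_id = ? AND item_id = ?",
        (user_id, category_id, item_id),
    ).fetchone()
    return row is not None


give_item(user_id=7, category_id=1, item_id=42)
print(has_item(7, 1, 42))  # True
print(has_item(7, 2, 42))  # False: same item_id, different category
```

Every lookup here has to carry category_id, because the same item_id can appear in more than one Item table.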

The second option is to have one inventory table per category:

InventoryA - item_id, user_id
InventoryB - item_id, user_id
...
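
A comparable sketch of the per-category option, again with sqlite3 standing in for MySQL/InnoDB. Because a table name cannot be passed as a bound query parameter, the application has to choose the table itself; the INVENTORY_TABLES mapping, the primary key, and the helper names are hypothetical and only show where that extra management work ends up.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL/InnoDB (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical mapping from category to its inventory table.
# Each new category needs a new entry here and a new table.
INVENTORY_TABLES = {"A": "InventoryA", "B": "InventoryB"}

for table in INVENTORY_TABLES.values():
    # The primary key is an assumption; the post lists only the two columns.
    conn.execute(
        f"CREATE TABLE {table} ("
        "item_id INTEGER NOT NULL, "
        "user_id INTEGER NOT NULL, "
        "PRIMARY KEY (user_id, item_id))"
    )


def table_for(category):
    """Resolve a category to a known table name; raises KeyError if unknown."""
    return INVENTORY_TABLES[category]


def give_item(user_id, category, item_id):
    """Record that a user owns an item in the given category."""
    # INSERT OR IGNORE is SQLite syntax; MySQL spells it INSERT IGNORE.
    conn.execute(
        f"INSERT OR IGNORE INTO {table_for(category)} (item_id, user_id) "
        "VALUES (?, ?)",
        (item_id, user_id),
    )


def has_item(user_id, category, item_id):
    """Return True if the user owns the item in the given category."""
    row = conn.execute(
        f"SELECT 1 FROM {table_for(category)} "
        "WHERE user_id = ? AND item_id = ?",
        (user_id, item_id),
    ).fetchone()
    return row is not None


give_item(user_id=7, category="A", item_id=42)
print(has_item(7, "A", 42))  # True
print(has_item(7, "B", 42))  # False: InventoryB is a separate table
```

Resolving the table name through a fixed mapping keeps user input out of the SQL text, at the cost of updating the mapping whenever a category is added.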

The first option is probably the easiest to manage, BUT the inventory table is huge (order of magnitude: number of items across all categories times number of users) and is both frequently updated and frequently queried.

The second option would be a bit harder to manage (as we create a new inventory table for each category) but may bring a performance gain, as it might prevent race conditions. No single query is likely to involve more than one of the inventory tables, as the categories are quite segregated.

Currently the system uses MySQL with the InnoDB engine.
There are ~10 categories, but this is expected to grow to a few dozen in the near future. The biggest category has > 200k items and most have > 10k items.
The single inventory table has > 10M rows and is expected to get MUCH bigger as more users join.

I know the best approach is to test the performance of both methods and then decide, but the truth is that moving to the multiple-table design won't be quick or painless.

If you have personal experience with a similar issue, please share it.

Thanks
