Performance advantage of this data model?

Posted on 2024-08-07 04:19:03

I have a MySQL (InnoDB) table 'items' with the following characteristics:

  1. A large number of rows, and the count keeps increasing.
  2. A large number of columns of various data types, including 'text'.
  3. A primary key 'item_id' is present.

There are additional requirements as follows:

  1. Need to query items based on their status
  2. Need to update status

The above two operations happen quite frequently.

Given the above scenario, I have two questions:

  1. Would making a separate table with two columns, namely item_id and status, with item_id as the primary key, provide increased performance?
  2. If the above is true, how would I go about querying item_ids based on status?

I am inexperienced in handling databases. I hope you will bear with me :)


Comments (3)

清欢 2024-08-14 04:19:03

This is called vertical segmentation (also known as vertical partitioning). It is often used when a data entity has multiple access patterns that touch different subsets of the entity's attributes (table columns) at different frequencies. If one function needs access to only one or two columns hundreds of times per second, and another application function needs access to all the other columns but only once or twice a day, then this approach is warranted and can yield a substantial performance improvement.

Basically, as you suggested, you "split" the table into two tables, both with the same key, in a one-to-one FK/PK->PK relationship. In one table you put only the few columns that are accessed frequently, and you put the rest of the columns in the other table, which will be accessed less frequently. You can then index each table according to its own actual access pattern.
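
As a rough sketch of that split, assuming the wide table is `items` and that `items.item_id` is a `BIGINT UNSIGNED` (the foreign key needs the same type on both sides). The names `item_status`, `idx_item_status_status` and `fk_item_status_item`, the `VARCHAR(20)` status type, and the literal values `'pending'`, `'done'` and `42` are all hypothetical and do not come from the question:

```sql
-- Hypothetical narrow "hot" table; names and column types are assumptions.
CREATE TABLE item_status (
    item_id BIGINT UNSIGNED NOT NULL,  -- must match the type of items.item_id
    status  VARCHAR(20)     NOT NULL,
    PRIMARY KEY (item_id),
    KEY idx_item_status_status (status),
    CONSTRAINT fk_item_status_item
        FOREIGN KEY (item_id) REFERENCES items (item_id)
        ON DELETE CASCADE
) ENGINE=InnoDB;

-- Frequent read: find the items in a given status (narrow table only).
SELECT item_id FROM item_status WHERE status = 'pending';

-- Frequent write: change one item's status by primary key.
UPDATE item_status SET status = 'done' WHERE item_id = 42;
```

Whether this beats a plain index on the original table depends on the workload, so it is worth measuring both layouts before committing to the split.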

甜嗑 2024-08-14 04:19:03

It would make more sense to create an index covering status and item_id if those are the only columns you need to fetch. In InnoDB a secondary index also stores the primary key, so indexing status alone is enough to cover item_id as well.

create index status_item_id_items on items (status);

You can then run a query that will use this index:

select item_id, status from items where status = 'status';

Keep in mind that if you don't have many distinct statuses, your query may end up returning a lot of rows and could be slow. If you can also constrain it by a more selective column, such as a datetime, that would be better.
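
To check whether the index is actually used, and to sketch the "more selective column" idea, something like the following may help. It assumes a hypothetical `created_at` DATETIME column and a hypothetical index name `idx_items_status_created`, neither of which appears in the question; `'pending'` and the date literal are placeholder values.

```sql
-- If the Extra column of the EXPLAIN output shows "Using index", the query
-- is answered from the index alone, without reading the wide rows.
EXPLAIN SELECT item_id, status FROM items WHERE status = 'pending';

-- Assumed column created_at: a composite index lets MySQL narrow by status
-- first and then by a date range within that status.
CREATE INDEX idx_items_status_created ON items (status, created_at);

SELECT item_id, status
FROM items
WHERE status = 'pending'
  AND created_at >= '2024-08-01 00:00:00';
```

Putting the equality column (`status`) first and the range column second is the usual ordering for a composite index of this kind.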

夜巴黎 2024-08-14 04:19:03

Answering part 2 first, you'd do an inner join of your two tables:

SELECT i.*, s.StatusCode FROM items AS i INNER JOIN status AS s ON s.item_id = i.item_id

To answer part 1, though, I don't think doing this would gain you any performance advantage.
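
If the split were adopted anyway, the join would usually be filtered by status rather than run over every row. A minimal sketch, reusing this answer's `status` table and `StatusCode` column; the value `'pending'` is a placeholder, and an index on `status.StatusCode` is assumed rather than shown in the answer:

```sql
-- Fetch full item rows for one status. With an index on status.StatusCode
-- (an assumption), MySQL can find the matching item_ids first and then
-- look up the wide rows in items by primary key.
SELECT i.*, s.StatusCode
FROM items AS i
INNER JOIN status AS s ON s.item_id = i.item_id
WHERE s.StatusCode = 'pending';
```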
