How to properly index a many-to-many association table?

Posted 2024-10-12 03:35:44


In a typical many-many arrangement like this...

Movies       Actors       Movies_Actors
------       ------       -------------
movie_ID     actor_ID     FK_movie_ID
title        name         FK_actor_ID

... how should the association table ('Movies_Actors') be indexed for optimal read speed?

I usually see this done only with the composite primary key in the association table, like so:

CREATE TABLE Movies_Actors (
  FK_movie_ID INTEGER,
  FK_actor_ID INTEGER,
  PRIMARY KEY (FK_movie_ID, FK_actor_ID)
)

However, it seems this index will only be useful when searching on both movie_ID and actor_ID (although I'm not certain whether a composite index also works for the individual columns).

Since both "what actors are in Movie X" and "what movies has actor Y been in" will be the common queries for this table, it seems like there should be an individual index on each column to quickly locate actors and movies on their own. Does a composite index effectively do this? If not, having a composite index seems pointless on this table. And if a composite index is pointless, what should be done about the primary key? The candidate key is clearly the composite of the two columns, but if the resulting composite index is pointless (surely it can't be?) it seems like a waste.

Also, this link adds some confusion and indicates that it might even be useful to actually specify two composite indices... one of them as (FK_movie_ID, FK_actor_ID), and the other in reverse as (FK_actor_ID, FK_movie_ID), with the choice of which is the primary key (and thus usually clustered) and which is 'just' a unique composite index being based on which direction is queried more.

What is the real story? Does a composite index automatically effectively index each column for searching on one or the other? Should the optimal (in read speed, not size) association table have a composite index in each direction and one on each column? What are the behind-the-scene mechanics?

EDIT: I found this related question that for some reason I didn't locate before posting...
How to properly index a linking table for many-to-many connection in MySQL?


说好的呢 2024-10-19 03:35:44


(although I'm not certain on whether a
composite index also works for the
individual columns).

Yes, it can. But only the prefix: http://use-the-index-luke.com/sql/where-clause/the-equals-operator/concatenated-keys

Also, this link adds some confusion
and indicates that it might even be
useful to actually specify two
composite indices... one of them as
(FK_movie_ID, FK_actor_ID), and the
other in reverse as (FK_actor_ID,
FK_movie_ID),

That's actually the thing to do.

Take one as the clustering index and the other as a non-clustering index. The non-clustering index will include the clustering index key anyway, hence there is no need to include that column again (thx to JNK).

CREATE CLUSTERED INDEX a on Movies_Actors (fk_movie_id, fk_actor_id);
CREATE NONCLUSTERED INDEX b on Movies_Actors (fk_actor_id);

What is the real story?

http://Use-The-Index-Luke.com/ :)

Does a composite index automatically
effectively index each column for
searching on one or the other?

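No, only a leading prefix of the index. If you have an index on (a, b, c), a query filtering on a = ? AND b = ? can use it. A query filtering on c = ? alone cannot, and neither can one on b = ? AND c = ?, because both skip the leading column a.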
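
The prefix rule can be checked against a query planner. The following is only a minimal sketch, and it makes an assumption the original posts do not: it uses SQLite through Python's standard `sqlite3` module rather than SQL Server, simply because it runs anywhere. The helper `plan` is a hypothetical name introduced here. On this setup the lookup by movie (the leading column) should be reported as an index SEARCH, while the lookup by actor should be reported as a SCAN.

```python
import sqlite3

# Assumption: SQLite stands in for the database engine; the table is the
# one from the question, with only the composite primary key defined.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Movies_Actors (
        FK_movie_ID INTEGER NOT NULL,
        FK_actor_ID INTEGER NOT NULL,
        PRIMARY KEY (FK_movie_ID, FK_actor_ID)
    )
    """
)


def plan(sql):
    """Return the human-readable detail of each EXPLAIN QUERY PLAN row."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]  # the 4th column holds the detail text


# Filters on the leading column of the composite key.
by_movie = plan("SELECT FK_actor_ID FROM Movies_Actors WHERE FK_movie_ID = 1")
# Filters only on the second column, skipping the leading one.
by_actor = plan("SELECT FK_movie_ID FROM Movies_Actors WHERE FK_actor_ID = 1")

print(by_movie)
print(by_actor)
```

The exact wording of the plan text differs between SQLite versions, and other engines have their own plan formats, so treat the output as an illustration of the rule rather than a statement about SQL Server.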

Should the optimal (in read speed, not
size) association table have a
composite index in each direction and
one on each column?

If you need to join in both directions, yes ("composite index in each direction") and no ("one on each column").
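
As a rough illustration of "a composite index in each direction", the sketch below again assumes SQLite via Python's standard `sqlite3` module instead of SQL Server. The index name `ix_actor_movie` and the helper `plan` are hypothetical names chosen for this example. Because SQLite has no CLUSTERED keyword, the second index lists both columns explicitly; in SQL Server a nonclustered index on the actor column alone would carry the clustering key implicitly.

```python
import sqlite3

# Assumption: SQLite stands in for the database engine.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Movies_Actors (
        FK_movie_ID INTEGER NOT NULL,
        FK_actor_ID INTEGER NOT NULL,
        PRIMARY KEY (FK_movie_ID, FK_actor_ID)
    );
    -- Hypothetical second index covering the reverse direction.
    CREATE UNIQUE INDEX ix_actor_movie
        ON Movies_Actors (FK_actor_ID, FK_movie_ID);
    """
)


def plan(sql):
    """Return the human-readable detail of each EXPLAIN QUERY PLAN row."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]  # the 4th column holds the detail text


# "Which actors are in movie X": served by the primary key (movie, actor).
by_movie = plan("SELECT FK_actor_ID FROM Movies_Actors WHERE FK_movie_ID = 1")
# "Which movies has actor Y been in": served by the reverse index.
by_actor = plan("SELECT FK_movie_ID FROM Movies_Actors WHERE FK_actor_ID = 1")

print(by_movie)
print(by_actor)
```

With both indexes in place, each direction should be reported as an index SEARCH, so no additional single-column indexes are needed for these two queries.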

What are the behind-the-scene mechanics?

Well, same link again.

Speaking of SQL Server, you might eventually also consider an indexed view, which is a kind of pre-joining. The two indexes above might also be fast enough.
