MySQL KEY partitioning using three table fields (columns)

Posted on 2024-08-16 01:58:12

I am writing a data warehouse using MySQL as the back end. I need to partition a table based on two integer IDs and a name string. I have read (parts of) the MySQL documentation on partitioning, and it seems the most appropriate scheme in this scenario would be either HASH or KEY partitioning.

I have opted for KEY partitioning because I (chickened out and) don't want to be responsible for providing a 'collision-free' hashing algorithm for my fields. Instead, I am relying on MySQL's own hashing to generate the keys required for partitioning.

I have included below a snippet of the schema of the table that I would like to partition based on the composite of the following fields:

school_id, course_id, ssname (student surname).

BTW, before anyone points out that this is not the best way to store school-related information: I am only using the case below as an analogy for what I am trying to model.

My current CREATE TABLE statement looks like this:

CREATE TABLE foobar (
    id         int UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
    school_id  int UNSIGNED NOT NULL,
    course_id  int UNSIGNED NOT NULL,
    ssname     varchar(64) NOT NULL,

    /* some other fields */

    FOREIGN KEY (school_id) REFERENCES school(id) ON DELETE RESTRICT ON UPDATE CASCADE,

    FOREIGN KEY (course_id) REFERENCES course(id) ON DELETE RESTRICT ON UPDATE CASCADE,

    INDEX idx_fb_si (school_id),
    INDEX idx_fb_ci (course_id),
    CONSTRAINT UNIQUE INDEX idx_fb_scs (school_id,course_id,ssname(16))
) ENGINE=innodb;

I would like to know how to modify the statement above so that the table is partitioned using the three fields I mentioned at the beginning of this question (namely school_id, course_id and the starting letter of the student's surname).

Another question I would like to ask is this:

What happens in 'edge' situations? For example, if I attempt to insert a record that contains a valid* school_id, course_id or surname for which no underlying partitioned table file exists, will MySQL automatically create the underlying file?

Case in point: I have the following schools: New York Kindergarten, Belfast Elementary; and the following courses: Lie Algebra in Infinitesimal Dimensions, Entangled Entities.

Also assume I have the following students (surnames): Bush, Blair, Hussein

When I add a new school (or course, or student), can I insert them into the foobar table? (Actually, I can't think why not.) The reason I ask is that I foresee adding more schools, courses, etc., which means that MySQL will have to create additional tables behind the scenes (as the hash will generate new keys).

I would be grateful if someone with experience in this area could confirm (preferably with links backing their assertion) that my understanding is correct, i.e. that no manual administration is required if I add new schools, courses or students to the database.

I don't know whether my second question was well formed (clear). If not, I will be glad to clarify further.

*VALID - by valid, I mean that it is valid in terms of not breaking referential integrity.

Comments (1)

醉南桥 2024-08-23 01:58:12

I doubt partitioning is as useful as you think. That said, there are a few other problems with what you're asking for (note: the entirety of this answer applies to MySQL 5; version 6 might be different):

  • Columns used in KEY partitioning must be part of the primary key. school_id, course_id and ssname are not part of the primary key.
  • More generally, every UNIQUE key (including the primary key) must include all columns in the partitioning expression [1]. This means you can only partition on the intersection of the columns in the UNIQUE keys. In your example, the intersection is empty.
  • Most partitioning schemes (other than KEY) require integer or NULL values. If not NULL, ssname will not be an integer value.
  • Foreign keys and partitioning aren't supported simultaneously [2]. This is a strong argument not to use partitioning.
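
To make the first two points concrete, here is a minimal sketch. The table name foobar_bad is hypothetical and the column list is trimmed from the question's schema. On a MySQL 5.x server with partitioning support, I would expect this statement to be rejected, because the partitioning columns are not part of the primary key:

```sql
-- Hypothetical, trimmed-down table used only to illustrate the restriction.
-- The primary key is (id) alone, so school_id and course_id are not in it.
CREATE TABLE foobar_bad (
    id         int UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
    school_id  int UNSIGNED NOT NULL,
    course_id  int UNSIGNED NOT NULL
) ENGINE=innodb
  PARTITION BY KEY (school_id, course_id)
  PARTITIONS 2
;
-- Expected to fail: every column used for partitioning must appear
-- in the primary key and in every other UNIQUE key on the table.
```

Adding school_id and course_id to the primary key, as in the working definitions further down, is what removes the conflict.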

Fortunately, collision-free hashing is one thing you don't need to worry about, because partitioning is going to result in collisions (otherwise, you'd only have a single row in each partition). If you could ignore the above problems as well as the limitations on functions used in partitioning expressions, you could create a HASH-partitioned table with:

CREATE TABLE foobar (
    ...
) ENGINE=innodb
  PARTITION BY HASH (school_id + course_id + ORD(ssname))
  PARTITIONS 2
;

What should work is:

CREATE TABLE foobar (
    id         int UNSIGNED NOT NULL AUTO_INCREMENT,
    school_id  int UNSIGNED NOT NULL,
    course_id  int UNSIGNED NOT NULL,
    ssname     varchar(64) NOT NULL,

    /* some other fields */

    PRIMARY KEY (id, school_id, course_id),
    INDEX idx_fb_si (school_id),
    INDEX idx_fb_ci (course_id),
    CONSTRAINT UNIQUE INDEX idx_fb_scs (school_id,course_id,ssname)
) ENGINE=innodb
      PARTITION BY HASH (school_id + course_id)
      PARTITIONS 2
;

or:

CREATE TABLE foobar (
    id         int UNSIGNED NOT NULL AUTO_INCREMENT,
    school_id  int UNSIGNED NOT NULL,
    course_id  int UNSIGNED NOT NULL,
    ssname     varchar(64) NOT NULL,

    /* some other fields */

    PRIMARY KEY (id, school_id, course_id, ssname),
    INDEX idx_fb_si (school_id),
    INDEX idx_fb_ci (course_id),
    CONSTRAINT UNIQUE INDEX idx_fb_scs (school_id,course_id,ssname)
) ENGINE=innodb
      PARTITION BY KEY (school_id, course_id, ssname)
      PARTITIONS 2
;

As for the files that store tables, MySQL will create them, though it may do so when you define the table rather than when rows are inserted into it. You don't need to worry about how MySQL manages files. Remember, there are a limited number of partitions, set when you create the table by the PARTITIONS *n* clause.
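
If you want to see this for yourself, something like the following sketch should do. It assumes one of the partitioned foobar definitions above has been created in the current database, that the elided "other fields" have defaults or allow NULL, and the inserted values are made up for illustration:

```sql
-- Insert rows, including a school_id and course_id never seen before.
INSERT INTO foobar (school_id, course_id, ssname)
VALUES (1, 1, 'Bush'),
       (2, 1, 'Blair'),
       (987654, 42, 'Hussein');

-- List the partitions and their row counts (TABLE_ROWS is only an
-- estimate for InnoDB). New key values land in the existing partitions;
-- no additional partition shows up.
SELECT PARTITION_NAME, TABLE_ROWS
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'foobar';

-- The partition count changes only when you ask for it explicitly,
-- e.g. adding two more partitions to a HASH- or KEY-partitioned table:
ALTER TABLE foobar ADD PARTITION PARTITIONS 2;
```

With plain HASH partitioning the target partition is the expression value modulo the number of partitions, so any new school_id or course_id always maps to a partition that already exists. KEY partitioning uses MySQL's internal hashing function instead, but the same reasoning applies.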
