当 MySQL 或 PostgreSQL 中有 TEXT 字段时,是否应该将其放在单独的表中?

发布于 2024-10-21 19:37:19 字数 107 浏览 1 评论 0原文

我听说,如果您有一个包含 TEXT 列的表,该表将保存大量文本数据,那么将该列移动到一个单独的表中并通过 JOIN 与基本记录获取它会更好地提高性能。

这是真的吗?如果是的话,为什么?

I've heard that if you have a table with a TEXT column that will hold a large chunk of text data, it's better for performance to move that column into a separate table and get it via JOINs to the base record.

Is this true, and if so why?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

梦里°也失望 2024-10-28 19:37:19

PostgreSQL 则不然,来自手册

非常长的值也存储在后台表中,这样它们就不会干扰对较短列值的快速访问。

因此,大字符列(例如没有指定大小限制的 TEXTVARCHAR)存储在远离主表数据的地方。因此,PostgreSQL 内置了“将其放入单独的表中”优化功能。如果您使用 PostgreSQL,请明智地安排表并将数据布局留给 PostgreSQL。

我不知道 MySQL 或其他 RDBM 如何排列它们的数据。

这种优化背后的原因是,数据库通常会将每行的数据保留在磁盘上的连续块中,以减少需要读取或更新该行时的查找次数。如果一行中有一个 TEXT(或其他可变长度类型)列,则该行的大小是可变的,因此需要更多的工作来进行行与行之间的转换。打个比方,访问链表中的某些内容与访问数组之间的区别;对于链表,您必须一次读取三个元素才能到达第四个元素,而对于数组,您只需从开头偏移 3 * element_size 字节即可,一步到位。

Not with PostgreSQL, from the manual:

Very long values are also stored in background tables so that they do not interfere with rapid access to shorter column values.

So a large character column (such as TEXT or VARCHAR without a specified size limit) is stored away from the main table data. So, PostgreSQL has your "put it in a separate table" optimization built in. If you're using PostgreSQL, arrange your table sensibly and leave the data layout to PostgreSQL.

I don't know how MySQL or other RDBMs arrange their data.

The reason behind this optimization is that the database will usually keep the data for each row in contiguous blocks on disk to cut down on seeking when the row needs to be read or updated. If you have a TEXT (or other variable length type) column in a row then the size of the row is variable so more work is needed to go from row to row. An analogy would be the difference between accessing something in a linked list versus accessing an array; with a linked list, you have to read three elements one at a time to get to the fourth one, with an array you just offset 3 * element_size bytes from the beginning and you're there in one step.

花开半夏魅人心 2024-10-28 19:37:19

来自 MySQL 手册

对于具有多列的表,
减少查询的内存需求
不使用 BLOB 列,
考虑拆分 BLOB 列
放入单独的表并引用
需要时使用连接查询。

From the MySQL Manual:

For a table with several columns, to
reduce memory requirements for queries
that do not use the BLOB column,
consider splitting the BLOB column
into a separate table and referencing
it with a join query when needed.

醉城メ夜风 2024-10-28 19:37:19

在某些情况下,这可能是正确的。原因是,假设您的表是:

create table foo (
    id serial primary key,
    title varchar(200) not null,
    pub_date datetime not null,
    text_content text
);

然后您执行如下查询:

select id, title, pub_date
  from foo;

如果其中没有 text_content 字段,您将必须从磁盘加载更多的页面桌子。查询优化主要是将磁盘 I/O 减少到尽可能少。

In some scenarios, this might be true. The reason is that let's say your table is:

create table foo (
    id serial primary key,
    title varchar(200) not null,
    pub_date datetime not null,
    text_content text
);

Then you do a query like this:

select id, title, pub_date
  from foo;

You will have to load much more pages from disk that you would have if you didn't have the text_content field in that table. And query optimization is most about reducing disk I/O to the minimum possible.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文