当 MySQL 或 PostgreSQL 中有 TEXT 字段时,是否应该将其放在单独的表中?
我听说,如果您有一个包含 TEXT 列的表,该表将保存大量文本数据,那么将该列移动到一个单独的表中并通过 JOIN 与基本记录获取它会更好地提高性能。
这是真的吗?如果是的话,为什么?
I've heard that if you have a table with a TEXT column that will hold a large chunk of text data, it's better for performance to move that column into a separate table and get it via JOINs to the base record.
Is this true, and if so why?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
PostgreSQL 则不然,来自手册:
因此,大字符列(例如没有指定大小限制的
TEXT
或VARCHAR
)存储在远离主表数据的地方。因此,PostgreSQL 内置了“将其放入单独的表中”优化功能。如果您使用 PostgreSQL,请明智地安排表并将数据布局留给 PostgreSQL。我不知道 MySQL 或其他 RDBM 如何排列它们的数据。
这种优化背后的原因是,数据库通常会将每行的数据保留在磁盘上的连续块中,以减少需要读取或更新该行时的查找次数。如果一行中有一个 TEXT(或其他可变长度类型)列,则该行的大小是可变的,因此需要更多的工作来进行行与行之间的转换。打个比方,访问链表中的某些内容与访问数组之间的区别;对于链表,您必须一次读取三个元素才能到达第四个元素,而对于数组,您只需从开头偏移
3 * element_size
字节即可,一步到位。Not with PostgreSQL, from the manual:
So a large character column (such as
TEXT
orVARCHAR
without a specified size limit) is stored away from the main table data. So, PostgreSQL has your "put it in a separate table" optimization built in. If you're using PostgreSQL, arrange your table sensibly and leave the data layout to PostgreSQL.I don't know how MySQL or other RDBMs arrange their data.
The reason behind this optimization is that the database will usually keep the data for each row in contiguous blocks on disk to cut down on seeking when the row needs to be read or updated. If you have a TEXT (or other variable length type) column in a row then the size of the row is variable so more work is needed to go from row to row. An analogy would be the difference between accessing something in a linked list versus accessing an array; with a linked list, you have to read three elements one at a time to get to the fourth one, with an array you just offset
3 * element_size
bytes from the beginning and you're there in one step.来自 MySQL 手册:
From the MySQL Manual:
在某些情况下,这可能是正确的。原因是,假设您的表是:
然后您执行如下查询:
如果其中没有
text_content
字段,您将必须从磁盘加载更多的页面桌子。查询优化主要是将磁盘 I/O 减少到尽可能少。In some scenarios, this might be true. The reason is that let's say your table is:
Then you do a query like this:
You will have to load much more pages from disk that you would have if you didn't have the
text_content
field in that table. And query optimization is most about reducing disk I/O to the minimum possible.