Why is expanding a varchar column so slow?

Posted 2024-10-17 08:45:18

Hi
We need to modify a column of a big production table. Normally a DDL statement like this executes very quickly, but the statement shown below takes about 10 minutes, and I don't know why. I just want to expand a varchar column. Here are the details:

--table size
wapreader_log=> select pg_size_pretty(pg_relation_size('log_foot_mark'));
 pg_size_pretty 
----------------
 5441 MB
(1 row)


--table ddl
wapreader_log=> \d log_foot_mark
          Table "wapreader_log.log_foot_mark"
   Column    |            Type             | Modifiers 
-------------+-----------------------------+-----------
 id          | integer                     | not null
 create_time | timestamp without time zone | 
 sky_id      | integer                     | 
 url         | character varying(1000)     | 
 refer_url   | character varying(1000)     | 
 source      | character varying(64)       | 
 users       | character varying(64)       | 
 userm       | character varying(64)       | 
 usert       | character varying(64)       | 
 ip          | character varying(32)       | 
 module      | character varying(64)       | 
 resource_id | character varying(100)      | 
 user_agent  | character varying(128)      | 
Indexes:
    "pk_log_footmark" PRIMARY KEY, btree (id)


--alter column
wapreader_log=> \timing
Timing is on.

wapreader_log=>  ALTER TABLE wapreader_log.log_foot_mark ALTER column user_agent TYPE character varying(256); 
ALTER TABLE
Time: 603504.835 ms    

Comments (5)

欢烬 2024-10-24 08:45:18

ALTER ... TYPE requires a complete table rewrite; that's why it can take some time to complete on large tables. If you don't need a length constraint, then don't use the constraint. Drop these constraints once and for all, and you will never run into new problems because of obsolete constraints. Just use TEXT or VARCHAR.
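
For example, a minimal sketch of that advice on the table above, assuming you are willing to drop the length limit entirely (on the old releases this thread dates from, this conversion itself still rewrites the table one last time):

-- Switch to an unconstrained type once, so no future "widen the limit" DDL is ever needed.
ALTER TABLE wapreader_log.log_foot_mark ALTER COLUMN user_agent TYPE text;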

梦毁影碎の 2024-10-24 08:45:18

When you alter a table, PostgreSQL has to make sure the old version doesn't go away in some cases, to allow rolling back the change if the server crashes before it's committed and/or written to disk. For those reasons, what it actually does here even on what seems to be a trivial change is write out a whole new copy of the table somewhere else first. When that's finished, it then swaps over to the new one. Note that when this happens, you'll need enough disk space to hold both copies as well.

There are some types of DDL changes that can be made without making a second copy of the table, but this is not one of them. For example, you can add a new column that defaults to NULL quickly. But adding a new column with a non-NULL default requires making a new copy instead.
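
A small sketch of that distinction against the example table (the added column names are hypothetical, and PostgreSQL 11 later made constant non-NULL defaults cheap as well):

-- Catalog-only change: existing rows simply read the new column as NULL, no rewrite.
ALTER TABLE wapreader_log.log_foot_mark ADD COLUMN note character varying(64);

-- On the versions discussed here, filling in a non-NULL default touches every row,
-- so this one rewrites the whole table.
ALTER TABLE wapreader_log.log_foot_mark ADD COLUMN flag integer NOT NULL DEFAULT 0;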

如何视而不见 2024-10-24 08:45:18

One way to avoid a table rewrite is to use SQL domains (see CREATE DOMAIN) instead of varchars in your table. You can then add and remove constraints on a domain.

Note that this does not work instantly either, since all tables using the domain are checked for constraint validity, but it is less expensive than full table rewrite and it doesn't need the extra disk space.
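
A minimal sketch of the domain approach (the domain and constraint names below are made up):

-- The length rule lives on the domain, not on the column type.
CREATE DOMAIN user_agent_t AS text
    CONSTRAINT user_agent_len CHECK (char_length(VALUE) <= 128);

-- Relaxing the limit later is an ALTER DOMAIN instead of a table rewrite;
-- adding the new constraint still scans every table that uses the domain.
ALTER DOMAIN user_agent_t DROP CONSTRAINT user_agent_len;
ALTER DOMAIN user_agent_t ADD CONSTRAINT user_agent_len CHECK (char_length(VALUE) <= 256);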

与往事干杯 2024-10-24 08:45:18

Not sure if this is any faster, but it may be; you will have to test it out.
Try this until PostgreSQL can handle the type of ALTER you want without rewriting the entire stinking table.

ALTER TABLE log_foot_mark RENAME refer_url TO refer_url_old;
ALTER TABLE log_foot_mark ADD COLUMN refer_url character varying(256);

Then, using the indexed primary key or unique key of the table, do a looping batched update. I think you will have to do this via Perl or some language in which you can commit on every loop iteration.

WHILE (end < MAX_RECORDS) LOOP

BEGIN TRANSACTION;
UPDATE log_foot_mark
SET refer_url = refer_url_old
WHERE id >= start AND id <= end;

COMMIT TRANSACTION;
END LOOP;

ALTER TABLE log_foot_mark DROP COLUMN refer_url_old;

Keep in mind that the loop logic will need to be in something other than PL/pgSQL to get it to commit on every loop iteration. Test it with no loop at all, and then with batch sizes of 10k, 20k, 30k, etc. until you find the sweet spot.
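
For what it is worth, on PostgreSQL 11 or newer the same batching idea can live inside the database, because procedures may COMMIT between iterations; below is a sketch under that assumption (the procedure name and batch size are illustrative):

CREATE PROCEDURE backfill_refer_url(batch_size integer DEFAULT 20000)
LANGUAGE plpgsql
AS $$
DECLARE
    max_id integer;
    cur_id integer := 0;
BEGIN
    SELECT max(id) INTO max_id FROM log_foot_mark;
    WHILE cur_id <= max_id LOOP
        UPDATE log_foot_mark
           SET refer_url = refer_url_old
         WHERE id > cur_id AND id <= cur_id + batch_size;
        COMMIT;                      -- one transaction per batch
        cur_id := cur_id + batch_size;
    END LOOP;
END;
$$;

CALL backfill_refer_url(20000);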

杀お生予夺 2024-10-24 08:45:18

This issue is no longer valid.

Since at least Postgres v7.2.8 (2005), varchar(n) fields have been stored as 4 bytes plus the actual compressed string (https://www.postgresql.org/docs/7.2/datatype-character.html). Later versions also introduced a short format stored as 1 byte plus the actual string. So, unlike character(n), the maximum length of a varchar(n) field does not affect its actual storage, and increasing the maximum size of a varchar(n) field therefore does not require a table rewrite.

It may be that at the time of the original post Postgres did (needlessly?) save a copy of the table when the maximum size of a varchar(n) was altered. However, since at least Postgres v9.3 (2013) it does not.
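
One way to check this on your own server (the throwaway table name is made up): build a reasonably large test table and time the widening ALTER in psql; on releases that skip the rewrite it should return almost immediately.

-- ~1M rows with a varchar(128) column
CREATE TABLE widen_test AS
SELECT g AS id, repeat('x', 100)::character varying(128) AS user_agent
FROM generate_series(1, 1000000) AS g;

\timing on
ALTER TABLE widen_test ALTER COLUMN user_agent TYPE character varying(256);
\timing off

DROP TABLE widen_test;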
