I have a question about the ALTER TABLE command on a really large table (almost 30 million rows).
One of its columns is a varchar(255) and I would like to resize it to a varchar(40).
Basically, I would like to change my column by running the following command:
ALTER TABLE mytable ALTER COLUMN mycolumn TYPE varchar(40);
I have no problem if the process takes a long time, but it seems my table is no longer readable during the ALTER TABLE command.
Is there a smarter way? Maybe add a new column, copy values from the old column, drop the old column and finally rename the new one?
Note: I use PostgreSQL 9.0.
Comments (9)
In PostgreSQL 9.1 there is an easier way
http://www.postgresql.org/message-id/[email protected]
There's a description of how to do this at Resize a column in a PostgreSQL table without changing data. You have to hack the database catalog data. The only way to do this officially is with ALTER TABLE, and as you've noted that change will lock and rewrite the entire table while it's running.
Make sure you read the Character Types section of the docs before changing this. There are all sorts of weird cases to be aware of here. The length check is done when values are stored into the rows. If you hack a lower limit in there, that will not reduce the size of existing values at all. You would be wise to scan the whole table for rows where the length of the field is >40 characters after making the change. You'll need to figure out how to truncate those manually (so you're back to taking some locks, but only on the oversize rows), because if someone tries to update anything on such a row, it will now be rejected as too big at the point it goes to store the new version of the row. Hilarity ensues for the user.
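The scan described above could be sketched roughly as follows. This assumes the table and column names from the question (mytable, mycolumn) and a target length of 40; the truncation rule (keep the first 40 characters) is only an illustration, not something the answer prescribes.

```sql
-- Count the rows whose value would no longer fit in 40 characters.
SELECT count(*) AS oversize_rows
FROM mytable
WHERE char_length(mycolumn) > 40;

-- Assumed policy: keep only the first 40 characters of oversize values.
-- Only the matching rows are updated, so only those rows are locked.
UPDATE mytable
SET mycolumn = substr(mycolumn, 1, 40)
WHERE char_length(mycolumn) > 40;
```

Whether truncating is acceptable depends on the data; reviewing the oversize rows before running the UPDATE is probably the safer order.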
VARCHAR is a terrible type that exists in PostgreSQL only to comply with its associated terrible part of the SQL standard. If you don't care about multi-database compatibility, consider storing your data as TEXT and adding a constraint to limit its length. You can change constraints around without this table lock/rewrite problem, and they can do more integrity checking than just the weak length check.
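A minimal sketch of the TEXT-plus-constraint idea, assuming a hypothetical demo table (mytable_text_demo) and a hypothetical constraint name (mycolumn_max_len); neither comes from the answer.

```sql
-- TEXT column whose length is limited by a named CHECK constraint.
CREATE TABLE mytable_text_demo (
    id       serial PRIMARY KEY,
    mycolumn text,
    CONSTRAINT mycolumn_max_len CHECK (char_length(mycolumn) <= 255)
);

-- Tightening the limit later means swapping the constraint,
-- not changing the column type.
BEGIN;
ALTER TABLE mytable_text_demo DROP CONSTRAINT mycolumn_max_len;
ALTER TABLE mytable_text_demo
    ADD CONSTRAINT mycolumn_max_len CHECK (char_length(mycolumn) <= 40);
COMMIT;
```

The ADD CONSTRAINT step still checks the existing rows, so it fails if any stored value is longer than the new limit.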
Ok, I'm probably late to the party, BUT...
THERE'S NO NEED TO RESIZE THE COLUMN IN YOUR CASE!
Postgres, unlike some other databases, is smart enough to only use just enough space to fit the string (even using compression for longer strings), so even if your column is declared as VARCHAR(255) - if you store 40-character strings in the column, the space usage will be 40 bytes + 1 byte of overhead.
(http://www.postgresql.org/docs/9.0/interactive/datatype-character.html)
The size specification in VARCHAR is only used to check the size of the values that are inserted; it does not affect the disk layout. In fact, VARCHAR and TEXT fields are stored in the same way in Postgres.
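One way to check this claim yourself is to compare the stored size of the same string under different declarations. The sketch below uses a hypothetical temporary table (size_demo); pg_column_size is the built-in function that reports the number of bytes used to store a value.

```sql
-- Same 40-character value stored under three different declarations.
CREATE TEMP TABLE size_demo (
    a varchar(255),
    b varchar(40),
    c text
);

INSERT INTO size_demo
VALUES (repeat('x', 40), repeat('x', 40), repeat('x', 40));

-- If the declared length does not affect storage,
-- the three sizes should be identical.
SELECT pg_column_size(a) AS varchar_255,
       pg_column_size(b) AS varchar_40,
       pg_column_size(c) AS as_text
FROM size_demo;
```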
I was facing the same problem trying to truncate a VARCHAR from 32 to 8 and getting the ERROR: value too long for type character varying(8). I want to stay as close to SQL as possible because I'm using a self-made JPA-like structure that we might have to switch to a different DBMS according to the customer's choice (PostgreSQL being the default one). Hence, I don't want to use the trick of altering system tables.
I ended up using the USING clause of ALTER TABLE.
As @raylu noted, ALTER acquires an exclusive lock on the table, so all other operations will be delayed until it completes.
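The statement from the USING answer is not preserved in this copy of the thread. A sketch of what an ALTER TABLE with a USING clause can look like, assuming the question's table and column names and the answer's target length of 8:

```sql
-- USING supplies the expression that converts each existing value,
-- here by keeping only the first 8 characters (an assumed rule).
ALTER TABLE mytable
    ALTER COLUMN mycolumn TYPE varchar(8)
    USING substr(mycolumn, 1, 8);
```

Without the USING expression, values longer than 8 characters trigger the "value too long" error quoted in that answer.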
If you put the ALTER into a transaction, the table should not be locked.
This worked blazing fast for me: a few seconds on a table with more than 400k rows.
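The snippet for the transaction answer is also missing here. Presumably it wrapped the question's statement in an explicit transaction, along these lines (a guess at the shape, not the poster's exact code):

```sql
BEGIN;
ALTER TABLE mytable ALTER COLUMN mycolumn TYPE varchar(40);
COMMIT;
```

Note that ALTER TABLE ... ALTER COLUMN ... TYPE still takes its exclusive lock inside a transaction and holds it until COMMIT or ROLLBACK, so the transaction mainly makes the change easy to roll back; it is worth testing on a copy before relying on it for a 30-million-row table.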
Adding a new column and swapping it in for the old one worked for me on Redshift PostgreSQL; see this link for more details: https://gist.github.com/mmasashi/7107430
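The column-swap approach can be sketched as follows, using the question's table and column names. The name mycolumn_new is hypothetical, and the sketch assumes no existing value is longer than 40 characters.

```sql
-- 1. Add the narrower column next to the old one.
ALTER TABLE mytable ADD COLUMN mycolumn_new varchar(40);

-- 2. Copy the values across. If any value exceeds 40 characters,
--    this UPDATE fails with a "value too long" error.
UPDATE mytable SET mycolumn_new = mycolumn;

-- 3. Drop the old column and give the new one its name.
ALTER TABLE mytable DROP COLUMN mycolumn;
ALTER TABLE mytable RENAME COLUMN mycolumn_new TO mycolumn;
```

Indexes, defaults, and constraints defined on the old column are not carried over and would need to be recreated on the new one.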
Here's the cache of the page described by Greg Smith. In case that dies as well, the alter statement looks like this:
Where your table is TABLE1, the column is COL1 and you want to set it to 35 characters (the +4 is needed for legacy purposes according to the link, possibly the overhead referred to by A.H. in the comments).
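The statement itself is missing from this copy. Based on the description (TABLE1, COL1, 35 characters, +4), it was most likely an update of the column's type modifier in the system catalog, roughly like this:

```sql
-- Unsupported catalog hack: edits pg_attribute directly (superuser only).
-- For varchar(n), atttypmod holds n plus 4 bytes of header overhead.
UPDATE pg_attribute
SET atttypmod = 35 + 4
WHERE attrelid = 'TABLE1'::regclass
  AND attname = 'COL1';
```

TABLE1 and COL1 are the placeholders from the answer. Existing values are not checked or changed by this update, so the scan for oversize rows recommended earlier in the thread still applies.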
Try running the following ALTER TABLE:
I have found a very easy way to change the size: the annotation @Size(min = 1, max = 50), which is part of import javax.validation.constraints.Size.
When executing this in Hibernate, you get in pgAdmin III