需要更新包含数百万行的 MySQL 表
我这里有两个表:
Cities
Region| City Name
States
ID| State | Region_Key
我需要对城市表进行更新查询,如下所示,即设置 cars.region = statres.id 其中 states.region_key = cars.region
问题是城市数据库有超过 270 万条记录,我尝试做这样的查询只是为了 mysql 挂起并消失。
update cities c, states c set c.region = s.id where c.region = s.region_key
编辑===================
这是我正在使用的sql,但它不起作用我收到一条错误,指出 UPDATE 和 LIMIT 的使用不正确
update cities w, states s
set w.region_id = s.id,
w.updated = 1
where w.region = s.w_code and w.updated = 0
LIMIT 10000
I have two table shere:
Cities
Region| City Name
States
ID| State | Region_Key
I need to do an update query on the cities table like so i.e set cities.region = statres.id where states.region_key = cities.region
The problem is that the cities database has over 2.7 million records and I tried doing a query like this only for mysql to hang and die out.
update cities c, states c set c.region = s.id where c.region = s.region_key
EDIT ===================
This is the sql I am using but its not working I get an error saying incorrect usage of UPDATE and LIMIT
update cities w, states s
set w.region_id = s.id,
w.updated = 1
where w.region = s.w_code and w.updated = 0
LIMIT 10000
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
cities
表Set c.HasBeenUpdated = 1
添加到更新子句AND c.HasBeenUpdated IS NULL
WHERE
条件AND c.ID in (Select ID from Cities where HasBeenUpdated Is Null Limit 10000)
。这是必需的,因为您不能在多表Update
上使用Limit
语句(来源)。这还假设您有一个 ID 列作为城市的 PK(如果没有,则考虑添加一列)。现在,更新语句一次只会处理 10,000 行(并且只会处理未处理的行)。如果您可以使用应用程序逻辑将其放入循环中,则可以将其用于自动化。根据您的需要更改限制数量,完成后,删除
HasBeenUpdated
列。这应该允许您最大限度地减少更新对表和数据库的影响,并允许您以可管理的批次在整个表中执行更新。
编辑:更新步骤4以过滤要通过子查询更新的行,因为Limit语句不能用于多表更新。
cities
tableSet c.HasBeenUpdated = 1
to the update clauseAND c.HasBeenUpdated IS NULL
WHERE
conditionAND c.ID in (Select ID from Cities where HasBeenUpdated Is Null Limit 10000)
. This is needed because you cannot use aLimit
statement on a multi-tableUpdate
(source). This also presumes that you have an ID column as the PK for cities (if not, then consider adding one). Now the update statement will only process 10,000 rows at a time (and will only process unprocessed rows).If you can put this in a loop using your application logic, then this can be used for automation. Change the limit number based on your needs and when it is done, remove the
HasBeenUpdated
column.This should allow you to minimize the impact of the update on the table and database, and allow you to perform it across the whole table in manageable batches.
Edit: Update step 4 to filter rows to be updated via subquery, since a Limit statement cannot be used on a multi-table Update.
使用 SELECT INTO NEW_TABLE 创建一个包含所需内容的新表,然后删除/重命名旧表并使用 RENAME TABLE 将新创建的表重命名为正确的名称:
Use
SELECT INTO NEW_TABLE
to create a new table with desired content, and then drop/rename older table and useRENAME TABLE
to rename newely created table into proper name: