Postgres: INSERT if the row does not already exist
I'm using Python to write to a Postgres database:
sql_string = "INSERT INTO hundred (name,name_slug,status) VALUES ("
sql_string += hundred + ", '" + hundred_slug + "', " + status + ");"
cursor.execute(sql_string)
But because some of my rows are identical, I get the following error:
psycopg2.IntegrityError: duplicate key value
violates unique constraint "hundred_pkey"
How can I write an 'INSERT unless this row already exists' SQL statement?
I've seen complex statements like this recommended:
IF EXISTS (SELECT * FROM invoices WHERE invoiceid = '12345')
UPDATE invoices SET billed = 'TRUE' WHERE invoiceid = '12345'
ELSE
INSERT INTO invoices (invoiceid, billed) VALUES ('12345', 'TRUE')
END IF
But firstly, is this overkill for what I need, and secondly, how can I execute one of those as a simple string?
Postgres 9.5 (released on 2016-01-07) offers an "upsert" command, also known as the ON CONFLICT clause of INSERT.
It solves many of the subtle problems you can run into under concurrent operation with the approaches some other answers propose.
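As a minimal sketch of what that could look like from Python: the helper name insert_if_absent is made up for illustration, the table and columns are the ones from the question, and the cursor is assumed to be a psycopg2 cursor (any DB-API cursor using %s placeholders would do). Note that the values are passed as parameters instead of being concatenated into the string.

```python
# Sketch only: assumes Postgres >= 9.5 and a unique or primary key
# constraint on hundred that the duplicate rows would violate.
UPSERT_SQL = (
    "INSERT INTO hundred (name, name_slug, status) "
    "VALUES (%s, %s, %s) "
    "ON CONFLICT DO NOTHING"
)


def insert_if_absent(cursor, name, name_slug, status):
    """Insert one row; return True if a row was actually inserted."""
    # The driver quotes the parameters, so no manual string building.
    cursor.execute(UPSERT_SQL, (name, name_slug, status))
    # rowcount is 0 when the conflicting row was silently skipped.
    return cursor.rowcount == 1
```

With ON CONFLICT DO NOTHING the duplicate row is skipped without raising an IntegrityError, so the surrounding transaction stays usable.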
There is a nice way of doing a conditional INSERT in PostgreSQL, using a NOT EXISTS anti-semi-join.
CAVEAT: This approach is not 100% reliable for concurrent write operations, though. There is a very tiny race condition between the SELECT in the NOT EXISTS anti-semi-join and the INSERT itself. It can fail under such conditions.
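A rough sketch of such a conditional INSERT, driven from Python. It assumes that name is the column behind the hundred_pkey constraint (the question does not say which column it is), and insert_if_missing is a hypothetical helper name.

```python
# Sketch only: "name" being the unique column is an assumption.
INSERT_IF_MISSING_SQL = """
INSERT INTO hundred (name, name_slug, status)
SELECT %s, %s, %s
WHERE NOT EXISTS (
    SELECT 1 FROM hundred WHERE name = %s
)
"""


def insert_if_missing(cursor, name, name_slug, status):
    """Insert the row unless one with the same name exists already."""
    # name is passed twice: once as a value, once for the existence check.
    cursor.execute(INSERT_IF_MISSING_SQL, (name, name_slug, status, name))
    return cursor.rowcount == 1
```

Because of the race condition mentioned above, two concurrent sessions can both pass the NOT EXISTS check, so the unique constraint is still needed as a safety net.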
One approach would be to create an unconstrained table (no unique indexes), insert all your data into it, and then do a SELECT DISTINCT from that table to insert into your hundred table.
At a high level, the steps are listed below. I assume all three columns are distinct in my example, so for the main table insert, change the NOT EXISTS join to join only on the unique columns of the hundred table.
1. Create a temporary table. See the docs.
2. Insert the data into the temp table.
3. Add any indexes to the temp table.
4. Do the main table insert.
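The steps above could be sketched like this. The staging table name hundred_staging, the helper name load_via_staging, and the choice of name as the unique column are all assumptions made for illustration.

```python
def load_via_staging(cursor, rows):
    """Load rows through an unconstrained temp table; return rows inserted."""
    # Create a temp table with the same columns as hundred. LIKE does
    # not copy indexes or unique constraints by default.
    cursor.execute(
        "CREATE TEMPORARY TABLE hundred_staging (LIKE hundred) ON COMMIT DROP"
    )
    # Insert all data, duplicates included, into the temp table.
    cursor.executemany(
        "INSERT INTO hundred_staging (name, name_slug, status) "
        "VALUES (%s, %s, %s)",
        rows,
    )
    # Index the column used in the NOT EXISTS lookup.
    cursor.execute("CREATE INDEX ON hundred_staging (name)")
    # Main table insert. DISTINCT only removes fully identical rows, so
    # two rows sharing a name but differing elsewhere would still collide.
    cursor.execute(
        "INSERT INTO hundred (name, name_slug, status) "
        "SELECT DISTINCT s.name, s.name_slug, s.status "
        "FROM hundred_staging s "
        "WHERE NOT EXISTS (SELECT 1 FROM hundred h WHERE h.name = s.name)"
    )
    return cursor.rowcount
```

ON COMMIT DROP removes the staging table when the transaction commits, so the function is meant to run inside a single transaction.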
This is exactly the problem I faced, and my version is 9.5.
I solved it with the SQL query below.
I hope this helps someone who has the same issue on version >= 9.5.
Thanks for reading.
Unfortunately, PostgreSQL versions before 9.5 support neither MERGE nor ON DUPLICATE KEY UPDATE, so there you'll have to do it in two statements:
You can wrap it into a function:
and just call it:
You can make use of VALUES, which is available in Postgres:
I know this question is from a while ago, but I thought this might help someone. I think the easiest way to do this is via a trigger. E.g.:
Execute this code from a psql prompt (or however you like to execute queries directly on the database). Then you can insert as normal from Python. E.g.:
Note that, as @Thomas_Wouters already mentioned, the code above takes advantage of parameters rather than concatenating the string.
There is a nice way of doing a conditional INSERT in PostgreSQL using a WITH query:
Like:
We can simplify the query using upsert.
INSERT .. WHERE NOT EXISTS is a good approach. And race conditions can be avoided by a transaction "envelope":
You can also use MERGE, introduced in Postgres 15.
Nevertheless, it is not as straightforward as INSERT … ON CONFLICT DO NOTHING.
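For comparison, a MERGE that only inserts missing rows might look roughly like this when sent from Python. It assumes Postgres 15 or later and that name is the unique column, and merge_row is a hypothetical helper name.

```python
# Sketch only: requires Postgres >= 15; "name" as the match key is assumed.
MERGE_SQL = """
MERGE INTO hundred AS h
USING (VALUES (%s, %s, %s)) AS s (name, name_slug, status)
ON h.name = s.name
WHEN NOT MATCHED THEN
    INSERT (name, name_slug, status)
    VALUES (s.name, s.name_slug, s.status)
"""


def merge_row(cursor, name, name_slug, status):
    """Run the MERGE for one row; matched rows are left untouched."""
    cursor.execute(MERGE_SQL, (name, name_slug, status))
```

Since there is no WHEN MATCHED clause, an existing row is left as it is, which mirrors ON CONFLICT DO NOTHING at the cost of a longer statement.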
It's easy with rules:
But it fails with concurrent writes ...
The approach with the most upvotes (from John Doe) does somehow work for me, but in my case I get only 180 of the expected 422 rows.
I couldn't find anything wrong and there are no errors at all, so I looked for a different simple approach.
Using IF NOT FOUND THEN after a SELECT works perfectly for me (it is described in the PostgreSQL documentation).
Example from the documentation:
psycopg's cursor class has the attribute rowcount.
So you could try an UPDATE first and INSERT only if rowcount is 0.
But depending on activity levels in your database, you may hit a race condition between the UPDATE and the INSERT, where another process creates that record in the interim.
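A small sketch of that idea, assuming name identifies the row and the other two columns are the ones to refresh; update_or_insert is a made-up helper name.

```python
def update_or_insert(cursor, name, name_slug, status):
    """Try UPDATE first; INSERT only when no existing row was touched."""
    cursor.execute(
        "UPDATE hundred SET name_slug = %s, status = %s WHERE name = %s",
        (name_slug, status, name),
    )
    if cursor.rowcount == 0:
        # No row matched the UPDATE, so the row is (probably) new.
        # Another session could still insert it right here, which is
        # the race condition described above.
        cursor.execute(
            "INSERT INTO hundred (name, name_slug, status) "
            "VALUES (%s, %s, %s)",
            (name, name_slug, status),
        )
        return "inserted"
    return "updated"
```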
Your column "hundred" seems to be defined as the primary key and therefore must be unique, which is not the case. The problem isn't with the INSERT statement, it is with your data.
I suggest you add an id column of serial type to handle the primary key.
If you say that many of your rows are identical, you will end up checking many times. Instead, you can send them all and let the database decide whether to insert each one, using the ON CONFLICT clause as follows:
In Postgres version 9.5 or higher you can use ON CONFLICT to avoid constraint errors, as @Arie mentioned above. For more options related to this INSERT query, refer to the Postgres docs.
An alternative solution is to use try/catch to handle runtime errors.
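In Python the try/catch variant would be a try/except. Here is a sketch, with insert_or_skip as a hypothetical helper and the exception class passed in by the caller, so the snippet itself does not depend on a particular driver.

```python
INSERT_SQL = "INSERT INTO hundred (name, name_slug, status) VALUES (%s, %s, %s)"


def insert_or_skip(conn, row, duplicate_error):
    """Insert row and commit; on a duplicate, roll back and return False."""
    cur = conn.cursor()
    try:
        cur.execute(INSERT_SQL, row)
    except duplicate_error:
        # Postgres aborts the current transaction after an error, so it
        # must be rolled back before the connection can be used again.
        conn.rollback()
        return False
    else:
        conn.commit()
        return True
    finally:
        cur.close()
```

With psycopg2 one would call it as insert_or_skip(conn, (name, name_slug, status), psycopg2.IntegrityError), which is the exception shown in the question. The rollback also discards any other uncommitted work in the same transaction, so this fits best when each row is committed on its own.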
I was looking for a similar solution, trying to find SQL that would work in PostgreSQL as well as HSQLDB. (HSQLDB was what made this difficult.) Using your example as a basis, this is the format that I found elsewhere.
Here is a generic Python function that, given a table name, columns, and values, generates the upsert equivalent for PostgreSQL.
import json
The solution is simple, but not immediate.
If you want to use this instruction, you must make one change to the db:
After this change, "INSERT" will work correctly.