Python/psycopg2 中优雅的主键错误处理
使用 Python 2.7 和
[150]:psycopg2.版本 Out[150]: '2.4.2 (dt dec pq3 ext)'
我有一个简单的 python 脚本,用于处理事务并将数据写入数据库。偶尔会有一个插入违反了我的主键。这很好,我只是希望它忽略该记录并继续快乐的方式。我遇到的问题是 psycopg2 主键错误正在中止整个事务块,并且错误后的所有插入都会失败。这是一个示例错误,
ERROR: duplicate key value violates unique constraint "encounter_id_pkey"
DETAIL: Key (encounter_id)=(9012235) already exists.
这是在下一个插入中。不构成违规。
Inserting: 0163168~9024065
ERROR: current transaction is aborted, commands ignored until end of transaction block
每次插入都会重复第二个错误。这是一个简化的循环。我正在循环遍历 pandas 数据框,但它可以是任何循环。
conn = psycopg2.connect("dbname='XXXX' user='XXXXX' host='XXXX' password='XXXXX'")
cur = conn.cursor()
for i, val in df2.iteritems():
try:
cur = conn.cursor()
cur.execute("""insert into encounter_id_table (
encounter_id,current_date )
values
(%(create_date)s, %(encounter_id)s ) ;""",
'encounter_id':i.split('~')[1],
'create_date': datetime.date.today() })
cur.commit()
cur.close()
except Exception , e:
print 'ERROR:', e[0]
cur.close()
conn.close()
同样,基本思想是优雅地处理错误。英国皇家海军海军上将纳尔逊的名言是:“该死的演习直接向他们袭来”。或者在我们的例子中,该死的错误直接针对它们。”我想通过在每次插入上打开光标来重置事务块。我不想仅仅因为主键错误就必须重置连接。我缺少什么吗?
先谢谢你的时间
。
Using Python 2.7 and
In [150]: psycopg2.version
Out[150]: '2.4.2 (dt dec pq3 ext)'
I have a simple python scripts that processing transactions and writes data to a database. Occasionally there is an insert that violates my primary key. This is fine, i just want it to ignore that record and continue on it merry way. The problem I am having is that psycopg2 primary key error is aborting the entire transaction block and all inserts after the error fail. Here is an example error
ERROR: duplicate key value violates unique constraint "encounter_id_pkey"
DETAIL: Key (encounter_id)=(9012235) already exists.
This is on the next insert. not a violation.
Inserting: 0163168~9024065
ERROR: current transaction is aborted, commands ignored until end of transaction block
The Second error repeats itself for every insert. Here is a simplified loop. I am looping through a pandas data frame, but it could be any loop.
conn = psycopg2.connect("dbname='XXXX' user='XXXXX' host='XXXX' password='XXXXX'")
cur = conn.cursor()
for i, val in df2.iteritems():
try:
cur = conn.cursor()
cur.execute("""insert into encounter_id_table (
encounter_id,current_date )
values
(%(create_date)s, %(encounter_id)s ) ;""",
'encounter_id':i.split('~')[1],
'create_date': datetime.date.today() })
cur.commit()
cur.close()
except Exception , e:
print 'ERROR:', e[0]
cur.close()
conn.close()
Again the basic idea is to gracefully handle the Error. In the dictum of Admiral Nelson of the Royal Navy: "Damn the maneuvers go straight at them". Or in our case damn the Errors go straight at them." I thought by opening a cursor on every insert that I would be resetting the transaction block. I do not want to have to reset the connection just because of a primary key error. Is there something i am just missing?
Thanks before hand for your time.
John
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
您应该在错误时回滚事务。
我在下面的代码中又添加了一个
try.. except..else
结构,以显示发生异常的确切位置。You should rollback transaction on error.
I've added one more
try..except..else
construction in the code bellow to show the exact place where exception will occur.首先:
CURRENT_DATE
是每个 SQL 标准以及 PostgreSQL 中的保留字。如果不使用双引号,则无法将其用作标识符。我宁愿避免这种情况,因此我在示例中使用列名称curdate
。接下来,您似乎颠倒了插入列的顺序:
应该是:
对于您的主要问题:完全避免该问题!
从 Postgres 9.5 开始,
INSERT ... ON CONFLICT DO NOTHING
是正确的选择。这也避免了并发写入负载下的竞争条件:请参阅:
在 Python 语法中,应该是:
在旧版本中,您大部分可以通过检查之前表中是否已存在键来避免该问题插入它:
不过,在并发写入负载下可能会失败。
First of all:
CURRENT_DATE
is a reserved word in every SQL standard as well as in PostgreSQL. You cannot use it as identifier without double-quoting it. I'd rather avoid that, so I use the column namecurdate
in my example.Next, you seem to have reversed the order of your insert-columns:
Should be:
To your main question: Avoid the problem altogether!
Since Postgres 9.5
INSERT ... ON CONFLICT DO NOTHING
is the way to go. This also avoids race conditions under concurrent write load:See:
In Python syntax, that should be:
In older versions you can mostly avoid the problem by checking if a key is already in the table before inserting it:
Can fail under concurrent write load, though.