Python/psycopg2 中优雅的主键错误处理

发布于 2024-12-21 05:10:53 字数 1373 浏览 3 评论 0原文

使用 Python 2.7 和

[150]:psycopg2.版本 Out[150]: '2.4.2 (dt dec pq3 ext)'

我有一个简单的 python 脚本,用于处理事务并将数据写入数据库。偶尔会有一个插入违反了我的主键。这很好,我只是希望它忽略该记录并继续快乐的方式。我遇到的问题是 psycopg2 主键错误正在中止整个事务块,并且错误后的所有插入都会失败。这是一个示例错误,

ERROR: duplicate key value violates unique constraint "encounter_id_pkey"
DETAIL:  Key (encounter_id)=(9012235) already exists.

这是在下一个插入中。不构成违规。

Inserting: 0163168~9024065
ERROR: current transaction is aborted, commands ignored until end of transaction block

每次插入都会重复第二个错误。这是一个简化的循环。我正在循环遍历 pandas 数据框,但它可以是任何循环。

conn = psycopg2.connect("dbname='XXXX' user='XXXXX' host='XXXX' password='XXXXX'")

cur = conn.cursor()

for i, val in df2.iteritems():
    try:
        cur = conn.cursor()
        cur.execute("""insert into encounter_id_table (
        encounter_id,current_date  )
        values       
        (%(create_date)s, %(encounter_id)s ) ;""",
        'encounter_id':i.split('~')[1],  
        'create_date': datetime.date.today() })           
        cur.commit()
        cur.close() 
    except Exception , e:
        print 'ERROR:', e[0]
        cur.close()
 conn.close()   

同样,基本思想是优雅地处理错误。英国皇家海军海军上将纳尔逊的名言是:“该死的演习直接向他们袭来”。或者在我们的例子中,该死的错误直接针对它们。”我想通过在每次插入上打开光标来重置事务块。我不想仅仅因为主键错误就必须重置连接。我缺少什么吗?

先谢谢你的时间

Using Python 2.7 and

In [150]: psycopg2.version
Out[150]: '2.4.2 (dt dec pq3 ext)'

I have a simple python scripts that processing transactions and writes data to a database. Occasionally there is an insert that violates my primary key. This is fine, i just want it to ignore that record and continue on it merry way. The problem I am having is that psycopg2 primary key error is aborting the entire transaction block and all inserts after the error fail. Here is an example error

ERROR: duplicate key value violates unique constraint "encounter_id_pkey"
DETAIL:  Key (encounter_id)=(9012235) already exists.

This is on the next insert. not a violation.

Inserting: 0163168~9024065
ERROR: current transaction is aborted, commands ignored until end of transaction block

The Second error repeats itself for every insert. Here is a simplified loop. I am looping through a pandas data frame, but it could be any loop.

conn = psycopg2.connect("dbname='XXXX' user='XXXXX' host='XXXX' password='XXXXX'")

cur = conn.cursor()

for i, val in df2.iteritems():
    try:
        cur = conn.cursor()
        cur.execute("""insert into encounter_id_table (
        encounter_id,current_date  )
        values       
        (%(create_date)s, %(encounter_id)s ) ;""",
        'encounter_id':i.split('~')[1],  
        'create_date': datetime.date.today() })           
        cur.commit()
        cur.close() 
    except Exception , e:
        print 'ERROR:', e[0]
        cur.close()
 conn.close()   

Again the basic idea is to gracefully handle the Error. In the dictum of Admiral Nelson of the Royal Navy: "Damn the maneuvers go straight at them". Or in our case damn the Errors go straight at them." I thought by opening a cursor on every insert that I would be resetting the transaction block. I do not want to have to reset the connection just because of a primary key error. Is there something i am just missing?

Thanks before hand for your time.

John

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

鲜血染红嫁衣 2024-12-28 05:10:53

您应该在错误时回滚事务。

我在下面的代码中又添加了一个 try.. except..else 结构,以显示发生异常的确切位置。

try:
    cur = conn.cursor()

    try:
        cur.execute("""insert into encounter_id_table (
            encounter_id,current_date  )
            values       
            (%(create_date)s, %(encounter_id)s ) ;""",
            'encounter_id':i.split('~')[1],  
            'create_date': datetime.date.today() })
    except psycopg2.IntegrityError:
        conn.rollback()
    else:
        conn.commit()

    cur.close() 
except Exception , e:
    print 'ERROR:', e[0]

You should rollback transaction on error.

I've added one more try..except..else construction in the code bellow to show the exact place where exception will occur.

try:
    cur = conn.cursor()

    try:
        cur.execute("""insert into encounter_id_table (
            encounter_id,current_date  )
            values       
            (%(create_date)s, %(encounter_id)s ) ;""",
            'encounter_id':i.split('~')[1],  
            'create_date': datetime.date.today() })
    except psycopg2.IntegrityError:
        conn.rollback()
    else:
        conn.commit()

    cur.close() 
except Exception , e:
    print 'ERROR:', e[0]
入怼 2024-12-28 05:10:53

首先:CURRENT_DATE是每个 SQL 标准以及 PostgreSQL 中的保留字。如果不使用双引号,则无法将其用作标识符。我宁愿避免这种情况,因此我在示例中使用列名称 curdate

接下来,您似乎颠倒了插入列的顺序:

(%(create_date)s, %(encounter_id)s )

应该是:

( %(encounter_id)s, %(create_date)s)

对于您的主要问题:完全避免该问题!

从 Postgres 9.5 开始,INSERT ... ON CONFLICT DO NOTHING 是正确的选择。这也避免了并发写入负载下的竞争条件:

INSERT INTO encounter_id_table (encounter_id, curdate)
VALUES (1234, CURRENT_DATE)
ON CONFLICT DO NOTHING;

请参阅:

在 Python 语法中,应该是:

cur.execute("""INSERT INTO encounter_id_table (encounter_id, curdate)
    VALUES (%(encounter_id)s, CURRENT_DATE);""",
  {'encounter_id':i.split('~')[1]})       

在旧版本中,您大部分可以通过检查之前表中是否已存在键来避免该问题插入它:

INSERT INTO encounter_id_table (encounter_id, curdate)
SELECT 1234, now()::date
WHERE  NOT EXISTS (SELECT FROM encounter_id_table t
                   WHERE t.encounter_id = 1234);

不过,在并发写入负载下可能会失败。

First of all: CURRENT_DATE is a reserved word in every SQL standard as well as in PostgreSQL. You cannot use it as identifier without double-quoting it. I'd rather avoid that, so I use the column name curdate in my example.

Next, you seem to have reversed the order of your insert-columns:

(%(create_date)s, %(encounter_id)s )

Should be:

( %(encounter_id)s, %(create_date)s)

To your main question: Avoid the problem altogether!

Since Postgres 9.5 INSERT ... ON CONFLICT DO NOTHING is the way to go. This also avoids race conditions under concurrent write load:

INSERT INTO encounter_id_table (encounter_id, curdate)
VALUES (1234, CURRENT_DATE)
ON CONFLICT DO NOTHING;

See:

In Python syntax, that should be:

cur.execute("""INSERT INTO encounter_id_table (encounter_id, curdate)
    VALUES (%(encounter_id)s, CURRENT_DATE);""",
  {'encounter_id':i.split('~')[1]})       

In older versions you can mostly avoid the problem by checking if a key is already in the table before inserting it:

INSERT INTO encounter_id_table (encounter_id, curdate)
SELECT 1234, now()::date
WHERE  NOT EXISTS (SELECT FROM encounter_id_table t
                   WHERE t.encounter_id = 1234);

Can fail under concurrent write load, though.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文