Pandas DataFrame.to_sql 消耗/省略一个额外的自动生成的 id

发布于 2025-01-09 01:55:05 字数 1743 浏览 0 评论 0原文

我在 Python 脚本（Python 版本：3.9.1）中使用具有自动递增主键的 Pandas DataFrame.to_sql 函数观察到奇怪的行为。使用 (A) 创建数据库表并执行 Python 代码 (B) 后，正如预期的那样，两行被插入到数据库表中，但是，使用命令 (C) 检索下一个 id 显示 4，而不是预期的 3。事实上，执行代码 (B) 3 次会产生下表：

+----+------+------+
| id | a    | b    |
+----+------+------+
|  1 |    1 |    3 |
|  2 |    2 |    4 |
|  4 |    1 |    3 |
|  5 |    2 |    4 |
|  7 |    1 |    3 |
|  8 |    2 |    4 |
+----+------+------+

不知何故缺少 id 3 和 6。

我在某个时刻检索最大 id，并向其添加行数以获得新的最大 id，其相差 1。如果预期出现这种行为，则仅添加一个是没有问题的，但我在文档。

有人可以阐明此行为或解释下面的示例代码在何处导致此问题吗？谢谢。

环境：

Python：3.9.1
Mariadb：15.1
Aqlalchemy：1.4.23）
Pandas：1.4.1

代码：

（A）数据库表创建：

create table if not exists test_table (id INT auto_increment primary key, a int, b int);

（B）Python代码：

import pandas as pd
import sqlalchemy as sqla

engine = sqla.create_engine(
    'mariadb+mariadbconnector://user:[email protected]:3306/unit_test',
    echo=True,
    future=True
)

df = pd.DataFrame({'a' : [1,2], 'b': [3,4]})

_ = df.to_sql(
    'test_table',
    con=engine,
    if_exists='append',
    index=False,
    chunksize=500,
    dtype={
        'a': sqla.Integer,
        'b': sqla.Integer
    }

)

（C）检索下一个ID：

select auto_increment from information_schema.tables where table_name='test_table';

原文

I am observing a strange behavior using Pandas DataFrame.to_sql function with auto incremented primary keys in a Python script (Python version: 3.9.1). After creating a database table using (A) and executing Python code (B), as expected two rows are inserted into the database table, however, retrieving the next id using command (C) shows 4 instead of the expected 3. In fact, executing code (B) three times results in the following table:

+----+------+------+
| id | a    | b    |
+----+------+------+
|  1 |    1 |    3 |
|  2 |    2 |    4 |
|  4 |    1 |    3 |
|  5 |    2 |    4 |
|  7 |    1 |    3 |
|  8 |    2 |    4 |
+----+------+------+

Somehow id 3 and 6 are missing.

I retrieve the max id at some point and add the numbers of rows to it to have the new max id, which differs by 1. Just adding one is unproblematic if that behavior is expected, but I could not find anything in the docs.

Can someone shed some light on this behavior or explain where the example code below is causing this issue? Thanks.

Environment:

Python: 3.9.1
Mariadb: 15.1
Aqlalchemy: 1.4.23)
Pandas: 1.4.1

Code:

(A) Database table creation:

create table if not exists test_table (id INT auto_increment primary key, a int, b int);

(B) Python code:

import pandas as pd
import sqlalchemy as sqla

engine = sqla.create_engine(
    'mariadb+mariadbconnector://user:[email protected]:3306/unit_test',
    echo=True,
    future=True
)

df = pd.DataFrame({'a' : [1,2], 'b': [3,4]})

_ = df.to_sql(
    'test_table',
    con=engine,
    if_exists='append',
    index=False,
    chunksize=500,
    dtype={
        'a': sqla.Integer,
        'b': sqla.Integer
    }

)

select auto_increment from information_schema.tables where table_name='test_table';

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

列表为空，暂无数据

关于作者

指尖上的星空

暂无简介

文章

25 人气

关注发私信

友情链接

文江博客

Pandas DataFrame.to_sql 消耗/省略一个额外的自动生成的 id

环境：

代码：

Environment:

Code:

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

关于作者

相关话题

热门标签

推荐作者

卷耳

佚名

℉服软

qq_2gSKZM

凉宸

gyhjy

友情链接

Pandas DataFrame.to_sql 消耗/省略一个额外的自动生成的 id

环境：

代码：

Environment:

Code:

如果你对这篇内容有疑问，欢迎到本站社区发帖提问 参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

关于作者

相关话题

热门标签

推荐作者

卷耳

佚名

℉服软

qq_2gSKZM

凉宸

gyhjy

友情链接

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。