使用python将大文本数据插入postgres

发布于 2024-12-12 22:07:39 字数 1879 浏览 2 评论 0原文

我正在尝试将长 xml 字符串作为文本批量插入到 postgresql 9.1 数据库中。我正在使用 Python 3.2 和 pyscopg2。我将 xml 字符串括在 $$ 中,并在查询字符串中使用命名变量。例如:

query = "insert into cms_object_metadata (cms_object_id, cms_object_metadata_data, cms_object_metadata_type_id, \
         cms_object_metadata_status_id) values ((select id from cms_objects where cms_object_ident = %(objIdent)s), \
         $$%(objMetaString)s$$, (select id from cms_object_metadata_types where cms_object_metadata_type_name = 'PDAT'), \
         (select id from cms_object_metadata_status where cms_object_metadata_status_name = 'active'))"

然后我按如下方式构造一个字典对象:

dataDict = {'objIdent':objIdent, 'objMetaString':objMetaString}

传入 objIdent 和 objMetaString 值。我使用以下代码进行插入:

dbCursor.execute(query, dataDict)

当它将 objMetaString 值插入数据库时​​,它在字符串周围包含单引号。如果我将值附加到查询字符串中并在没有命名变量的情况下执行插入,则不会。例如:

query = "insert into cms_object_metadata (cms_object_id, cms_object_metadata_data, cms_object_metadata_type_id, \
         cms_object_metadata_status_id) values ((select id from cms_objects where cms_object_ident = %s), \
         $$%s$$, (select id from cms_object_metadata_types where cms_object_metadata_type_name = 'PDAT'), \
         (select id from cms_object_metadata_status where cms_object_metadata_status_name = 'active'))" % (objIdent, objMetaString)

和插入:

dbCursor.execute(query)

我的问题是如何使用命名变量和 $$ 批量插入大型文本数据。如果可能的话,我真的不想预处理或后处理该字符串,因为它们可能很大并且包含未知数量的单引号或需要分隔的其他符号。 我已阅读以下文档并在 stackoverflow 上搜索答案,但尚未找到解决方案:

I am trying to do a bulk insert of long xml strings as text into a postgresql 9.1 database. I am using Python 3.2 and pyscopg2. I am enclosing the xml string in $$ and using a named variable in the query string. For example:

query = "insert into cms_object_metadata (cms_object_id, cms_object_metadata_data, cms_object_metadata_type_id, \
         cms_object_metadata_status_id) values ((select id from cms_objects where cms_object_ident = %(objIdent)s), \
         $%(objMetaString)s$, (select id from cms_object_metadata_types where cms_object_metadata_type_name = 'PDAT'), \
         (select id from cms_object_metadata_status where cms_object_metadata_status_name = 'active'))"

I then construct a dictionary object as follows:

dataDict = {'objIdent':objIdent, 'objMetaString':objMetaString}

passing in the objIdent and objMetaString values. I do the insert with the following code:

dbCursor.execute(query, dataDict)

When it inserts the objMetaString value into the database it contains single quotes around the string. If I append the values into the query string and execute the insert without the named variable it does not. For example:

query = "insert into cms_object_metadata (cms_object_id, cms_object_metadata_data, cms_object_metadata_type_id, \
         cms_object_metadata_status_id) values ((select id from cms_objects where cms_object_ident = %s), \
         $%s$, (select id from cms_object_metadata_types where cms_object_metadata_type_name = 'PDAT'), \
         (select id from cms_object_metadata_status where cms_object_metadata_status_name = 'active'))" % (objIdent, objMetaString)

and the insert:

dbCursor.execute(query)

My question is how to do a bulk insert of large text data using named variables and $$. I don't really want to have to either pre or post process this string if possible since they may be large and contain an unknown number of either single quotes, or other symbols that will need to be delimited.
I have read the following documentation and searched on stackoverflow for the answer, but have not found the solution:

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

小…楫夜泊 2024-12-19 22:07:39

总结一下评论区。执行以下操作:

query = "insert into cms_object_metadata (cms_object_id, cms_object_metadata_data, cms_object_metadata_type_id, \
         cms_object_metadata_status_id) values ((select id from cms_objects where cms_object_ident = %(objIdent)s), \
         %(objMetaString)s, (select id from cms_object_metadata_types where cms_object_metadata_type_name = 'PDAT'), \
         (select id from cms_object_metadata_status where cms_object_metadata_status_name = 'active'))"

dataDict = {'objIdent':objIdent, 'objMetaString':objMetaString}

dbCursor.execute(query, dataDict)

不要在查询中的 %(objMetaString)s 占位符两边加引号。如有必要,驱动程序的工作就是引用该值。

Summarizing the comment thread. Do this:

query = "insert into cms_object_metadata (cms_object_id, cms_object_metadata_data, cms_object_metadata_type_id, \
         cms_object_metadata_status_id) values ((select id from cms_objects where cms_object_ident = %(objIdent)s), \
         %(objMetaString)s, (select id from cms_object_metadata_types where cms_object_metadata_type_name = 'PDAT'), \
         (select id from cms_object_metadata_status where cms_object_metadata_status_name = 'active'))"

dataDict = {'objIdent':objIdent, 'objMetaString':objMetaString}

dbCursor.execute(query, dataDict)

Don't put quotes around the %(objMetaString)s placeholder in your query. It's the driver's job to quote the value if necessary.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文