使用python将大文本数据插入postgres
我正在尝试将长 xml 字符串作为文本批量插入到 postgresql 9.1 数据库中。我正在使用 Python 3.2 和 pyscopg2。我将 xml 字符串括在 $$ 中,并在查询字符串中使用命名变量。例如:
query = "insert into cms_object_metadata (cms_object_id, cms_object_metadata_data, cms_object_metadata_type_id, \
cms_object_metadata_status_id) values ((select id from cms_objects where cms_object_ident = %(objIdent)s), \
$$%(objMetaString)s$$, (select id from cms_object_metadata_types where cms_object_metadata_type_name = 'PDAT'), \
(select id from cms_object_metadata_status where cms_object_metadata_status_name = 'active'))"
然后我按如下方式构造一个字典对象:
dataDict = {'objIdent':objIdent, 'objMetaString':objMetaString}
传入 objIdent 和 objMetaString 值。我使用以下代码进行插入:
dbCursor.execute(query, dataDict)
当它将 objMetaString 值插入数据库时,它在字符串周围包含单引号。如果我将值附加到查询字符串中并在没有命名变量的情况下执行插入,则不会。例如:
query = "insert into cms_object_metadata (cms_object_id, cms_object_metadata_data, cms_object_metadata_type_id, \
cms_object_metadata_status_id) values ((select id from cms_objects where cms_object_ident = %s), \
$$%s$$, (select id from cms_object_metadata_types where cms_object_metadata_type_name = 'PDAT'), \
(select id from cms_object_metadata_status where cms_object_metadata_status_name = 'active'))" % (objIdent, objMetaString)
和插入:
dbCursor.execute(query)
我的问题是如何使用命名变量和 $$ 批量插入大型文本数据。如果可能的话,我真的不想预处理或后处理该字符串,因为它们可能很大并且包含未知数量的单引号或需要分隔的其他符号。 我已阅读以下文档并在 stackoverflow 上搜索答案,但尚未找到解决方案:
I am trying to do a bulk insert of long xml strings as text into a postgresql 9.1 database. I am using Python 3.2 and pyscopg2. I am enclosing the xml string in $$ and using a named variable in the query string. For example:
query = "insert into cms_object_metadata (cms_object_id, cms_object_metadata_data, cms_object_metadata_type_id, \
cms_object_metadata_status_id) values ((select id from cms_objects where cms_object_ident = %(objIdent)s), \
$%(objMetaString)s$, (select id from cms_object_metadata_types where cms_object_metadata_type_name = 'PDAT'), \
(select id from cms_object_metadata_status where cms_object_metadata_status_name = 'active'))"
I then construct a dictionary object as follows:
dataDict = {'objIdent':objIdent, 'objMetaString':objMetaString}
passing in the objIdent and objMetaString values. I do the insert with the following code:
dbCursor.execute(query, dataDict)
When it inserts the objMetaString value into the database it contains single quotes around the string. If I append the values into the query string and execute the insert without the named variable it does not. For example:
query = "insert into cms_object_metadata (cms_object_id, cms_object_metadata_data, cms_object_metadata_type_id, \
cms_object_metadata_status_id) values ((select id from cms_objects where cms_object_ident = %s), \
$%s$, (select id from cms_object_metadata_types where cms_object_metadata_type_name = 'PDAT'), \
(select id from cms_object_metadata_status where cms_object_metadata_status_name = 'active'))" % (objIdent, objMetaString)
and the insert:
dbCursor.execute(query)
My question is how to do a bulk insert of large text data using named variables and $$. I don't really want to have to either pre or post process this string if possible since they may be large and contain an unknown number of either single quotes, or other symbols that will need to be delimited.
I have read the following documentation and searched on stackoverflow for the answer, but have not found the solution:
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
总结一下评论区。执行以下操作:
不要在查询中的
%(objMetaString)s
占位符两边加引号。如有必要,驱动程序的工作就是引用该值。Summarizing the comment thread. Do this:
Don't put quotes around the
%(objMetaString)s
placeholder in your query. It's the driver's job to quote the value if necessary.