pysqlite: 8-bit bytestring error when inserting Unicode data
I know similar permutations of this question have been asked before, but the answers don't seem to shed light on what I am doing wrong here.
I am trying to insert this row:
(Pdb) print row
['886', '39', '83474', '0', '0', '0', '0', '0', '1.00', 'D', '20070813', 'R', 'C', 'B', "SOCK 4PK", '\xe9\x9e\x8b\xe5\xad\x90\xe5\xb0\xba\xe5\xaf\xb86-9.5/24-27.5CM', 'PR']
into this table:
CREATE TABLE item ("whs" int,"dept" int,"item" int,"dsun" int,"oh" int,"ohrtv" int,"adjp" int,"adjn" int,"sell" text,"stat" text,"lsldt" int,"cat1" text,"cat2" text,"cat3" text,"des1" text,"sgn3" text,"unit" text);
The sgn3 column seems to be causing the problem. It is defined as TEXT, and the data to be inserted is UTF-8. Why am I receiving this sqlite3 error?
ProgrammingError: 'You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestr...= str). It is highly recommended that you instead just switch your application to Unicode strings.'
Here is the code doing the insert:
    query = 'insert into %s values(%s)' % (
        self.tablename,
        ','.join(['?' for field in row])
    )
    self.con.execute(query, row)
And here is the procedure that creates the generator of records to be inserted:
    def encode_utf_8(self, csv_data, csv_encoding):
        """Decodes from 'csv_encoding' and encodes to utf-8.
        Accepts any open csv file encoding using any scheme recognized by
        python. Returns a generator.
        """
        for line in csv_data:
            try:
                yield line.decode(csv_encoding).encode('utf-8')
            except UnicodeDecodeError:
                next
Comments (1)
That is one of the most helpful error messages that I've ever seen. Just do what it says. Feed it `unicode` objects, not UTF-8-encoded `str` objects. In other words, lose the `.encode('utf-8')`, or maybe follow that later by `decode('utf-8')` ... what exactly is `csvdata`?
If you ever get a UnicodeDecodeError in your existing code:
(1) You should do something much more useful than what you intended to do with it (sweep it under the carpet)
(2) You may wish to change `next` to `pass`
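
A minimal sketch of the advice above, written for Python 3, where `str` is the Unicode text type and `bytes` is the 8-bit type (the thread itself concerns Python 2, where those roles belong to `unicode` and `str`). The helper `decode_fields`, the `rejected` list, the sample rows, and the trimmed three-column table are all hypothetical, made up for illustration. The idea is to decode every field to text before handing the row to sqlite3, and to record rows that fail to decode instead of silently skipping them.

```python
import sqlite3


def decode_fields(raw_rows, source_encoding, rejected):
    """Yield each row with its fields decoded to text.

    Rows that cannot be decoded are appended to `rejected` together
    with the exception, rather than being silently dropped.
    """
    for row in raw_rows:
        try:
            yield [field.decode(source_encoding) for field in row]
        except UnicodeDecodeError as exc:
            rejected.append((row, exc))


# Hypothetical sample data: raw bytes as they might come off disk.
raw_rows = [
    [b"83474", b"SOCK 4PK", b"\xe9\x9e\x8b\xe5\xad\x90"],
    [b"83475", b"BAD ROW", b"\xff\xfe"],  # not valid UTF-8
]

rejected = []
con = sqlite3.connect(":memory:")
# Trimmed-down stand-in for the real table.
con.execute("CREATE TABLE item (item int, des1 text, sgn3 text)")

for row in decode_fields(raw_rows, "utf-8", rejected):
    placeholders = ",".join("?" for _ in row)
    con.execute("INSERT INTO item VALUES (%s)" % placeholders, row)

stored = con.execute("SELECT sgn3 FROM item").fetchall()
```

Inspecting `rejected` afterwards shows which rows failed and why, which is one way of doing something more useful with a UnicodeDecodeError than hiding it.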
Response to comment
haha??? I wasn't joking; it tells you exactly what to do.
What are you calling "a csv file":
Utterly wrong. `bytes_read_from_file.decode('big5')` produces a `unicode` object. You may like to read the Python Unicode HOWTO.
No, they are `unicode` already. However, depending on what `csvdata` is, you may want to encode into `utf8` to get them through the csv mechanism and then decode them later.
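
For comparison, a small sketch of the same flow in Python 3, assuming (hypothetically) that the input is Big5-encoded and looks like the sample bytes below. Decoding the bytes yields text directly, and the Python 3 `csv` module reads text, so the encode-to-UTF-8-then-decode round trip described above for the Python 2 csv machinery should not be needed there. The variable names and sample data are illustrative only.

```python
import csv
import io

# Hypothetical file contents: two CSV lines encoded as Big5.
big5_bytes = "83474,SOCK 4PK,\u978b\u5b50\n83475,SOCK 6PK,\u978b\u5b50\n".encode("big5")

# Decoding bytes produces text (str in Python 3); nothing needs re-encoding.
text = big5_bytes.decode("big5")

# csv.reader consumes text; newline="" leaves line-ending handling to csv.
rows = list(csv.reader(io.StringIO(text, newline="")))
```

Each field in `rows` is already text, so a row can be passed to `execute` as it is.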