Inserting Unicode into SQLite?

Posted 2025-01-04 13:09:30

I am still learning Python, and as a little project I wrote a script that takes the values I have in a text file and inserts them into a sqlite3 database. But some of the names contain weird letters (I guess you would call them non-ASCII), and they generate an error when they come up. Here is my little script (and please tell me if there is any way it could be more Pythonic):
import sqlite3

f = open('complete', 'r')
fList = f.readlines()
conn = sqlite3.connect('tpb')
cur = conn.cursor()

for i in fList:
    exploaded = i.split('|')
    eList = (
        (exploaded[1], exploaded[5])
    )
    cur.execute('INSERT INTO magnets VALUES(?, ?)', eList)
    conn.commit()
cur.close()

And it generates this error:

Traceback (most recent call last):
  File "C:\Users\Admin\Desktop\sortinghat.py", line 13, in <module>
    cur.execute('INSERT INTO magnets VALUES(?, ?)', eList)
sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings.

Comments (1)

萌逼全场 2025-01-11 13:09:30

To get the file contents into unicode you need to decode from whichever encoding it is in.
It looks like you're on Windows so a good bet is cp1252.
If you got the file from somewhere else all bets are off.
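
If the encoding really is unknown, one rough approach is to try a few candidates in order and keep the first that decodes without errors. The sketch below is only a heuristic, not a reliable detector: `decode_best_effort` is a hypothetical helper name, and the candidate list is an assumption that should be adjusted to wherever the file came from. A successful decode does not prove the guess was right.

```python
def decode_best_effort(raw, candidates=("utf-8", "cp1252")):
    """Return (text, encoding) for the first candidate that decodes cleanly.

    Hypothetical helper for illustration. UTF-8 is tried first because it
    rejects most byte sequences that are not valid UTF-8, whereas cp1252
    accepts almost anything.
    """
    for enc in candidates:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a character, so this never fails,
    # but the resulting text may be wrong.
    return raw.decode("latin-1"), "latin-1"


# Made-up sample bytes, one per encoding.
text_a, enc_a = decode_best_effort(u"Zo\u00eb".encode("cp1252"))
text_b, enc_b = decode_best_effort(u"Zo\u00eb".encode("utf-8"))
```

Both calls recover the same text, but through different codecs, which is why it helps to record which encoding was actually used.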

Once you have the encoding sorted, an easy way to decode is to use the codecs module, e.g.:

import codecs
# ...
with codecs.open('complete', encoding='cp1252') as fin:  # or utf-8 or whatever
    for line in fin:
        fields = line.split('|')  # split once and reuse the result
        to_insert = (fields[1], fields[5])
        cur.execute('INSERT INTO magnets VALUES (?,?)', to_insert)
    conn.commit()
# ...
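
For reference, here is a self-contained sketch of the same idea that can be run end to end. It is an illustration under stated assumptions, not the asker's actual setup: the column names `name` and `link`, the sample rows, and the helper name `load_magnets` are all made up, and the database is in-memory. It uses `io.open`, which decodes the file the same way `codecs.open` does, and it strips the trailing newline and commits once at the end, both of which differ from the original script.

```python
import io
import os
import sqlite3
import tempfile


def load_magnets(path, conn, encoding="cp1252"):
    """Insert fields 1 and 5 of each '|'-separated line into magnets."""
    rows = []
    # io.open decodes while reading, so each line is unicode text, not bytes.
    with io.open(path, encoding=encoding) as fin:
        for line in fin:
            fields = line.rstrip("\r\n").split("|")
            if len(fields) > 5:  # skip blank or short lines
                rows.append((fields[1], fields[5]))
    with conn:  # commits once on success, rolls back on error
        conn.executemany("INSERT INTO magnets VALUES (?, ?)", rows)
    return len(rows)


# Demo with made-up data: in-memory database, temporary cp1252 file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE magnets (name TEXT, link TEXT)")  # assumed schema

sample = (u"1|Bj\u00f6rk|a|b|c|magnet:?xt=demo1\n"
          u"2|Caf\u00e9|a|b|c|magnet:?xt=demo2\n")
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as tmp:
    tmp.write(sample.encode("cp1252"))

inserted = load_magnets(path, conn)
os.remove(path)
stored = conn.execute(
    "SELECT name, link FROM magnets ORDER BY name").fetchall()
```

Collecting the rows and calling `executemany` inside `with conn:` means a failure partway through leaves the table unchanged, whereas committing after every row keeps whatever was inserted before the error.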