sqlalchemy UnicodeDecodeError: 'utf8'尝试从 msssql 2005 表中选择全部时,编解码器无法解码字节 0xe7

发布于 2024-12-22 19:54:47 字数 3436 浏览 3 评论 0原文

我正在使用 mssql 2005 构建一个与另一个应用程序的数据库配合使用的应用程序(我无法更改它或更改现有的表定义)。 w mssql表排序规则是:“hebrew bin”,应用程序从表中显示完美的希伯来语,所有py文件都编码为utf-8

注意!使用 mssql 将 unicode 希伯来语字符串写入数据库没有问题。 选择和删除没有问题:DBSession2.query(object).filter(object.LOADED=='Y').delete() 但是当从表中进行选择时,我遇到了这个非常烦人的错误:

  File "D:\Python27\learn\agent\agent\lib\encodings\utf_8.py", line 16, in decode     return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe0 in position 0: invalid continuation byte

确切的字节代码根据表中第一行的第一个字节而变化。

是的,我知道这会翻译成希伯来字母 - 这应该不是问题,因为各方都使用 unicode - 至少我是这么认为的。

顺便说一句 - 这在测试 mssql 2005 服务器上运行良好,但不适用于生产服务器。

一些代码: 这是函数的失败部分:

def iterateJson(parser,injson,object):
    '''iterateJson(parser,injson,object):getting a parser method an a json and iterating over the json
    with the parser method, checkes for existing objects in the db table and deletes them before commiting the new one to
    prevent integerityerrors
    writes ValidateJsonError to errorlog for each element in the json
    getting an onject name to check for loaded etc'''

    #first lets erase the table from loaded objects
    DBSession2.query(object).filter(object.LOADED=='Y').delete()
    print "finished deleting loaded"
    #now lets get a list from the table of loaded id
    raw_list = DBSession2.query(object).all() #the failing part!
    print "getting raw list of unloaded" #doesn't get here!
    if object == Activities:
        id_list = [e.EVENTCODE for e in raw_list]
        id = e.EVENTCODE

这是 sqlalchemy 类的一部分:

class Deposit(DeclarativeBase2):
    __tablename__ = 'NOAR_LOADDEPOSIT'
    #LINE = Column(INT(8)) 
    RECDEBNUM = Column(NVARCHAR(9) , primary_key=True)
    CURDATE = Column(BIGINT, nullable=False, default=text(u'((0))')) 
    PAYTYPE = Column(CHAR(1), nullable=False, default=text(u"('')")) 
    BANKCODE = Column(NVARCHAR(8), nullable=False, default=text(u"('')")) 
    CUSTACCNAME = Column(NVARCHAR(16), nullable=False, default=text(u"('')")) 
    PAGENUM = Column(NVARCHAR(5), nullable=False, default=text(u"('')"))
    RECNUM = Column(NVARCHAR(2), nullable=False, default=text(u"('')")) 
    RECDATE = Column(BIGINT, nullable=False, default=text(u'((0))')) 
    FIXNUM = Column(NCHAR(1), nullable=False, default=text(u"('')")) 
    EVENTNUM = Column(NVARCHAR(5), nullable=False, default=text(u"('')")) 
    GROUPCODE = Column(NVARCHAR(7), nullable=False, default=text(u"('')")) 
    IDNUMBER = Column(NVARCHAR(9), nullable=False, default=text(u"('')")) 

和另一个类(都给出相同的问题)

class Activities(DeclarativeBase2):  


    __tablename__ = 'NOAR_LOADEVENTS'

    EVENTCODE = Column(NVARCHAR(8), primary_key=True)
    EVENTDES = Column(Unicode, nullable=False, default=text(u"('')"))
    TYPE = Column(NCHAR(1), nullable=False, default=text(u"('')"))
    LC = Column(NCHAR(1), nullable=False, default=text(u"('')"))
    LD = Column(NCHAR(1), nullable=False, default=text(u"('')"))
    LE = Column(NCHAR(1), nullable=False, default=text(u"('')"))
    LF = Column(NCHAR(1), nullable=False, default=text(u"('')"))
    LG = Column(NCHAR(1), nullable=False, default=text(u"('')"))
    LH = Column(NCHAR(1), nullable=False, default=text(u"('')"))

使用:python 2.7(64位win)与 pyodbc 2.1.11 vs mssql server 2005,sqlalchemy 0.7.3 tg2.1.3

很高兴获得任何帮助或参考

I'm building an application that works with the db of another application, using mssql 2005 (I can't change that or change the existing table definition). w
the mssql table collate is: "hebrew bin", and the application shows perfect hebrew from the table, all the py files are encoded utf-8

notice! there is no problem writing with unicode hebrew strings to the db with mssql.
there is no trouble choosing and deleting: DBSession2.query(object).filter(object.LOADED=='Y').delete()
but when selecting from the table I get this very annoying error:

  File "D:\Python27\learn\agent\agent\lib\encodings\utf_8.py", line 16, in decode     return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe0 in position 0: invalid continuation byte

where the exact byte code changes according to the first byte of the first row in the table.

yes, I know that this translates to an hebrew letter - this shouldn't be a problem since all sides play with unicode - at least that is what I thought.

btw - this worked fine on the test mssql 2005 server but doesn't work with the production server.

some code:
this is the failing part of the function:

def iterateJson(parser,injson,object):
    '''iterateJson(parser,injson,object):getting a parser method an a json and iterating over the json
    with the parser method, checkes for existing objects in the db table and deletes them before commiting the new one to
    prevent integerityerrors
    writes ValidateJsonError to errorlog for each element in the json
    getting an onject name to check for loaded etc'''

    #first lets erase the table from loaded objects
    DBSession2.query(object).filter(object.LOADED=='Y').delete()
    print "finished deleting loaded"
    #now lets get a list from the table of loaded id
    raw_list = DBSession2.query(object).all() #the failing part!
    print "getting raw list of unloaded" #doesn't get here!
    if object == Activities:
        id_list = [e.EVENTCODE for e in raw_list]
        id = e.EVENTCODE

this is part of the sqlalchemy class:

class Deposit(DeclarativeBase2):
    __tablename__ = 'NOAR_LOADDEPOSIT'
    #LINE = Column(INT(8)) 
    RECDEBNUM = Column(NVARCHAR(9) , primary_key=True)
    CURDATE = Column(BIGINT, nullable=False, default=text(u'((0))')) 
    PAYTYPE = Column(CHAR(1), nullable=False, default=text(u"('')")) 
    BANKCODE = Column(NVARCHAR(8), nullable=False, default=text(u"('')")) 
    CUSTACCNAME = Column(NVARCHAR(16), nullable=False, default=text(u"('')")) 
    PAGENUM = Column(NVARCHAR(5), nullable=False, default=text(u"('')"))
    RECNUM = Column(NVARCHAR(2), nullable=False, default=text(u"('')")) 
    RECDATE = Column(BIGINT, nullable=False, default=text(u'((0))')) 
    FIXNUM = Column(NCHAR(1), nullable=False, default=text(u"('')")) 
    EVENTNUM = Column(NVARCHAR(5), nullable=False, default=text(u"('')")) 
    GROUPCODE = Column(NVARCHAR(7), nullable=False, default=text(u"('')")) 
    IDNUMBER = Column(NVARCHAR(9), nullable=False, default=text(u"('')")) 

and the other class (both give the same problem)

class Activities(DeclarativeBase2):  


    __tablename__ = 'NOAR_LOADEVENTS'

    EVENTCODE = Column(NVARCHAR(8), primary_key=True)
    EVENTDES = Column(Unicode, nullable=False, default=text(u"('')"))
    TYPE = Column(NCHAR(1), nullable=False, default=text(u"('')"))
    LC = Column(NCHAR(1), nullable=False, default=text(u"('')"))
    LD = Column(NCHAR(1), nullable=False, default=text(u"('')"))
    LE = Column(NCHAR(1), nullable=False, default=text(u"('')"))
    LF = Column(NCHAR(1), nullable=False, default=text(u"('')"))
    LG = Column(NCHAR(1), nullable=False, default=text(u"('')"))
    LH = Column(NCHAR(1), nullable=False, default=text(u"('')"))

using: python 2.7 (64bit win) with pyodbc 2.1.11 vs mssql server 2005, sqlalchemy 0.7.3 tg2.1.3

be glad for any help or reference

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

聆听风音 2024-12-29 19:54:47

根据您的说法,数据库表中的文本列很可能是用 cp1255 (Windows 希伯来语)编码的,但您的代码需要 UTF-8,也许是默认情况下被明确告知有关编码的任何信息。

您需要找到对 sqlalchemypyodbc 的调用需要修改以纠正错误信息。

Based on what you say, it is highly likely that text columns in the database table are encoded in cp1255 (Windows Hebrew) but your code is expecting UTF-8, perhaps in default of being explicitly told anything about the encoding.

You need to find what call into sqlalchemy or pyodbc needs to modified to correct the misinformation.

清君侧 2024-12-29 19:54:47

找到了答案——经典的wtf。似乎生产服务器的配置与测试服务器的配置略有不同(尽管两者应该是相同的) - 所以我的应用程序模型期望它得到 varchar 而不是 nvarchar。感谢您的帮助。无论如何,写下问题帮助我理清了思路

found the answer - a classic wtf. seems the the production server had a slightly different configuration then the test server (although both were supposed to be the same) - so instead of nvarchar my application model was expecting it got varchar. thanks for the help. writing the question helped me clear my thoughts anyway

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文