Can I turn off implicit Python unicode conversions to find my mixed-strings bugs?
When profiling our code I was surprised to find millions of calls to
C:\Python26\lib\encodings\utf_8.py:15(decode)
I started debugging and found that across our code base there are many small bugs, usually comparing a string to a unicode or adding a string and a unicode. Python graciously decodes the strings and performs the subsequent operations in unicode.
How kind. But expensive!
I am fluent in unicode, having read Joel Spolsky and Dive Into Python...
I try to keep our code internals in unicode only.
My question - can I turn off this pythonic nice-guy behavior? At least until I find all these bugs and fix them (usually by adding a u'u')?
Some of them are extremely hard to find (a variable that is sometimes a string...).
Python 2.6.5 (and I can't switch to 3.x).
Comments (1)
The following should work:
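The snippet itself did not survive in this copy of the answer. What follows is a minimal sketch of what it plausibly looked like, assuming the intended call is sys.setdefaultencoding('undefined'). That argument is not in the text here; 'undefined' is the standard-library codec that raises on every conversion. The helper names forbid_implicit_conversion and mixing_raises are hypothetical, and the version guard is only there so the sketch is harmless on Python 3, where bytes and text never mix implicitly.

```python
import sys


def forbid_implicit_conversion():
    """Make implicit str <-> unicode conversion raise instead of decoding."""
    if sys.version_info[0] >= 3:
        # Python 3 already refuses to mix bytes and text; nothing to do.
        return
    # site.py deletes sys.setdefaultencoding at startup;
    # reload(sys) (a Python 2 builtin) brings it back.
    reload(sys)
    # Assumption: 'undefined' is the codec the answer had in mind.
    sys.setdefaultencoding('undefined')


def mixing_raises():
    """Return True if adding a unicode string and a byte string fails."""
    try:
        u"abc" + b"xyz"
    except (TypeError, UnicodeError):
        return True
    return False


forbid_implicit_conversion()
print(mixing_raises())
```

With the setting active, an operation such as a mixed concatenation fails with a traceback pointing at the offending line, which is what makes these bugs findable. Library code that relies on the implicit conversion would likely fail as well, so this is probably best treated as a debugging aid.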
The reload(sys) in the snippet above is only necessary here, since normally the sys.setdefaultencoding call is supposed to go in a sitecustomize.py file in your Python site-packages directory (it's advisable to do that).
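As a sketch of the sitecustomize.py approach, the file could contain something like the following. The 'undefined' codec name is an assumption (the argument is not shown in the text here), and the function name use_strict_default_encoding is hypothetical. The hasattr guard makes the file a no-op on interpreters that do not expose sys.setdefaultencoding, such as Python 3.

```python
# sitecustomize.py: a sketch, to be placed in the site-packages directory
import sys


def use_strict_default_encoding():
    """Switch to a default encoding that refuses every implicit conversion.

    Returns True if the setting was applied, False if this interpreter
    does not expose sys.setdefaultencoding.
    """
    if not hasattr(sys, "setdefaultencoding"):
        return False
    # No reload(sys) is needed in sitecustomize.py: the module is imported
    # before site.py removes the function from sys.
    # Assumption: 'undefined' is the codec the answer had in mind.
    sys.setdefaultencoding("undefined")
    return True


applied = use_strict_default_encoding()
```

Because the file is picked up automatically at interpreter startup, the setting then applies to every script run with that Python installation, without touching the application code.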