XML encoding error when writing to a file
I think I am following the right approach, but I am still getting an encoding error:
from xml.dom.minidom import Document
import codecs
doc = Document()
wml = doc.createElement("wml")
doc.appendChild(wml)
property = doc.createElement("property")
wml.appendChild(property)
descriptionNode = doc.createElement("description")
property.appendChild(descriptionNode)
descriptionText = doc.createTextNode(description.decode('ISO-8859-1'))
descriptionNode.appendChild(descriptionText)
file = codecs.open('contentFinal.xml', 'w', encoding='ISO-8859-1')
file.write(doc.toprettyxml())
file.close()
The description node contains some characters in ISO-8859-1 encoding, which is the encoding the site itself specifies in its meta tag. But when doc.toprettyxml() starts writing to the file, I get the following error:
Traceback (most recent call last):
  File "main.py", line 467, in <module>
    file.write(doc.toprettyxml())
  File "C:\Python27\lib\xml\dom\minidom.py", line 60, in toprettyxml
    return writer.getvalue()
  File "C:\Python27\lib\StringIO.py", line 271, in getvalue
    self.buf += ''.join(self.buflist)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in position 10: ordinal not in range(128)
Why am I getting this error when I am decoding and encoding with the same standard?
Edited
I have the following declaration in my script file:
#!/usr/bin/python
# -*- coding: utf-8 -*-
Could this be causing a conflict?
Comments (1)
OK, I have found a solution. Whenever the data is in another language, you just need to define the proper encoding in the XML header. You do not need to specify the encoding in
file.write(doc.toprettyxml(encoding='ISO-8859-1'))
not even when you open the file for writing with file = codecs.open('contentFinal.xml', 'w', encoding='ISO-8859-1').
Below is the technique I used. It may not be a professional method, but it works for me. There may be a way to set a default encoding in the header, but I could not find it.
With the above method, the browser reports no errors and all the data displays perfectly.
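For comparison, here is a minimal sketch of another way to get the encoding into the XML header. It is not the technique this answer refers to (that code is not shown here). It assumes Python 3 rather than the Python 2.7 in the question, and the sample byte string and the helper names build_document and write_document are made up for illustration. When toprettyxml() is given an encoding it returns bytes that already begin with a matching XML declaration, so the file is opened in binary mode instead of through codecs.open.

```python
import os
import tempfile
from xml.dom.minidom import Document, parse


def build_document(description_bytes, source_encoding="ISO-8859-1"):
    """Build the wml/property/description tree from the question."""
    doc = Document()
    wml = doc.createElement("wml")
    doc.appendChild(wml)
    prop = doc.createElement("property")
    wml.appendChild(prop)
    description_node = doc.createElement("description")
    prop.appendChild(description_node)
    # Decode once, at the boundary, so the DOM only ever holds text.
    text = description_bytes.decode(source_encoding)
    description_node.appendChild(doc.createTextNode(text))
    return doc


def write_document(doc, path, encoding="ISO-8859-1"):
    """Serialize to bytes and write them unchanged (binary mode)."""
    # With an encoding argument, toprettyxml() returns bytes and writes
    # the XML declaration carrying that encoding itself.
    xml_bytes = doc.toprettyxml(encoding=encoding)
    with open(path, "wb") as handle:
        handle.write(xml_bytes)
    return xml_bytes


# Hypothetical sample data: ISO-8859-1 bytes containing 0xe1, the byte
# reported in the traceback.
sample = b"descripci\xf3n r\xe1pida"
out_path = os.path.join(tempfile.gettempdir(), "contentFinal.xml")
xml_bytes = write_document(build_document(sample), out_path)

# Read the file back to check that the text survived the round trip.
round_trip = parse(out_path).getElementsByTagName("description")[0]
print(round_trip.firstChild.data.strip())
```

The point of the sketch is to decode input bytes once, keep only text inside the DOM, and encode once on output, so byte strings and text are never mixed during serialization.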