Character conversion in SAXParser
I have a very peculiar problem and would appreciate some guidance.
Original message: Kevätsunnuntaisin lentää
The flow of data is HttpConnector -> WSDLConnector -> the underlying system
The following are the bytes of the first 7 characters at each stage:
4b 65 76 c3 a4 74 73 75 – In Http Connector – the request XML has UTF-8 encoding
4b 65 76 a3 74 73 75 – in the WSDL Connector
InputSource inputSource = new InputSource(myInputStream);
inputSource.setEncoding("UTF-8");
parser.parse(inputSource);
The original string gets converted to Kev£tsunnuntaisin lent££. Also, a byte is lost: the two bytes c3 a4 become the single byte a3.
Could you please tell me where I am going wrong? What must I do to avoid this character conversion?
Thanks for your help!!!
Comments (1)
This is very simple: The data in myInputStream is not encoded as UTF-8, hence the decoding fails.
My guess is that you save the output of the HTTP connector as a string and then use that as the input for the WSDL connector. A Java string holds Unicode characters, not UTF-8 bytes. Use
String.getBytes("UTF-8")
to get an array of bytes with the correct encoding. As for all encoding issues: always tell the computer which encoding to use instead of hoping that it will guess correctly. Bytes carry no encoding information and the computer is not telepathic :) And I hope it never will be ...
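
To make the advice concrete, here is a minimal sketch of what it could look like in code. It assumes the message is available as a Java String before it reaches the WSDL connector, which is only a guess about the setup in the question. The class name Utf8SaxSketch, the helpers hex and parseText, and the <msg> wrapper element are made up for illustration. It uses StandardCharsets.UTF_8 rather than the "UTF-8" string so that a misspelled charset name cannot slip through.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the original connectors.
class Utf8SaxSketch {

    // Renders bytes as space-separated lowercase hex, like the dumps in the question.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Encodes the XML explicitly as UTF-8, parses it with SAX,
    // and returns the concatenated character data.
    static String parseText(String xml) throws Exception {
        // Explicit encoding: never rely on the platform default charset.
        byte[] utf8 = xml.getBytes(StandardCharsets.UTF_8);
        InputStream in = new ByteArrayInputStream(utf8);

        InputSource source = new InputSource(in);
        source.setEncoding("UTF-8"); // now matches the actual bytes in the stream

        StringBuilder text = new StringBuilder();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(source, new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times per element, so accumulate.
                text.append(ch, start, length);
            }
        });
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Unicode escapes keep the sketch independent of the source file encoding.
        String message = "Kev\u00e4tsunnuntaisin lent\u00e4\u00e4";

        // Prints: 4b 65 76 c3 a4 74 73 75
        System.out.println(hex("Kev\u00e4tsu".getBytes(StandardCharsets.UTF_8)));

        // Prints: true (the text survives the round trip unchanged)
        System.out.println(parseText("<msg>" + message + "</msg>").equals(message));
    }
}
```

If the hex dump taken just before parser.parse(...) does not show c3 a4 for the ä, the bytes were already damaged upstream and no setEncoding call on the InputSource can repair them.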