Discarding non-printable characters returned in an XML response from the server
While trying to use the Bing API to search, I am getting characters that are not printable and do not seem to hold any extra information. The goal is to save the XML (UTF-8) response as a text file to be parsed later.
My code currently looks something like this:
URL url = new URL(queryURL);
BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
BufferedWriter out = new BufferedWriter(new FileWriter(query+"-"+saveResultAs));
String str = in.readLine();
out.write(str);
in.close();
out.close();
When I send the contents of 'str' to the console, it looks something like this:
and here's what the newly created local XML file looks like:
What should I be doing to convert the UTF-8 text so that str does not have the extra characters?
Comments (2)
If you know the encoding up front, you should specify it explicitly when creating the reader.
The same goes for the writer: in your example, the file ends up encoded in the platform default while its XML declaration still claims UTF-8.
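A minimal sketch of that advice, assuming the response really is UTF-8. The class and method names (Utf8Copy, copyAsUtf8) are made up for illustration and are not from the question. It also copies the whole stream, whereas the question's code reads only the first line.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: names are illustrative only.
final class Utf8Copy {
    // Decode the input as UTF-8 and encode the output as UTF-8,
    // instead of relying on the platform default charset.
    static void copyAsUtf8(InputStream source, OutputStream target) throws IOException {
        try (BufferedReader in = new BufferedReader(
                 new InputStreamReader(source, StandardCharsets.UTF_8));
             BufferedWriter out = new BufferedWriter(
                 new OutputStreamWriter(target, StandardCharsets.UTF_8))) {
            char[] buffer = new char[8192];
            int read;
            // Copy everything, not just the first line.
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
    }
}
```

With the question's variables this might be called as copyAsUtf8(url.openStream(), new FileOutputStream(query + "-" + saveResultAs)).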
It may be wise to read the encoding from the XML declaration to avoid surprises.
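One possible way to read the declared encoding is the JDK's StAX API, sketched below. The class and method names (DeclaredEncoding, declaredEncoding) are hypothetical. The sketch assumes the document starts with an XML declaration; getCharacterEncodingScheme() is documented to return null when none is declared.

```java
import java.io.InputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: names are illustrative only.
final class DeclaredEncoding {
    // Returns the encoding named in the XML declaration, or null if none is declared.
    static String declaredEncoding(InputStream xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // DTDs are not needed just to read the declaration.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        XMLStreamReader reader = factory.createXMLStreamReader(xml);
        try {
            return reader.getCharacterEncodingScheme();
        } finally {
            reader.close();
        }
    }
}
```

Note that this consumes part of the stream, so in practice the response bytes would need to be buffered first if they are to be decoded afterwards.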
If you only want to store the data for later use, there is no point in decoding and re-encoding it at all. Just read the bytes and write them out unchanged, and leave encoding detection to the XML parser.
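A byte-for-byte copy could look roughly like this. RawSave and saveRaw are illustrative names, and the target path is whatever file the caller chooses.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: names are illustrative only.
final class RawSave {
    // Write the response bytes to disk unchanged; no charset is involved,
    // so the file keeps whatever encoding the server used.
    static void saveRaw(InputStream source, Path target) throws IOException {
        try (InputStream in = source) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

With the question's variables this might be called as saveRaw(url.openStream(), Paths.get(query + "-" + saveResultAs)).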
The XML parser will handle encoding/decoding, and the appropriate characters will be fed back to you (e.g. a SAX parser will do this via the characters() method callback). All you need to do is then store that in a suitable file (perhaps with a suitable byte-order mark?)
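As a rough illustration of the characters() callback, the sketch below collects all text content with the JDK's built-in SAX parser. SaxText and extractText are hypothetical names. A real handler would normally track elements too, and may receive the text of one element in several chunks.

```java
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: names are illustrative only.
final class SaxText {
    // Parse raw bytes; the parser detects the encoding and hands back decoded chars.
    static String extractText(InputStream xml)
            throws ParserConfigurationException, SAXException, IOException {
        StringBuilder text = new StringBuilder();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(xml, new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called more than once per element, so append.
                text.append(ch, start, length);
            }
        });
        return text.toString();
    }
}
```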