Discarding non-printable characters returned in a server's XML response

Posted on 2024-10-10 17:59:27, 667 characters, 3 views, 0 comments

While trying to use the Bing API to search, I am getting characters that are not printable and do not seem to hold any extra information. The goal is to save the XML (UTF-8) response as a text file to be parsed later.

My code currently looks something like this:

    URL url = new URL(queryURL);

    BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
    BufferedWriter out = new BufferedWriter(new FileWriter(query+"-"+saveResultAs));
    String str = in.readLine();
    out.write(str);

    in.close();
    out.close();

When I print the contents of 'str' to the console, it looks something like this:

[image: console output, not preserved]

And here is what the newly created local XML file looks like:

[image: contents of the saved XML file, not preserved]

What should I be doing to convert the UTF-8 text so that str does not have the extra characters?

Comments (2)

能否归途做我良人 2024-10-17 17:59:27

If you know the encoding up front, you should pass it explicitly when creating the reader:

    BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream(), "UTF-8"));

Do the same with the writer. In your example, FileWriter encodes the file using the platform default charset, while the XML inside still declares itself to be UTF-8.
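As a rough sketch of what an explicit charset on both sides could look like, the class below decodes a stream as UTF-8 and re-encodes it as UTF-8. The class name `Utf8Copy` and its `copy` method are made up for illustration. It also copies the whole stream, whereas the code in the question reads only a single line with `readLine()`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original question or answer.
final class Utf8Copy {
    /** Decodes {@code in} as UTF-8 and writes every character to {@code out} as UTF-8. */
    static void copy(InputStream in, OutputStream out) throws IOException {
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8);
             Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            char[] buffer = new char[8192];
            int n;
            // Read until end of stream instead of stopping after the first line.
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for the server response.
        byte[] xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<r>caf\u00e9</r>\n"
                .getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        copy(new ByteArrayInputStream(xml), sink);
        System.out.println("bytes in: " + xml.length + ", bytes out: " + sink.size());
    }
}
```

With the variables from the question, this would presumably be called as `Utf8Copy.copy(url.openStream(), new FileOutputStream(query + "-" + saveResultAs))`.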

It may be wise to read the encoding from the XML declaration to avoid surprises.
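One way to read the declared encoding, assuming the JDK's built-in StAX API is acceptable, is `XMLStreamReader.getCharacterEncodingScheme()`. The class name `DeclaredEncoding` below is made up for illustration. Note that this consumes the stream, so the response would have to be buffered or fetched again before saving it.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of the original answer.
final class DeclaredEncoding {
    /**
     * Returns the encoding named in the XML declaration. Per the StAX Javadoc,
     * the result is null when the document declares no encoding.
     */
    static String of(InputStream in) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
        try {
            return reader.getCharacterEncodingScheme();
        } finally {
            reader.close(); // releases parser resources; does not close the stream itself
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        byte[] xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><r/>"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(of(new ByteArrayInputStream(xml)));
    }
}
```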

If you only want to store the data for later use, there is no point in decoding and re-encoding it at all. Just read the bytes and write them out unchanged, and leave the task of detecting the encoding to the XML parser.
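A minimal sketch of the byte-for-byte approach, using `Files.copy` from `java.nio.file`. The class name `RawSave` is made up for illustration. No charset is involved, so the saved file is identical to what the server sent.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the original answer.
final class RawSave {
    /** Copies the stream to {@code target} unchanged and returns the number of bytes written. */
    static long save(InputStream in, Path target) throws IOException {
        try (InputStream source = in) {
            // REPLACE_EXISTING overwrites a file left over from an earlier run.
            return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for the server response.
        byte[] xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><r>caf\u00e9</r>"
                .getBytes(StandardCharsets.UTF_8);
        Path target = Files.createTempFile("response", ".xml");
        long written = save(new ByteArrayInputStream(xml), target);
        System.out.println(written + " bytes written to " + target);
    }
}
```

With the variables from the question, the call would presumably be `RawSave.save(url.openStream(), Paths.get(query + "-" + saveResultAs))`.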

笑红尘 2024-10-17 17:59:27

The XML parser will handle the encoding/decoding, and the appropriate characters will be fed back to you (e.g. a SAX parser will do this via the characters() callback method). All you then need to do is store that in a suitable file (perhaps with a suitable byte order mark?)
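To illustrate the characters() callback, here is a small sketch using the JDK's SAX parser. The class name `TextCollector` and its `textOf` method are made up for illustration. The parser picks the encoding from the document itself, so the handler only ever sees decoded characters.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler, not part of the original answer.
final class TextCollector extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver one text node in several chunks, so append each one.
        text.append(ch, start, length);
    }

    /** Parses the stream and returns all character data in document order. */
    static String textOf(InputStream in) throws Exception {
        TextCollector handler = new TextCollector();
        SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        return handler.text.toString();
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><r>caf\u00e9</r>"
                .getBytes(StandardCharsets.UTF_8);
        String text = textOf(new ByteArrayInputStream(xml));
        System.out.println("decoded " + text.length() + " characters");
    }
}
```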
