Special characters are displayed as a question-mark hash

Posted 2024-12-11 22:21:05


I'm developing applications for Android devices and recently ran into a problem.

I needed to get information out of an HTML file online, so I built a construct of InputStream and BufferedReader to scan the file for that information. I split my string to extract the information and tried displaying it with a Toast.

Everything works the way I want it to, but every time a special character should be displayed, a question-mark hash is shown instead.

I think it might be a charset problem, because the website declares:

<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">

How do I get this right?

EDIT:

HttpClient httpClient = new DefaultHttpClient();
HttpPost post = new HttpPost(url);
((AbstractHttpClient) httpClient).getCredentialsProvider().setCredentials(new AuthScope(null, -1), new UsernamePasswordCredentials("user","password"));
HttpResponse response;
response = httpClient.execute(post);
BufferedReader reader = new BufferedReader(
    new InputStreamReader(
        response.getEntity().getContent()
    )
);
String line = null;
while ((line = reader.readLine()) != null) {
    Toast.makeText(this, line, Toast.LENGTH_LONG).show();
}


七色彩虹 2024-12-18 22:21:05

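InputStreamReader can actually take a Charset as its second parameter, which I think indicates the character encoding of the stream it is going to read. A conforming Java implementation is not required to support the windows-1252 encoding, but I believe it is very similar to ISO-8859-1, which you could try first as a workaround to see whether it works. There is also another constructor in the InputStreamReader class that may be interesting: it takes a CharsetDecoder as its second parameter (you can create one by calling newDecoder on a Charset). You could use it to decode the stream in the encoding of your choice or in the system default encoding (which you can obtain by calling Charset.defaultCharset).

See the JavaDoc API documentation for InputStreamReader, Charset (http://download.oracle.com/javase/6/docs/api/java/nio/charset/Charset.html) and CharsetDecoder for details. In fact I'm no expert and know little about encodings and their problems, but I thought it was worth pointing out that these classes are available.

You can also check which encoding an InputStreamReader is using by calling its getEncoding method.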
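The two-argument constructor mentioned above could be used roughly as follows. This is only a minimal sketch: the class name CharsetReaderSketch and the helpers pageCharset and readFirstLine are made up for illustration, and a byte array stands in for the HTTP response stream. Note that windows-1252 and ISO-8859-1 agree everywhere except the byte range 0x80 to 0x9F, so the fallback would still mangle characters such as curly quotes or the euro sign.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

public class CharsetReaderSketch {

    // Hypothetical helper: picks windows-1252 when the runtime provides it,
    // otherwise ISO-8859-1, which every Java platform is required to support.
    static Charset pageCharset() {
        return Charset.isSupported("windows-1252")
                ? Charset.forName("windows-1252")
                : Charset.forName("ISO-8859-1");
    }

    // Hypothetical helper: reads the first line of the stream, decoding the
    // bytes with the given charset instead of the platform default.
    static String readFirstLine(InputStream in, Charset charset) throws IOException {
        BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset));
        try {
            return reader.readLine();
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // 0xE9 is the letter e with an acute accent in both windows-1252 and ISO-8859-1.
        byte[] sample = {'c', 'a', 'f', (byte) 0xE9};
        System.out.println(readFirstLine(new ByteArrayInputStream(sample), pageCharset()));
    }
}
```

In the code from the question, the same idea would mean passing a Charset as the second argument to the InputStreamReader that wraps response.getEntity().getContent().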


べ繥欢鉨o。 2024-12-18 22:21:05

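My guess is that you've just used the InputStreamReader constructor that takes a stream but no character encoding - so it will try to use the platform default. You should use the encoding specified in the response; as you're using HTTP, whatever is in the Content-Type header is probably okay, but sadly HTML can specify it separately :(

Now, whether Android includes the Windows-1252 encoding is a different matter...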
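One way to follow this advice is to pull the charset name out of the Content-Type value and fall back to a known charset when it is missing or unsupported. The sketch below assumes the header value is already available as a String; the class ContentTypeCharset and the method charsetFrom are hypothetical names, and how the header is read from the HTTP response is left out. It does not look at a meta tag inside the HTML body.

```java
import java.nio.charset.Charset;
import java.util.Locale;

public class ContentTypeCharset {

    // Hypothetical helper: returns the charset named in a Content-Type value
    // such as "text/html; charset=windows-1252", or the fallback when the
    // header is missing, names no charset, or names one this runtime lacks.
    static Charset charsetFrom(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    if (Charset.isSupported(name)) {
                        return Charset.forName(name);
                    }
                } catch (IllegalArgumentException e) {
                    // Illegal or empty charset name: fall through to the fallback.
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        Charset latin1 = Charset.forName("ISO-8859-1");
        System.out.println(charsetFrom("text/html; charset=UTF-8", latin1));
        System.out.println(charsetFrom("text/html", latin1));
    }
}
```

The Charset.isSupported check also addresses the open question of whether a given runtime ships Windows-1252: if it does not, the reader is built with the fallback instead of failing.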


扎心 2024-12-18 22:21:05

Oh, and whether or not this gets solved some other way, please use UTF-8.
http://www.w3.org/TR/html4/charset.html
http://en.wikipedia.org/wiki/UTF-8


喜爱纠缠 2024-12-18 22:21:05

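Just in case someone else runs into the same problem I did...

I got the same question-mark-in-a-black-diamond for text pulled from a JSON file loaded from res/raw. No matter which combination of stream readers I tried, the characters still showed up. My first attempt at making sure UTF-8 was used was to check the file's properties in Eclipse, and sure enough it was set to "MacRoman", whatever that is. I changed it to UTF-8, build, run, fail, clean, build, run, fail, scratch head, back to SO.

I read that I had to save the file after changing the encoding, so I tried that, still with no luck. Then I finally scrolled down in the JSON file in the Eclipse editor to where the special characters were, and interestingly the special characters (é and a dash) showed up as black diamonds there too! I deleted and retyped them, and everything worked.

Bottom line: encoding matters. When creating resource files (XML, JSON, CSV, or whatever), make sure you select the right encoding (usually UTF-8) BEFORE you start entering text.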
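The black diamond is the Unicode replacement character U+FFFD, which a decoder emits when it meets bytes that are not valid in the encoding it was told to use. The sketch below reproduces the effect under a stated assumption: ISO-8859-1 stands in for MacRoman, because MacRoman is not guaranteed to be available in every Java runtime. The class ReplacementCharDemo and the method roundTrip are illustrative names only.

```java
import java.nio.charset.Charset;

public class ReplacementCharDemo {

    static final Charset UTF_8 = Charset.forName("UTF-8");
    // Assumption: ISO-8859-1 plays the role of the "wrong" file encoding here.
    static final Charset LATIN_1 = Charset.forName("ISO-8859-1");

    // Encodes text with one charset and decodes the bytes with another,
    // mimicking a file saved in one encoding and read as a different one.
    static String roundTrip(String text, Charset savedAs, Charset readAs) {
        return new String(text.getBytes(savedAs), readAs);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // the word "cafe" with an accented e
        // Saved and read as UTF-8: the text survives unchanged.
        System.out.println(roundTrip(original, UTF_8, UTF_8));
        // Saved as ISO-8859-1 but read as UTF-8: the accented letter is lost.
        System.out.println(roundTrip(original, LATIN_1, UTF_8));
    }
}
```

This also suggests why changing the encoding setting alone did not help: the bytes already in the file stayed as they were, so the characters had to be retyped and saved under the new encoding.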
