Get the encoding of a string
I'm getting HTML from another page, but the text comes back with encoding problems.
For example, I get:
aparelho nas sa??das
The original text is:
aparelho nas saídas
How can I detect the encoding and convert the text back to the original string?
My code:
var GetResponse = API_GET("..."); // returns the HTML of an HTTP request
HtmlDocument doc = new HtmlDocument(); // the HTML parser
doc.LoadHtml(GetResponse);
var body = doc.DocumentNode.SelectNodes("//div[@class='para']");
...
var para = body[i].InnerHtml; // here's the problem: the output looks like sa??das
How do I do this?
Thanks in advance.
Comments (2)
Use this; it does something along these lines, though it only checks for UTF-8. I think detecting an encoding in general is hard.
http://utf8checker.codeplex.com/releases/view/40052
Here's part of the source code. Look at the IsUtf8 methods; they can be quite useful.
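For reference, a UTF-8 validity check can also be sketched with the framework's own strict decoder. This is not the utf8checker source; the class name Utf8Check and the method name IsValidUtf8 are made up for illustration. Note that pure ASCII input also passes, since ASCII is a subset of UTF-8.

```csharp
using System;
using System.Text;

public static class Utf8Check
{
    // Strict decoder: no BOM on encode (first argument), and throw on
    // malformed bytes instead of substituting U+FFFD (second argument).
    private static readonly UTF8Encoding StrictUtf8 = new UTF8Encoding(false, true);

    // Hypothetical helper: true when the bytes are well-formed UTF-8.
    public static bool IsValidUtf8(byte[] bytes)
    {
        try
        {
            StrictUtf8.GetString(bytes);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }

    public static void Main()
    {
        byte[] utf8 = Encoding.UTF8.GetBytes("saídas");
        byte[] latin1 = Encoding.GetEncoding("iso-8859-1").GetBytes("saídas");

        Console.WriteLine(IsValidUtf8(utf8));   // well-formed UTF-8
        Console.WriteLine(IsValidUtf8(latin1)); // lone 0xED byte is not valid UTF-8
    }
}
```

A check like this only answers "is it UTF-8 or not"; it cannot tell which single-byte encoding was used when the answer is no.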
Try using the
HtmlDocument.Load(string path, Encoding encoding)
method. Read this post for more info.
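To see why the encoding argument matters, here is a minimal sketch using only System.Text. It assumes the page is UTF-8 and that the raw response bytes are available; the question does not show what API_GET does internally, so both are assumptions. The class HtmlDecoding and the helper DecodeBody are hypothetical names, not part of HtmlAgilityPack.

```csharp
using System;
using System.Text;

public static class HtmlDecoding
{
    // Hypothetical helper: decodes raw response bytes with the charset the
    // server declared, falling back to UTF-8 if it is missing or unknown.
    public static string DecodeBody(byte[] body, string charset)
    {
        Encoding encoding = Encoding.UTF8;
        if (!string.IsNullOrWhiteSpace(charset))
        {
            try
            {
                encoding = Encoding.GetEncoding(charset.Trim().Trim('"'));
            }
            catch (ArgumentException)
            {
                // Unknown charset name: keep the UTF-8 default.
            }
        }
        return encoding.GetString(body);
    }

    public static void Main()
    {
        // Stand-in for the bytes of a UTF-8 page.
        byte[] body = Encoding.UTF8.GetBytes("aparelho nas saídas");

        // Wrong decoder: "í" is two bytes in UTF-8, and the ASCII decoder
        // turns each of them into '?', giving "aparelho nas sa??das".
        string garbled = Encoding.ASCII.GetString(body);
        Console.WriteLine(garbled);

        // Matching decoder: the original text is recovered.
        string html = DecodeBody(body, "utf-8");
        Console.WriteLine(html);

        // html could then be handed to doc.LoadHtml(html) as in the question.
    }
}
```

Once the bytes have been decoded with the wrong encoding, the "?" characters cannot be converted back. The fix has to happen where the bytes are first turned into a string, which is what passing an Encoding to Load does.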