How to get the content of a French website
I have a site whose content is in French. I want to fetch that content with HttpWebRequest and HttpWebResponse in a C# console application.
    public string GetContents(string url)
    {
        StreamReader _Answer;
        try
        {
            HttpWebRequest WebReq = (HttpWebRequest)WebRequest.Create(url);
            WebReq.Headers.Add(HttpRequestHeader.AcceptEncoding, "utf-8");
            WebReq.UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0;Windows NT 5.1;)";
            WebReq.ContentType = "application/x-www-form-urlencoded";
            HttpWebResponse WebResp = (HttpWebResponse)WebReq.GetResponse();
            Stream Answer = WebResp.GetResponseStream();
            Encoding encode = System.Text.Encoding.GetEncoding("utf-8");
            _Answer = new StreamReader(Answer, Encoding.UTF8);
            return _Answer.ReadToEnd();
        }
        catch
        {
        }
        return "";
    }
I get the content, but it contains strange symbols such as squares.
Comments (1)
Are you sure the web server is responding with UTF-8 encoding?
Update:
The web server you are downloading from serves its pages with a character encoding of ISO-8859-1, not UTF-8. You have to either (a) change your hard-coded encoding or (b) read the character set from the server response's content type and use that.
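Option (b) might look roughly like the sketch below. It is not a definitive implementation: the class name PageFetcher and the helper ResolveEncoding are made up for illustration, and falling back to ISO-8859-1 when the server names no usable charset is an assumption that happens to match this particular site. The sketch relies on HttpWebResponse.CharacterSet, which exposes the charset from the response's Content-Type header. It also drops the empty catch block so that network errors surface instead of silently returning an empty string.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public static class PageFetcher
{
    // Hypothetical helper: maps a charset name (e.g. from the Content-Type
    // header) to an Encoding. Falls back to ISO-8859-1 when the name is
    // missing or unknown; that fallback is an assumption, not a rule.
    public static Encoding ResolveEncoding(string charset)
    {
        Encoding fallback = Encoding.GetEncoding("ISO-8859-1");
        if (string.IsNullOrWhiteSpace(charset))
        {
            return fallback;
        }
        try
        {
            // Some servers quote the value, e.g. charset="utf-8".
            return Encoding.GetEncoding(charset.Trim().Trim('"'));
        }
        catch (ArgumentException)
        {
            // GetEncoding throws ArgumentException for unsupported names.
            return fallback;
        }
    }

    public static string GetContents(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0;Windows NT 5.1;)";

        // Dispose the response, stream, and reader even if reading fails.
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream body = response.GetResponseStream())
        using (StreamReader reader = new StreamReader(body, ResolveEncoding(response.CharacterSet)))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Option (a) is the one-line variant: keep the original method and pass Encoding.GetEncoding("ISO-8859-1") to the StreamReader instead of Encoding.UTF8. The squares appear because bytes such as 0xE9 (é in ISO-8859-1) are not valid UTF-8 on their own, so the UTF-8 decoder substitutes a replacement character. Note that some pages declare their charset only in an HTML meta tag, which this sketch does not inspect.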