How to get the content of a web page and save it to a string variable

Posted on 2024-10-08 12:15:37

How can I get the content of a web page using ASP.NET? I need to write a program that fetches the HTML of a web page and stores it in a string variable.

Comments (5)

无远思近则忧 2024-10-15 12:15:37

You can use WebClient:

using System.Net;

// Dispose the client once the download is done.
using (WebClient client = new WebClient())
{
    string downloadString = client.DownloadString("http://www.gooogle.com");
}

浮光之海 2024-10-15 12:15:37

I've run into issues with WebClient.DownloadString before. If you do too, you can try this:

using System.IO;
using System.Net;

WebRequest request = WebRequest.Create("http://www.google.com");
string html = String.Empty;
// Dispose the response, the stream, and the reader when finished.
using (WebResponse response = request.GetResponse())
using (Stream data = response.GetResponseStream())
using (StreamReader sr = new StreamReader(data))
{
    html = sr.ReadToEnd();
}

吃素的狼 2024-10-15 12:15:37

I recommend not using WebClient.DownloadString. At least in .NET 3.5, DownloadString is not smart enough to detect and strip the BOM if one is present. As a result, the BOM (the bytes EF BB BF, which can show up as ï»¿) may incorrectly appear as part of the string when UTF-8 data is returned, at least when the response specifies no charset.

Instead, this slight variation works correctly with BOMs:

using System.IO;
using System.Net;
using System.Text;

string ReadTextFromUrl(string url) {
    // WebClient is still convenient
    // Assume UTF8, but detect BOM - could also honor response charset I suppose
    using (var client = new WebClient())
    using (var stream = client.OpenRead(url))
    using (var textReader = new StreamReader(stream, Encoding.UTF8, true)) {
        return textReader.ReadToEnd();
    }
}
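
To see what BOM detection changes without touching the network, here is a minimal sketch that decodes the same bytes two ways. The class name BomDemo and both helper methods are made up for illustration, and DecodeNaively only stands in for any decoder that ignores the BOM; it is not meant to reproduce DownloadString's exact behavior.

```csharp
using System;
using System.IO;
using System.Text;

public static class BomDemo
{
    // Mirrors the StreamReader approach from the answer:
    // assume UTF-8, but let the reader detect and skip a BOM.
    public static string DecodeWithBomDetection(byte[] bytes)
    {
        using (var stream = new MemoryStream(bytes))
        using (var reader = new StreamReader(stream, Encoding.UTF8, true))
        {
            return reader.ReadToEnd();
        }
    }

    // Hypothetical stand-in for a decoder with no BOM handling:
    // Encoding.GetString decodes every byte, including the BOM.
    public static string DecodeNaively(byte[] bytes)
    {
        return new UTF8Encoding(false).GetString(bytes);
    }

    public static void Main()
    {
        // Simulated response body: UTF-8 BOM (EF BB BF) followed by "<html>".
        byte[] body = { 0xEF, 0xBB, 0xBF, 0x3C, 0x68, 0x74, 0x6D, 0x6C, 0x3E };

        string naive = DecodeNaively(body);
        string detected = DecodeWithBomDetection(body);

        Console.WriteLine(naive.Length);          // 7: U+FEFF plus six characters
        Console.WriteLine(detected.Length);       // 6: the BOM was skipped
        Console.WriteLine(naive[0] == '\uFEFF');  // True
    }
}
```

The stray U+FEFF at the start of the naive result is the kind of leading character that can break string comparisons or HTML parsing later on.
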
み格子的夏天 2024-10-15 12:15:37

WebClient client = new WebClient();
string content = client.DownloadString(url);

Pass the URL of the page you want to fetch. You can parse the result using Html Agility Pack.

贵在坚持 2024-10-15 12:15:37

I have always used WebClient, but at the time of this post (.NET 6 is available), WebClient is being deprecated.

The preferred way is:

using System.Net.Http;

HttpClient client = new HttpClient();
string content = await client.GetStringAsync(url);
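
The two lines above need an async context to compile. A minimal sketch of a complete console program might look like the following; the PageFetcher class, the GetHtmlAsync helper, and the URL check are assumptions added for illustration, and async Main requires C# 7.1 or later.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class PageFetcher
{
    // One shared HttpClient for the whole process, rather than one per request.
    private static readonly HttpClient Client = new HttpClient();

    // Hypothetical helper: validates the URL, then downloads the page as a string.
    public static async Task<string> GetHtmlAsync(string url)
    {
        if (!Uri.TryCreate(url, UriKind.Absolute, out Uri uri) ||
            (uri.Scheme != Uri.UriSchemeHttp && uri.Scheme != Uri.UriSchemeHttps))
        {
            throw new ArgumentException("Expected an absolute http or https URL.", nameof(url));
        }

        // GetStringAsync throws HttpRequestException on a non-success status code.
        return await Client.GetStringAsync(uri);
    }

    public static async Task Main()
    {
        try
        {
            string html = await GetHtmlAsync("http://www.google.com");
            Console.WriteLine($"Downloaded {html.Length} characters.");
        }
        catch (HttpRequestException ex)
        {
            Console.WriteLine($"Request failed: {ex.Message}");
        }
    }
}
```

In an ASP.NET application you would typically await GetHtmlAsync from an async action or handler instead of from Main.
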