matweb.com: How do I get the page source?

Posted on 2024-10-08 16:40:08

I have a URL like this:
http://www.matweb.com/search/DataSheet.aspx?MatGUID=849e2916ab1541be9ff6a17b78f95c82

I want to download the source of that page using this code:

// Requires: using System; using System.IO; using System.Net;
private static string urlTemplate = @"http://www.matweb.com/search/DataSheet.aspx?MatGUID=";

static string GetSource(string guid)
{
    try
    {
        Uri url = new Uri(urlTemplate + guid);

        HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(url);
        webRequest.Method = "GET";

        // Dispose the response, stream, and reader so the connection is released.
        using (HttpWebResponse webResponse = (HttpWebResponse)webRequest.GetResponse())
        using (Stream responseStream = webResponse.GetResponseStream())
        using (StreamReader responseStreamReader = new StreamReader(responseStream))
        {
            return responseStreamReader.ReadToEnd();
        }
    }
    catch (Exception)
    {
        // Any failure (bad URL, network error, HTTP error status) is reported as null.
        return null;
    }
}

When I do so I get:

You do not seem to have cookies enabled. MatWeb Requires cookies to be enabled.

OK, I understand that, so I added these lines:

CookieContainer cc = new CookieContainer();
webRequest.CookieContainer = cc;

Then I got:

Your IP Address has been restricted due to excessive use. The problem may be compounded when an IP address may be shared by many people in a company or through an internet service provider. We apologize for any inconvenience.

I can understand this, but I don't get this message when I visit the page in a web browser. What can I do to get the source? Do I need to send certain cookies or HTTP headers?

Comments (3)

書生途 2024-10-15 16:40:08

It probably doesn't like your UserAgent. Try this:

webRequest.UserAgent = "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13 (.NET CLR 3.5.30729)"; //maybe substitute your own in here
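As a rough sketch of how this suggestion could be combined with the cookie container from the question, the helper below builds a request that carries both. The class name MatWebRequests and the method CreateRequest are hypothetical names used only for illustration, and whether the site accepts such a request is an assumption that has not been verified here.

```csharp
using System;
using System.Net;

static class MatWebRequests
{
    // Example browser User-Agent taken from the answer above; substitute your own.
    const string BrowserUserAgent =
        "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13 (.NET CLR 3.5.30729)";

    // Hypothetical helper: builds a GET request that carries both a cookie
    // container and a browser-like User-Agent. Nothing is sent over the
    // network until GetResponse() is called on the returned request.
    public static HttpWebRequest CreateRequest(string url, CookieContainer cookies)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(new Uri(url));
        request.Method = "GET";
        request.CookieContainer = cookies;
        request.UserAgent = BrowserUserAgent;
        return request;
    }
}
```

Passing the same CookieContainer instance to every call means cookies set by one response are sent with later requests, which is closer to how a browser session behaves.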

你与昨日 2024-10-15 16:40:08

If you got an "excessive use" response, it looks like you're doing something that the company doesn't like.

清旖 2024-10-15 16:40:08

You are downloading pages too fast.

When you use a browser, you might load up to one page per second. Using an application, you can fetch several pages per second, and that's probably what their web server is detecting. Hence the "excessive use" message.
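
One way to slow an application down to something closer to browser speed is to enforce a minimum gap between requests. The sketch below is a minimal illustration of that idea. RequestThrottle and WaitForTurn are hypothetical names, the class is not thread-safe, and any interval you choose is a guess, since the site's actual limit is not stated anywhere in this thread.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper: enforces a minimum gap between consecutive requests.
// Intended for use from a single thread.
sealed class RequestThrottle
{
    private readonly TimeSpan minInterval;
    private readonly Stopwatch sinceLast = new Stopwatch();

    public RequestThrottle(TimeSpan minInterval)
    {
        this.minInterval = minInterval;
    }

    // Blocks until at least minInterval has passed since the previous call.
    // The first call returns immediately.
    public void WaitForTurn()
    {
        if (sinceLast.IsRunning)
        {
            TimeSpan remaining = minInterval - sinceLast.Elapsed;
            if (remaining > TimeSpan.Zero)
            {
                Thread.Sleep(remaining);
            }
        }
        sinceLast.Restart();
    }
}
```

For example, you could create new RequestThrottle(TimeSpan.FromSeconds(2)) once and call WaitForTurn() before each GetSource(guid) call. The two-second value is only an illustrative starting point.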
