Getting "The remote server returned an error: (503) Server Unavailable" from http://toolbarqueries.google.com/search?q=info:(domain name)
I am trying to create a Windows service. Its purpose is to pick up URLs from a database and check their PageRank with Google, in order to catch anyone faking their PageRank. I found some code at http://www.codeproject.com/KB/aspnet/Google_Pagerank.aspx and used it.
Here is the code:
public static int GetPageRank()
{
    string file = "http://toolbarqueries.google.com/search?q=info:codeproject.com";
    try
    {
        // Request PR from Google; "using" releases the response and reader
        // even if reading the body throws.
        WebRequest request = WebRequest.Create(file);
        string data;
        using (WebResponse response = request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            data = reader.ReadToEnd();
        }

        // Parse PR from string: keep only the text after the last ':'
        if (data.IndexOf(':') != -1)
        {
            data = data.Substring(data.LastIndexOf(':') + 1);
        }

        // int.TryParse sets its out argument to 0 on failure, so restore -1 explicitly.
        int pageRank;
        if (!int.TryParse(data, out pageRank))
        {
            pageRank = -1;
        }
        return pageRank;
    }
    catch (Exception ex)
    {
        MessageBox.Show(ex.Message);
        return -1;
    }
}
What happens is that after this method has been called a number of times, around 100, I start getting the following exception: "The remote server returned an error: (503) Server Unavailable". I have done some research and have seen a related question on Stack Overflow as well. Apparently Google stops serving requests if too many of them originate from the same IP. Are there any workarounds that would let me check several thousand PageRanks in, say, two or three hours?
Comments (2)
Nope. You're simply requesting too much data. There might be a JSON or XML API to get batch responses, but I am not aware of any from Google.
In the end, we got proxies from a proxy provider and used them. We had to use a semaphore so that each thread would be assigned a new proxy, while ensuring that no proxy is used more than 3 times a minute and that proxies are rotated in a circular, sequential manner. There is no other workaround for this.
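For anyone wanting to try the same approach, here is a minimal sketch of that kind of rotation. It is not our production code. The `ProxyRotator` class, its member names, the injectable clock, and the one-second polling interval are all hypothetical and chosen only for illustration. It uses a `SemaphoreSlim` as the gate so that one thread picks a proxy at a time, walks the proxy list in circular order, and skips any proxy that has already been used the maximum number of times within the window.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical helper (not from the original post): hands out proxies in
// round-robin order and never lets one proxy exceed maxUsesPerWindow per window.
public sealed class ProxyRotator
{
    private readonly string[] _proxies;              // proxy addresses supplied by the caller
    private readonly Queue<DateTime>[] _recentUses;  // timestamps of recent uses, one queue per proxy
    private readonly int _maxUsesPerWindow;
    private readonly TimeSpan _window;
    private readonly Func<DateTime> _clock;          // injectable so the logic can be tested
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1); // one thread picks at a time
    private int _next;                               // index of the next proxy to try

    public ProxyRotator(IList<string> proxies, int maxUsesPerWindow, TimeSpan window,
                        Func<DateTime> clock = null)
    {
        if (proxies == null || proxies.Count == 0)
            throw new ArgumentException("At least one proxy is required.", "proxies");
        _proxies = new string[proxies.Count];
        proxies.CopyTo(_proxies, 0);
        _recentUses = new Queue<DateTime>[_proxies.Length];
        for (int i = 0; i < _recentUses.Length; i++)
            _recentUses[i] = new Queue<DateTime>();
        _maxUsesPerWindow = maxUsesPerWindow;
        _window = window;
        _clock = clock ?? (() => DateTime.UtcNow);
    }

    // Returns the next proxy with spare capacity, or null if every proxy is at its limit.
    public string TryAcquire()
    {
        _gate.Wait();
        try
        {
            DateTime now = _clock();
            for (int attempt = 0; attempt < _proxies.Length; attempt++)
            {
                int i = _next;
                _next = (_next + 1) % _proxies.Length;   // circular, sequential rotation

                Queue<DateTime> uses = _recentUses[i];
                while (uses.Count > 0 && now - uses.Peek() >= _window)
                    uses.Dequeue();                      // forget uses older than the window

                if (uses.Count < _maxUsesPerWindow)
                {
                    uses.Enqueue(now);                   // record this use
                    return _proxies[i];
                }
            }
            return null;
        }
        finally
        {
            _gate.Release();
        }
    }

    // Blocks the calling thread until some proxy has capacity, polling once per second.
    public string Acquire()
    {
        string proxy;
        while ((proxy = TryAcquire()) == null)
            Thread.Sleep(TimeSpan.FromSeconds(1));
        return proxy;
    }
}
```

Each worker thread would then call something like `request.Proxy = new WebProxy(rotator.Acquire());` before `request.GetResponse()`. As a rough capacity check, assuming for illustration 5,000 URLs in two hours: 5,000 / 120 is about 42 requests per minute, which at 3 uses per proxy per minute needs about 14 proxies.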