HttpWebRequest: simulating a click
I am working with HttpWebRequest and trying to search Google, get the results, and simulate a click on a desired link. Is that possible?
// Requires: using System; using System.IO; using System.Net; using System.Text; using System.Web;
// Build the search URL; searchTerm is URL-encoded before substitution.
string raw = "http://www.google.com/search?hl=en&q={0}&aq=f&oq=&aqi=n1g10";
string search = string.Format(raw, HttpUtility.UrlEncode(searchTerm));
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(search);
request.Proxy = prox;
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
{
    using (StreamReader reader = new StreamReader(response.GetResponseStream(), Encoding.ASCII))
    {
        // ReadToEnd() returns a string and consumes the stream, so read once and reuse the result.
        string html = reader.ReadToEnd();
        browserA = html;
        this.Invoke(new EventHandler(IE1));
    }
}
Comments (2)
A better option is to use one of Google's APIs.
There is a list of all of them here: Google APIs
Here is another on CodePlex: Google Dot Net
They provide services that let applications use Google freely. Most of them come with WSDL files that you can use with "Add Web Reference" in Visual Studio.
Regex and the Html Agility Pack should only be a last resort, for when a website does not expose public services (I recently had to use them for something I'm writing that integrates with uTorrent and BtJunkie). Google clearly wants people to develop against its sites through these APIs.
You could parse the page using http://htmlagilitypack.codeplex.com/ or http://www.justagile.com/linq-to-html.aspx (you may also use regular expressions alongside these tools if needed) to find the elements you want to "click", and then issue a new HttpWebRequest for the URL each element points to. This is called web scraping: http://en.wikipedia.org/wiki/Web_scraping.
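To make the "find the link, then request it" idea concrete, here is a minimal sketch. It deliberately uses a regular expression from the standard library instead of the Html Agility Pack so that it stays self-contained; a real HTML parser would be more robust. The class name LinkClicker and its two methods are hypothetical names made up for this illustration, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
static class LinkClicker
{
    // Naive matcher for <a ... href="..."> attributes; it will miss unusual markup.
    static readonly Regex HrefPattern = new Regex(
        "<a\\s[^>]*?href\\s*=\\s*[\"']([^\"']+)[\"']",
        RegexOptions.IgnoreCase);

    // Returns every http/https link target on the page as an absolute URI.
    public static List<Uri> ExtractLinks(string html, Uri pageUri)
    {
        var links = new List<Uri>();
        foreach (Match m in HrefPattern.Matches(html))
        {
            // Attribute values are HTML-encoded (for example &amp;), so decode them first.
            string href = WebUtility.HtmlDecode(m.Groups[1].Value);
            Uri absolute;
            // Resolve relative links against the URL of the page they came from.
            if (Uri.TryCreate(pageUri, href, out absolute) &&
                (absolute.Scheme == Uri.UriSchemeHttp || absolute.Scheme == Uri.UriSchemeHttps))
            {
                links.Add(absolute);
            }
        }
        return links;
    }

    // "Clicking" a link amounts to issuing a GET for its target,
    // with the originating page sent as the Referer header.
    public static HttpWebRequest CreateClickRequest(Uri target, Uri referer)
    {
        var request = (HttpWebRequest)WebRequest.Create(target);
        request.Method = "GET";
        request.Referer = referer.AbsoluteUri;
        return request;
    }
}
```

With the question's code, the string read from the response and the search URL would be passed to ExtractLinks, and the link you pick would go to CreateClickRequest before calling GetResponse() on the result. Note that HttpWebRequest does not run JavaScript, so this only works for links that are plain hrefs in the returned HTML.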
Also remember that the site you are scraping may ban your IP address if a lot of requests come from it; to avoid that, consider using a list of proxy servers.
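One way to use a list of proxy servers is to rotate through it, assigning a different proxy to each request. The sketch below shows that idea only; ProxyRotator is a hypothetical class written for this illustration, and any proxy addresses you pass in are your own. It is not thread-safe as written.

```csharp
using System;
using System.Collections.Generic;
using System.Net;

// Hypothetical helper: hands out proxies from a fixed list in round-robin order.
sealed class ProxyRotator
{
    private readonly List<WebProxy> _proxies = new List<WebProxy>();
    private int _next;

    public ProxyRotator(IEnumerable<string> proxyAddresses)
    {
        foreach (string address in proxyAddresses)
            _proxies.Add(new WebProxy(address));
        if (_proxies.Count == 0)
            throw new ArgumentException("At least one proxy address is required.");
    }

    // Returns the next proxy and advances, wrapping around at the end of the list.
    public WebProxy Next()
    {
        WebProxy proxy = _proxies[_next];
        _next = (_next + 1) % _proxies.Count;
        return proxy;
    }

    // Creates a request that will be sent through the next proxy in the rotation.
    public HttpWebRequest CreateRequest(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Proxy = Next();
        return request;
    }
}
```

In the question's code, the line `request.Proxy = prox;` would be replaced by creating the request through CreateRequest. Rotating proxies does not change what the site's terms of use allow, so spacing requests out is still advisable.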