Scraping / simulated browsing help
I want to make a program that simulates a user browsing a site and clicking on links. Cookies and JavaScript have to be enabled. I've successfully done this in Python, but I want to write it in a compilable language (Python IDEs don't cut it). The links on the site are generated with JavaScript and are dynamic. With Python I used PAMIE (a third-party module that uses win32com) to launch an instance of Internet Explorer, scrape the generated HTML for the links, then navigate to one of them. The point is for the whole process to be transparent to the server. What's the best (compilable) language and method to do this? I was thinking of C# with the WebBrowser control, but I don't want to spend a lot of time learning something if it isn't going to work. Any kind help is appreciated!
Comments (3)
You might want to look at browser-based automated testing suites:
http://www.teknologika.com/blog/the-holy-grail-net-automated-web-gui-testing-for-internet-explorer/
http://watin.sourceforge.net/
I wrote a blog post on this a while back: Web scraping in .NET. It discusses cookies but not JavaScript; I don't know whether JavaScript would require additional coding.
Might be worth having a look at Selenium.
We use it for web testing in a C# ASP.NET environment.
The documentation isn't too bad.
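For anyone wanting to see the shape of the Selenium approach, here is a minimal Python sketch of the flow described in the question: load a page in a real browser, read the HTML after JavaScript has run, pull out the links, and navigate to one. It is only an illustration under some assumptions: the names `LinkCollector`, `extract_links`, and `browse_once` are made up for this example, Firefox is an arbitrary browser choice, and the Selenium package plus a matching browser driver would need to be installed. Selenium's C# bindings follow a similar pattern, but the code below is not taken from any of the answers above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects absolute URLs from <a href="..."> tags (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Skip anchors with no target and javascript: pseudo-links.
        if href and not href.lower().startswith("javascript:"):
            self.links.append(urljoin(self.base_url, href))


def extract_links(rendered_html, base_url):
    """Return the absolute link URLs found in already-rendered HTML."""
    parser = LinkCollector(base_url)
    parser.feed(rendered_html)
    return parser.links


def browse_once(start_url):
    """Open start_url in a real browser and follow the first link found.

    Assumes Selenium and a Firefox driver are available; imported lazily
    so that extract_links can be used without Selenium installed.
    """
    from selenium import webdriver

    driver = webdriver.Firefox()
    try:
        driver.get(start_url)
        # page_source is read after the page's scripts have had a chance
        # to run, so script-generated links are normally included.
        links = extract_links(driver.page_source, driver.current_url)
        if links:
            driver.get(links[0])
        return driver.current_url
    finally:
        driver.quit()
```

Calling `browse_once("http://example.com/")` (a placeholder URL) would open a browser window, so cookies and JavaScript behave as they do for a normal visitor. Two caveats: links that appear only after a delay may need an explicit wait before reading `page_source`, and navigating to an extracted URL is not identical to a user clicking it (for instance, the Referer header may differ), so locating the link element and clicking it may be closer to what the question asks for.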