How to programmatically parse an HTML file and submit information
I am using ASP.NET 4 and C#.
I would like to know which code and classes could be useful for creating a web application that could:
01 - Connect to an HTML file on the web.
02 - Parse its content (text content).
03 - Find specific content in a page (for example, looking for specific keywords).
Also, how can I implement:
04 - Submitting information programmatically in an HTML page (filling in forms).
I am interested in understanding the classes, the general practice, and the code for accomplishing this task.
If you have any ideas, please let me know. Thanks, guys, once again for your support! :-)
Comments (3)
I'm not sure whether you want everything you mention to run server-side, but assuming that is the case:
Check out the WebClient class, and the HttpWebRequest class for more advanced scenarios.
You might want to look at the Html Agility Pack or, if Bobince doesn't notice, regular expressions.
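As a rough illustration of steps 01 to 03, here is a minimal sketch that downloads a page with WebClient, strips the markup with the Html Agility Pack, and looks for keywords. The class name PageScanner and its method names are made up for this example, the UTF-8 encoding is an assumption about the target page, and the Html Agility Pack is a third-party NuGet package that has to be installed separately.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Text;
using HtmlAgilityPack; // third-party NuGet package "HtmlAgilityPack"

// Hypothetical helper class, named for this example only.
public static class PageScanner
{
    // Step 01: download the raw HTML of a page.
    public static string DownloadHtml(string url)
    {
        using (var client = new WebClient())
        {
            client.Encoding = Encoding.UTF8; // assumption: the page is served as UTF-8
            return client.DownloadString(url);
        }
    }

    // Step 02: reduce the HTML to its text content.
    public static string ExtractText(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null when nothing matches, so guard before iterating.
        var scriptsAndStyles = doc.DocumentNode.SelectNodes("//script|//style");
        if (scriptsAndStyles != null)
        {
            // Copy to a list first so removal does not disturb the iteration.
            foreach (var node in scriptsAndStyles.ToList())
            {
                node.Remove();
            }
        }

        // InnerText still contains entities such as &amp;, so decode them.
        return HtmlEntity.DeEntitize(doc.DocumentNode.InnerText);
    }

    // Step 03: return the keywords that occur in the text (case-insensitive).
    public static List<string> FindKeywords(string text, IEnumerable<string> keywords)
    {
        var found = new List<string>();
        foreach (var keyword in keywords)
        {
            if (text.IndexOf(keyword, StringComparison.OrdinalIgnoreCase) >= 0)
            {
                found.Add(keyword);
            }
        }
        return found;
    }
}
```

The three steps chain together as PageScanner.FindKeywords(PageScanner.ExtractText(PageScanner.DownloadHtml(url)), keywords). Keeping them as separate methods means the parsing and keyword logic can be exercised with a plain string, without any network access.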
Submitting a form typically requires sending an HTTP POST request, which can also be done with the HttpWebRequest class.
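To make the POST idea concrete, here is a minimal sketch of submitting form fields with HttpWebRequest. FormPoster, EncodeForm and Post are names invented for this example. The URL and the field names have to come from the real form (its action attribute and the name attributes of its inputs), and the sketch assumes the form uses the default application/x-www-form-urlencoded encoding and that the response is UTF-8.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical helper class, named for this example only.
public static class FormPoster
{
    // Builds a body such as "name=John%20Smith&q=a%26b" from the form fields.
    public static string EncodeForm(IEnumerable<KeyValuePair<string, string>> fields)
    {
        var sb = new StringBuilder();
        foreach (var field in fields)
        {
            if (sb.Length > 0) sb.Append('&');
            sb.Append(Uri.EscapeDataString(field.Key));
            sb.Append('=');
            sb.Append(Uri.EscapeDataString(field.Value));
        }
        return sb.ToString();
    }

    // Sends the fields as an HTTP POST and returns the response body.
    public static string Post(string url, IEnumerable<KeyValuePair<string, string>> fields)
    {
        byte[] body = Encoding.UTF8.GetBytes(EncodeForm(fields));

        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        request.ContentLength = body.Length;

        // Write the encoded fields into the request body.
        using (Stream stream = request.GetRequestStream())
        {
            stream.Write(body, 0, body.Length);
        }

        // Assumption: the server answers in UTF-8.
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

A Dictionary<string, string> can be passed directly as the fields argument. Forms that rely on cookies or hidden fields (for example ASP.NET view state) need those values fetched from the page first and sent back with the request, which this sketch does not do.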
To parse the web page, have a look at the HTML Agility Pack.
For form submission, you need to see what is sent over the network, either with tools like Firebug or the Internet Explorer developer tools, or with a sniffer like Wireshark.
In your case, I would also consider splitting it into separate components so that you can easily test parts of the process.
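One possible way to do that split, sketched under my own assumptions: put the download behind a small interface so the keyword logic can be tested against canned HTML. All type names here (IPageSource, WebPageSource, CannedPageSource, KeywordChecker) are invented for this example.

```csharp
using System;
using System.Net;

// Component 1: anything that can return the HTML for a URL.
public interface IPageSource
{
    string GetHtml(string url);
}

// Real implementation that goes over the network.
public class WebPageSource : IPageSource
{
    public string GetHtml(string url)
    {
        using (var client = new WebClient())
        {
            return client.DownloadString(url);
        }
    }
}

// Test double that returns fixed HTML without touching the network.
public class CannedPageSource : IPageSource
{
    private readonly string _html;

    public CannedPageSource(string html)
    {
        _html = html;
    }

    public string GetHtml(string url)
    {
        return _html;
    }
}

// Component 2: keyword lookup that depends only on the interface.
public class KeywordChecker
{
    private readonly IPageSource _source;

    public KeywordChecker(IPageSource source)
    {
        _source = source;
    }

    // Case-insensitive search over the raw HTML (markup included, for brevity).
    public bool PageMentions(string url, string keyword)
    {
        string html = _source.GetHtml(url);
        return html.IndexOf(keyword, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

In production code the checker would be built with new KeywordChecker(new WebPageSource()), while a unit test passes a CannedPageSource instead.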
Use an HttpWebRequest to request a page on the web.
You can then parse the HTML response.
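A minimal sketch of such a request, for illustration only. PageFetcher, Get and ResolveEncoding are hypothetical names, and the timeout and user agent values are arbitrary. The response is decoded with the character set the server declares, falling back to UTF-8 as an assumption when none is given or the name is unknown.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical helper class, named for this example only.
public static class PageFetcher
{
    // Maps the charset reported by the server to an Encoding, defaulting to UTF-8.
    public static Encoding ResolveEncoding(string charset)
    {
        if (string.IsNullOrEmpty(charset))
        {
            return Encoding.UTF8;
        }
        try
        {
            return Encoding.GetEncoding(charset);
        }
        catch (ArgumentException)
        {
            // Unknown or malformed charset name: fall back to UTF-8.
            return Encoding.UTF8;
        }
    }

    // Issues a GET request and returns the response body as a string.
    public static string Get(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";
        request.Timeout = 15000;                  // milliseconds, arbitrary value
        request.UserAgent = "ExampleFetcher/1.0"; // made-up user agent string

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var stream = response.GetResponseStream())
        using (var reader = new StreamReader(stream, ResolveEncoding(response.CharacterSet)))
        {
            return reader.ReadToEnd();
        }
    }
}
```

The string returned by Get can then be handed to whatever HTML parser you choose.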
To programmatically submit a form, I think you'll need to do it client-side (JavaScript):