JSP page crawler that extracts all input parameters
Do you happen to know of an open-source Java component that can scan a set of dynamic pages (JSP) and extract all the input parameters from them? Of course, a crawler can only crawl static content, not dynamic code, but my idea is to extend it to crawl a web server including all the server-side code. Naturally, I am assuming that the tool has full access to the crawled web server rather than relying on any hacks.
The idea is to build a static analyzer that can detect all parameter fields (request.getParameter() and the like) in all dynamic pages.
You cannot use a web crawler (basically, an HTML parser) to extract request parameters. At most, a crawler can scan the HTML structure. You can use, for example, Jsoup for this:
This currently prints:
If you want to scan the JSP source code for any forms/inputs, then you have to look in a different direction; that is definitely not something to call a "web crawler". Unfortunately, no such static analysis tool comes to mind. The closest you can get is to create a Filter which logs all submitted request parameters.
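For the static direction the question asks about, a crude starting point could be a regular-expression pass over the JSP source text. The following is only a minimal sketch under stated assumptions, not an existing tool: the class name JspParamScanner and the method extract are hypothetical, and it only catches request.getParameter(...) and request.getParameterValues(...) calls whose argument is a string literal. Parameter names built at runtime, EL expressions such as ${param.x}, and requests held in differently named variables are all missed.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collects parameter names that appear as string
// literals in request.getParameter(...) / request.getParameterValues(...).
class JspParamScanner {

    // Assumption: the request object is literally named "request",
    // which is the implicit object name inside JSP scriptlets.
    private static final Pattern PARAM_CALL = Pattern.compile(
        "request\\s*\\.\\s*getParameter(?:Values)?\\s*\\(\\s*\"([^\"]+)\"\\s*\\)");

    // Returns the distinct parameter names in order of first appearance.
    static Set<String> extract(String jspSource) {
        Set<String> names = new LinkedHashSet<>();
        Matcher m = PARAM_CALL.matcher(jspSource);
        while (m.find()) {
            names.add(m.group(1)); // group 1 is the literal between the quotes
        }
        return names;
    }

    public static void main(String[] args) {
        String jsp = "<% String user = request.getParameter(\"user\");\n"
                   + "   String[] tags = request.getParameterValues(\"tag\"); %>";
        System.out.println(extract(jsp));
    }
}
```

To cover a whole web application, the same method could be applied to the contents of each .jsp file under the webapp directory. A real static analyzer would need to parse the JSP and Java code properly instead of pattern matching.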