Best non-interactive way to enter a string into a form field and get the resulting text
On a website I have access to, there is a form with several input fields. In the sixth field I need to enter a string from a list of 10000 strings; a new page then appears, and I just need to count its lines. In the end I would like a table with two columns: the input string and the number of resulting lines. Since I would otherwise have to enter all 10000 strings by hand, I wonder what the best approach is to enter a string into a generic form field and get the resulting text. I have heard about curl, but I am not sure whether it is the easiest option.
P.S.
Example of the interactive way: I type a string or some words into Google search and get a new page with the search results. I have previously entered my Google username and password, so the results will probably be filtered according to my profile.
Example of the non-interactive way: a script somehow supplies my user information and the search query, and saves the search results to a text file. Imagine the same idea, but for a more complicated website like this.
Comments (2)
What you want to do is send an HTTP POST with specific data. This can be done with any proper HTTP client, and one such is libcurl (or the pycurl binding, or even the curl command line tool). In response to the POST you will probably get a redirect followed by the results, or you will need to make a separate request for the results. Once you have them, go back and do the next POST. Repeat until all POSTs are done.
Keep in mind that you may have to deal with cookies and possibly follow a redirect from the POST. A good approach is to record a "manual session" done with a browser (use Firebug, LiveHTTPHeaders, etc.) and then use that recording to help you repeat the same requests with an HTTP client.
A decent tutorial covering the basics of this kind of work can be found here: http://curl.haxx.se/docs/httpscripting.html
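To make the loop described above concrete, here is a minimal sketch in Python using only the standard library rather than libcurl. The URL, the field name `field6`, and the file names are hypothetical placeholders; the real values would come from the recorded browser session. It assumes the result page comes back directly or via a 301/302/303 redirect, which `urllib` follows on its own after a POST, and it keeps cookies in a jar for the whole run. A login request, if the site needs one, would have to go through the same opener first.

```python
import csv
import http.cookiejar
import urllib.parse
import urllib.request


def make_poster(url, field_name, extra_fields=None):
    """Build a function that POSTs one value to `url` and returns the page text.

    `url`, `field_name` and `extra_fields` are assumptions: take the real
    ones from the recorded manual session.
    """
    jar = http.cookiejar.CookieJar()  # cookies persist across all requests
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )

    def post(value):
        fields = dict(extra_fields or {})
        fields[field_name] = value
        body = urllib.parse.urlencode(fields).encode("utf-8")
        # Passing `data` makes this a POST request.
        with opener.open(url, data=body, timeout=30) as resp:
            charset = resp.headers.get_content_charset() or "utf-8"
            return resp.read().decode(charset, errors="replace")

    return post


def count_lines(text):
    """Number of lines in the returned page text."""
    return len(text.splitlines())


def run_batch(strings, post, out_path):
    """Write a two-column, tab-separated table: input string, line count."""
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(["input", "lines"])
        for s in strings:
            writer.writerow([s, count_lines(post(s))])


if __name__ == "__main__":
    # Hypothetical URL, field name and file names, for illustration only.
    post = make_poster("https://example.com/search", "field6")
    with open("strings.txt", encoding="utf-8") as f:
        inputs = [line.strip() for line in f if line.strip()]
    run_batch(inputs, post, "results.tsv")
```

Passing the `post` function into `run_batch` keeps the HTTP part swappable, so the same loop would work with pycurl or any other client. Note that this counts lines of the raw response; if "lines" means rows in the rendered page, the HTML would need to be parsed first.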
You could also use JMeter to run all the POSTs. You can use its CSV input to supply the 10000 strings, then save the results as XML and extract the data you need.