Using Mechanize with an odd HTTPS form in RoR

Posted 2024-08-14 13:38:09

I'm using RoR and trying to submit a simple search form on my college's site with mechanize. The same approach works fine for searching Google, but here the submit returns the search form again instead of the results. I'm really confused. Any advice? Thanks!

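ruby script/console
require 'mechanize'
# Older Mechanize releases expose WWW::Mechanize; newer ones use Mechanize.new
agent = WWW::Mechanize.new
agent.get("https://www.owens.edu/cgi-bin/class.pl/")
# Inspect the forms on the page
agent.page.forms
form = agent.page.forms.last
form.occ_subject = "chm"
# Comes back with the search form again instead of the results
form.submit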


Comments (1)

℡寂寞咖啡 2024-08-21 13:38:09

I've solved it! When form.submit is called without an explicit button, it assumes the last button in form.buttons is the one to use. On this page the last button belongs to the advanced search form, so the page object that comes back is another form (the more comprehensive advanced search form) rather than the results. Passing the first button explicitly fixes it:

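require 'mechanize'
# Older Mechanize releases expose WWW::Mechanize; newer ones use Mechanize.new
agent = WWW::Mechanize.new
agent.get("https://www.owens.edu/cgi-bin/class.pl/")
# Inspect the forms on the page
agent.page.forms
form = agent.page.forms.last
form.occ_subject = "chm"
# Submit with the first button (the plain search), not the advanced-search button
result = agent.submit(form, form.buttons.first)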

result.parser.css('table.cs-table-settings tr.tbl-class-fill-b td font b').map { |v| v.text.strip }

=> ["Principles of Chemistry", "Principles of Chemistry", "Principles of Chemistry", "Principles of Chemistry", …]
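
To see why the choice of button matters, here is a minimal sketch of what a client sends when a form has more than one submit button: only the clicked button's name/value pair goes into the request body, and the server can branch on it. It uses only Ruby's standard library, not Mechanize. The button names and values below are hypothetical, since the post does not show the real ones; only occ_subject comes from the code above.

```ruby
require 'uri'

# Field taken from the post; the button names/values are made up for illustration.
FIELDS = { 'occ_subject' => 'chm' }
BUTTONS = [
  { 'name' => 'search',   'value' => 'Search' },          # hypothetical simple search button
  { 'name' => 'advanced', 'value' => 'Advanced Search' }  # hypothetical advanced search button
]

# Build the urlencoded body for a submit where `button` is the one clicked.
# Only that button's name/value pair is appended to the regular fields.
def form_body(fields, button = nil)
  pairs = fields.to_a
  pairs += [[button['name'], button['value']]] if button
  URI.encode_www_form(pairs)
end

simple_body   = form_body(FIELDS, BUTTONS.first)
advanced_body = form_body(FIELDS, BUTTONS.last)

puts simple_body
puts advanced_body
```

The two bodies differ only in the button pair, which is enough for a CGI script to decide between returning results and returning the advanced form.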

Finally we get to the bottom of it! The HTML is horrible, so you will need to put your XPath hat on for this one! :)
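
For reference, the CSS selector used above can also be written as an XPath expression. The sketch below is an assumption-laden illustration: it runs against a small, well-formed stand-in snippet (not the real page, whose markup is not shown here) and uses REXML from Ruby's bundled libraries instead of the Nokogiri parser that Mechanize exposes through result.parser.

```ruby
require 'rexml/document'

# Stand-in markup that mirrors the class names from the selector in the post.
# The real page's HTML is not reproduced here and is likely far messier.
html = <<~HTML
  <table class="cs-table-settings">
    <tr class="tbl-class-fill-b"><td><font><b> Principles of Chemistry </b></font></td></tr>
    <tr class="other-row"><td><font><b>Not a course title</b></font></td></tr>
    <tr class="tbl-class-fill-b"><td><font><b>Principles of Chemistry</b></font></td></tr>
  </table>
HTML

doc = REXML::Document.new(html)

# XPath counterpart of: table.cs-table-settings tr.tbl-class-fill-b td font b
# (exact attribute match, which is stricter than a CSS class selector)
xpath = "//table[@class='cs-table-settings']//tr[@class='tbl-class-fill-b']/td/font/b"
titles = REXML::XPath.match(doc, xpath).map { |node| node.text.strip }

puts titles.inspect
```

With Mechanize itself, the same kind of expression could be passed to result.parser.xpath, since the parser is a Nokogiri document.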
