Scraping hidden HTML (when visible = false) using Hpricot (Ruby on Rails)

Posted on 2024-08-11 06:31:17 · 648 characters · 15 views · 0 comments


I've run into an issue that I can't seem to get past. I'm also brand new to Ruby on Rails, hence the number of questions.

I am attempting to scrape a webpage such as the following:

http://www.yellowpages.com.mt/Malta/Grocers-Mini-Markets-Retail-In-Malta-Gozo.aspx

I would like to scrape the addresses, the phone numbers, and the URL of the next page, which in this case is

http://www.yellowpages.com.mt/Malta/Grocers-Mini-Markets-Retail-In-Malta-Gozo+Ismol.aspx

I've tried just about everything I could think of, but nothing seems to work, apparently because these elements are set to invisible.

The address is within an h3 tag, but it does not appear to be scrapable. I've also been looking into ScRUBYt, starting from the following URL: http://www.rubyrailways.com/ajax-scraping-with-scrubyt-linkedin-google-analytics-yahoo-suggestions/, but I really can't make heads or tails of how to apply it in this case.

I would really appreciate any pointers, as this is an obstacle I need to overcome in order to move forward on my assignment. Thanks in advance for any help.
