Scraping with wildcards and PHP
I have a hard time visualizing and conceiving a way to scrape this page: http://www.morewords.com/ends-with/aw for the words themselves. Given a URL, I'd like to get the contents and then generate a PHP array with all the words listed, which in the source look like
<a href="/word/word1/">word1</a><br />
<a href="/word/word2/">word2</a><br />
<a href="/word/word3/">word3</a><br />
<a href="/word/word4/">word4</a><br />
There are a few ways I have been thinking about doing this, and I'd appreciate it if you could help me decide on the most efficient one. I'd also appreciate any advice or examples on how to achieve this. I understand it's not incredibly complicated, but I could use the help of you advanced hackers.
- Use some sort of jQuery $.each() to loop through and somehow cast them into a JS array, and then transcribe (probably heavily taxing)
- Use some sort of cURL (I don't really have much experience with cURL)
- Use some sophisticated find and replace with regex.
Comments (1)
You tagged it as PHP, so here is a PHP solution :)
CodePad.
If allow_url_fopen is disabled in php.ini, you could use cURL to get the HTML.
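The code that originally accompanied this answer is not preserved here, so what follows is only a minimal sketch of one way to do it, not the answerer's actual solution. It assumes the word links are the only anchors whose href starts with /word/ (as in the markup quoted in the question), parses the HTML with DOMDocument rather than regex, and fetches the page with cURL so that it does not depend on allow_url_fopen. The function names fetchHtml and extractWords are hypothetical, chosen for illustration.

```php
<?php
// Hypothetical helper: download a page with cURL.
// Useful when allow_url_fopen is disabled and file_get_contents() cannot open URLs.
function fetchHtml(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // give up after 10 seconds
    $html = curl_exec($ch);
    if ($html === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }
    return $html;
}

// Hypothetical helper: return the text of every <a> whose href starts with /word/.
// Assumption: only the word links on the page use that href prefix.
function extractWords(string $html): array
{
    if (trim($html) === '') {
        return [];
    }
    $dom = new DOMDocument();
    libxml_use_internal_errors(true); // real-world HTML is rarely valid; keep warnings quiet
    $dom->loadHTML($html);
    libxml_clear_errors();

    $words = [];
    foreach ($dom->getElementsByTagName('a') as $anchor) {
        if (strpos($anchor->getAttribute('href'), '/word/') === 0) {
            $words[] = trim($anchor->textContent);
        }
    }
    return $words;
}

// Offline demonstration using markup shaped like the snippet in the question.
$sample = '<a href="/word/word1/">word1</a><br />'
        . '<a href="/word/word2/">word2</a><br />'
        . '<a href="/about/">About</a>';
print_r(extractWords($sample));
```

Against the live page, the call would be extractWords(fetchHtml('http://www.morewords.com/ends-with/aw')), provided the site still serves markup in that shape. A DOM parser is used here because it tolerates attribute order and whitespace changes that would break a hand-written regex.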