How can I scrape HTML tables from multiple websites into a single Excel file?
I have a list of URLs that I would like to scrape into *.txt format. Can anyone suggest how I can write PHP code, using regular expressions, to scrape all of the HTML tables from the listed URLs into one Excel file? I have tried doing this manually, but since the URLs are large in number it is costing me a lot of time.
For the manual scraping, I copied the HTML code into Notepad, saved it as an HTML file, and dragged and dropped that file into Excel, which gave me the Excel file I want.
Please send in your replies and provide the correct code to do so.
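An automated equivalent of that manual workflow might look roughly like the sketch below. This is only an illustration, not code from the thread: it assumes the URLs are listed one per line in a file named urls.txt (a placeholder name), uses PHP's DOMDocument/DOMXPath instead of regular expressions (which are fragile for HTML), and appends every table row to a single tables.csv file that Excel can open.

```php
<?php
// Illustrative sketch (hypothetical file names): read URLs from urls.txt,
// fetch each page, pull every <table> row out with DOMXPath, and append
// the cells to one CSV file that Excel can open directly.

$urls = file('urls.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$out  = fopen('tables.csv', 'w');

foreach ($urls as $url) {
    $html = @file_get_contents($url);   // requires allow_url_fopen; cURL is an alternative
    if ($html === false) {
        continue;                        // skip URLs that cannot be fetched
    }

    $doc = new DOMDocument();
    @$doc->loadHTML($html);              // suppress warnings from malformed markup
    $xpath = new DOMXPath($doc);

    fputcsv($out, [$url]);               // separator row so rows can be traced to their source page

    foreach ($xpath->query('//table//tr') as $row) {
        $cells = [];
        foreach ($xpath->query('.//th | .//td', $row) as $cell) {
            $cells[] = trim($cell->textContent);
        }
        if ($cells) {
            fputcsv($out, $cells);       // one CSV line per table row
        }
    }
}

fclose($out);
```

Producing a real .xlsx rather than a CSV would need a library such as PhpSpreadsheet, but Excel opens the CSV directly, which matches the drag-and-drop result described above.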
Comments (1)
You might like to look into the importHTML() function of Google Spreadsheets. Once the data is imported, you have a spreadsheet that you can download as CSV (or other formats) and manipulate pretty much any way you want.
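For example, a formula such as =IMPORTHTML("http://example.com/page.html", "table", 1) entered in a cell pulls the first table from that page into the sheet (the URL here is just a placeholder); you would repeat it per URL, or per sheet, and then export the result.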