Saving dynamic content from a web page?

Posted 2024-07-30 07:56:54

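Is it possible to save dynamic text from a website and dump it into a file on my server? The specific case I'm interested in is saving the song title from this page, http://www.z1035.com/player.php, and keeping all the song titles in a file on my server. Is this possible? What approach could I use to do it?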

白芷 2024-08-06 07:56:54

What you're referring to is usually called "scraping". Here is an article on one way to do it with PHP:

http://www.developertutorials.com/blog/php/easy-screen-scraping-in-php-simple-html-dom-library-simplehtmldom-398/

屋顶上的小猫咪 2024-08-06 07:56:54

Python's urllib library makes scraping pretty easy, in my opinion.

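import re
import urllib.request  # Python 3; in Python 2 this was urllib.urlopen

url = "http://www.z1035.com/player.php"
# Download the page and decode the raw bytes into text
with urllib.request.urlopen(url) as f:
    t = f.read().decode("utf-8", errors="replace")
# Use a regular expression here: re.search takes the pattern first, then the text.
# "some (pattern)" is a placeholder; it needs a capture group for group(1) to work.
m = re.search(r"some (pattern)", t)
if m:
    print(m.group(1))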

This loads the external resource much as if it were a local file (urlopen returns a file-like object) and lets you parse it however you need.

Once upon a time I wanted to save all the track listings for a radio show I listened to. I used Python to download a list of all the track listings, then programmatically visited each one and appended its contents to a file. It was very handy, and took about 20 lines.
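For the case in the question (collecting song titles in a file on the server), the same fetch-and-parse idea could be extended roughly as follows. This is only a sketch under assumptions: the pattern in TITLE_PATTERN is hypothetical, since the real markup of player.php is not shown here, and the helper names fetch_page, extract_title, and append_if_new are made up for illustration. If the page fills in the title with JavaScript after it loads, the downloaded HTML may not contain it at all, and you would need to find the URL the script itself requests.

```python
import re
import urllib.request

# Hypothetical pattern: assumes the title sits in an element such as
# <span id="songtitle">...</span>. Inspect the real page and adjust it.
TITLE_PATTERN = re.compile(r'<span id="songtitle">(.*?)</span>', re.DOTALL)


def fetch_page(url):
    """Download a page and return its body as text."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_title(html, pattern=TITLE_PATTERN):
    """Return the first captured group, stripped, or None if nothing matches."""
    match = pattern.search(html)
    return match.group(1).strip() if match else None


def append_if_new(path, title):
    """Append title as a new line unless it equals the last line already saved.

    Returns True if the title was written, False if it was skipped.
    """
    try:
        with open(path, encoding="utf-8") as f:
            lines = f.read().splitlines()
    except FileNotFoundError:
        lines = []
    if lines and lines[-1] == title:
        return False
    with open(path, "a", encoding="utf-8") as f:
        f.write(title + "\n")
    return True


def save_current_title(url, path):
    """Fetch the page once and record its current title, if one is found."""
    title = extract_title(fetch_page(url))
    return append_if_new(path, title) if title else False
```

Calling save_current_title("http://www.z1035.com/player.php", "songs.txt") from a scheduled job (cron, for example) every minute or so would build up the list over time; comparing against the last saved line keeps a song that is still playing from being written twice.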
