Using PHP to scrape and use data from a website that requires login (Reddit)?
I would like to create a webpage that, given two reddit usernames and their passwords, subscribes user2 to all of the subreddits that user1 is subscribed to. So I need to:
- Get the subreddits that user1 is subscribed to.
- Subscribe user2 to those subreddits.
I have experience with PHP, but none with scraping (especially when the user must be logged in), or with submitting the kind of information needed to "subscribe" a user to a subreddit. Does anyone have any ideas on how this can be done?
Regards,
Tim
Assuming this isn't against reddit's terms of service, one could use cURL to log in and then probably extract the necessary information fairly easily with a regex. From there it's a matter of checking how reddit handles subscribing and either requesting the proper URLs or posting the form data. I'd call it a medium-level task, as long as it's not against the reddit terms of service.
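A rough PHP sketch of that approach, not a tested reddit client: fetchWithSession() and extractSubreddits() are hypothetical helper names, the example.com URLs and the form field names in the usage comment are placeholders, and the /r/<name> link pattern is an assumption about the page markup. The only real point is that cURL keeps the login cookie in a cookie jar file, so later requests are made as the logged-in user.

```php
<?php
// Sketch only: helper names, URLs and form fields are assumptions for illustration.

/**
 * Send a GET (or a POST when $postFields is given) that stores and re-sends
 * cookies through $cookieJar, so a login carries over to later requests.
 */
function fetchWithSession(string $url, string $cookieJar, ?array $postFields = null): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,        // follow redirects after login
        CURLOPT_COOKIEJAR      => $cookieJar,  // write cookies here when the handle is freed
        CURLOPT_COOKIEFILE     => $cookieJar,  // read cookies from here on each request
        CURLOPT_USERAGENT      => 'subscription-copier/0.1 (example)',
    ]);
    if ($postFields !== null) {
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($postFields));
    }
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Request failed: ' . curl_error($ch));
    }
    return $body;
}

/**
 * Collect the unique subreddit names from links shaped like /r/<name>/ in an HTML page.
 */
function extractSubreddits(string $html): array
{
    preg_match_all('#href="(?:https?://[^"/]+)?/r/([A-Za-z0-9_]+)/?"#', $html, $matches);
    return array_values(array_unique($matches[1]));
}

// Hypothetical usage (placeholder URLs and field names, one cookie jar per user):
// $jar = tempnam(sys_get_temp_dir(), 'user1_');
// fetchWithSession('https://example.com/login', $jar, ['user' => 'user1', 'passwd' => 'secret']);
// $names = extractSubreddits(fetchWithSession('https://example.com/my-subreddits', $jar));
```

Subscribing user2 would then be a second login with a separate cookie jar, followed by one POST per name in $names. If the site offers the subscription list as JSON, json_decode() is likely to be sturdier than a regex over HTML.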
The open source product TestPlan is very good at such things. Using a simple language you can log in to the site as one user, grab the names of the subreddits, then log in as the other user to subscribe to them.
For example, if you just wanted the titles of the top entries you could use this code:
Which produces output like this: