Reddit RSS feed returns fewer items when requested over HTTP?
I am trying to read data from an RSS feed which has 25 items. When I request the RSS file through HTTP it says there are only 20 items.
function test($location)
{
    // Load the feed (URL or local path) and count its <item> elements.
    $doc = new DOMDocument();
    $doc->load($location);
    $items = $doc->getElementsByTagName('item');
    return $items->length;
}

// Prints 20
echo test('http://www.reddit.com/r/programming/new/.rss?after=t3_');
// Prints 25
echo test('programming.xml');
I've tried RSS feeds from other subreddits as well with the same result.
Comments (2)
I see what the issue is now. If you visit a subreddit like /r/programming/ and go to the "new" tab to see the newest submissions, there are two sorting options. The first option, "rising", only shows up-and-coming entries; the alternate sort order is "new".
Since I had chosen the "new" sort order in my browser, it saved a cookie, and that sort order was used as the default afterwards. However, requests made from code sent no such cookie and still got the default sort order, which returned a variable number of results.
I resolved the issue by appending the sort order query string to the request URL:
http://www.reddit.com/r/programming/new/.rss?sort=new
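As a rough sketch of the same fix in code, the sort order can be appended to whatever feed URL is being requested. The helper name `withSortOrder` is hypothetical (it is not part of PHP or of any Reddit API), and the sketch assumes a simple URL with a scheme, host, and path, ignoring ports and fragments:

```php
<?php
// Hypothetical helper: sets (or overrides) the `sort` query parameter
// on a feed URL while keeping any existing query parameters.
// Assumption: the URL has a scheme and host; ports and fragments are ignored.
function withSortOrder(string $url, string $sort = 'new'): string
{
    $parts = parse_url($url);

    // Parse the existing query string, if any, into an array.
    $query = [];
    if (isset($parts['query'])) {
        parse_str($parts['query'], $query);
    }
    $query['sort'] = $sort;

    $base = $parts['scheme'] . '://' . $parts['host'] . ($parts['path'] ?? '');
    return $base . '?' . http_build_query($query);
}

echo withSortOrder('http://www.reddit.com/r/programming/new/.rss'), "\n";
```

The resulting URL can then be passed to `test()` from the question in place of the original one.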
If it were having problems loading the feed, it would probably issue a warning of some sort.
Right now, your sample code for the reddit feed reports 14 items. The number of items in that feed is not constant, so the issue is that your local copy is different from the one you were loading from Reddit.
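One way to rule out a loading problem explicitly, rather than relying on a warning showing up, is to collect libxml's errors and fail loudly when parsing does not succeed. This is only a minimal sketch: `countItems` is a hypothetical helper, and it takes the feed as an XML string so that the same function can be run on both the downloaded feed and the local copy:

```php
<?php
// Hypothetical helper: counts <item> elements in an XML string and
// throws if libxml could not parse it.
function countItems(string $xml): int
{
    // Collect parse errors instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $loaded = $doc->loadXML($xml);

    $errors = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        $message = $errors ? trim($errors[0]->message) : 'unknown parse error';
        throw new RuntimeException("Could not parse feed: $message");
    }

    return $doc->getElementsByTagName('item')->length;
}

// Tiny inline feed used purely for illustration.
$sample = '<rss version="2.0"><channel>'
    . '<item><title>a</title></item>'
    . '<item><title>b</title></item>'
    . '</channel></rss>';

echo countItems($sample), "\n";
```

If both the remote feed and the local file parse without an exception but yield different counts, the two documents simply differ, which is the situation described above.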