Reading an RSS feed: what are aggregators doing that I'm not?
I drop the following feed into Google Reader, and it updates normally.
http://www.indeed.ca/rss?q=&l=Hamilton%2C+ON
However, when I use any of a number of approaches suggested thither and yon on the 'net, which simply involve reading from this source and parsing the XML, I receive the same 20 items.
What is Google Reader doing that I should be doing in my code so that I receive new items?
Thanks for your advice. Incidentally, I'm coding in Python.
Comments (2)
RSS aggregators "poll" the sources, i.e., they repeat the HTTP query periodically on each source, and check if anything new appears in the results. That's unfortunate, as polling always is, as it wastes resources in an unending series of "are we there yet?" questions (kind of like taking a toddler along in a long car drive;-), and nevertheless implies delays (if you poll a given source every hour, say, you'll wait up to an hour to see some results).
Unfortunately, in the RSS architecture itself, there are no alternatives, no way to ask for a "callback" when new stuff appears or opt for a saner "publish-subscribe architecture".
A good effort to remedy that is pubsubhubbub, but it inevitably requires cooperation (above and beyond the RSS standards) from RSS sources and aggregators -- so it needs very wide takeup before it can be called "a solution" to the problem, though, technically, it already is (for cooperating sites;-).
So back to your question, you're doing nothing wrong: you just need to poll periodically, like RSS aggregators do, in order to get to see new results eventually.
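For what it's worth, here is a minimal sketch of that kind of polling in Python, using only the standard library. The names `new_items` and `poll`, the one-hour interval, and the `max_polls` parameter are made up for illustration, and it assumes the feed is plain RSS 2.0 whose `<item>` elements carry a `<guid>` or at least a `<link>`; this is not a claim about how Google Reader itself is implemented.

```python
import time
import urllib.request
import xml.etree.ElementTree as ET

FEED_URL = "http://www.indeed.ca/rss?q=&l=Hamilton%2C+ON"  # feed from the question


def new_items(rss_xml, seen):
    """Return (title, link) pairs for items whose id is not yet in `seen`.

    `seen` is a set of item ids and is updated in place.
    """
    root = ET.fromstring(rss_xml)
    fresh = []
    for item in root.iter("item"):
        link = item.findtext("link", default="")
        # Prefer <guid> as the item's identity; fall back to <link> if absent.
        item_id = item.findtext("guid") or link
        if item_id and item_id not in seen:
            seen.add(item_id)
            fresh.append((item.findtext("title", default=""), link))
    return fresh


def poll(url=FEED_URL, interval_seconds=3600, max_polls=None):
    """Fetch `url` every `interval_seconds` and print items not seen before.

    `max_polls=None` means keep polling until interrupted (hypothetical knob).
    """
    seen = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        with urllib.request.urlopen(url, timeout=30) as response:
            body = response.read()
        for title, link in new_items(body, seen):
            print(title, link)
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(interval_seconds)
```

Calling `poll()` prints every item on the first fetch and only previously unseen ones on later fetches. The `seen` set lives in memory, so a real script would probably want to persist it between runs.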
1) Have you tried with other RSS feeds?
2) If so, it sounds like some kind of cache... Are you behind some proxy?