Discovering a website's feed URL

Posted on 2024-10-31 12:37:30, 1604 characters, 1 view, 0 comments

How can I discover a website's feed URL?

When I grab Microsoft's blog HTML, I can see this:

<link rel="alternate" type="application/rss+xml" title="Site Home (RSS 2.0)" href="http://blogs.technet.com/rss.aspx"  />
<link rel="alternate" type="application/rss+xml" title="B1ackD0g's Comments (RSS 2.0)" href="/members/B1ackD0g/comments/rss.aspx"  />
<link rel="alternate" type="application/rss+xml" title="B1ackD0g's Activities (RSS 2.0)" href="/members/B1ackD0g/activities/rss.aspx"  />
<link rel="alternate" type="application/rss+xml" title="Activities of People B1ackD0g Follows (RSS 2.0)" href="/members/B1ackD0g/activities/followersrss.aspx"  />
<link rel="alternate" type="application/rss+xml" title="B1ackD0g's Groups Activities (RSS 2.0)" href="/members/B1ackD0g/activities/groupsrss.aspx"  />
<link rel="alternate" type="application/rss+xml" title="The Official Microsoft Blog – News and Perspectives from Microsoft (RSS 2.0)" href="http://blogs.technet.com/b/microsoft_blog/rss.aspx"  />
<link rel="alternate" type="application/atom+xml" title="The Official Microsoft Blog – News and Perspectives from Microsoft (Atom 1.0)" href="http://blogs.technet.com/b/microsoft_blog/atom.aspx"  />

My assumption here is that I can look for link tags whose href starts with "http://blogs.technet.com/b/microsoft_blog/".

Is this safe to assume?

What I need to do is basically take a page URL and return its feed URL.

Comments (1)

┈┾☆殇 2024-11-07 12:37:30

There is no safe way to assume what a website's feed URL is without knowing it. In this example, the value of the type attribute seems to be enough to identify the feeds, but that attribute is not guaranteed to be set on pages outside of your example. You can try to guess by searching the markup for links containing "RSS", or even by testing against a service like FeedBurner (http://feeds.feedburner.com/somedomain), but you still cannot be sure.
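
For what it's worth, here is a minimal sketch of the "look at the type attribute" approach, assuming the page does advertise its feeds through link tags like the ones in the question. The names FeedLinkParser and find_feeds are made up for this illustration, and the result is only a list of candidates, not a guaranteed answer. It uses the Python standard library and resolves relative hrefs (such as "/members/B1ackD0g/comments/rss.aspx") against the page URL instead of relying on a fixed prefix.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types commonly used to advertise feeds (an assumption; pages may omit them).
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Hypothetical helper: collects <link rel="alternate"> tags that look like feeds."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []  # list of (absolute_url, title) tuples

    def handle_starttag(self, tag, attrs):
        # Self-closing <link ... /> tags are also routed here by HTMLParser.
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        mime = attr.get("type", "").strip().lower()
        href = attr.get("href", "").strip()
        if "alternate" in rels and mime in FEED_TYPES and href:
            # Resolve relative hrefs against the URL the page was fetched from.
            self.feeds.append((urljoin(self.base_url, href), attr.get("title", "")))


def find_feeds(html, base_url):
    """Return candidate feed URLs found in the given HTML, in document order."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds
```

If this returns several candidates, as it would for the Microsoft blog page, you still have to pick one, for example by preferring the entry whose title matches the page title. If it returns nothing, you are back to guessing.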
