Any existing RSS feed URL validators?

Posted 2024-12-06 22:49:39 · 1102 characters · 0 views · 0 comments

Before diving into writing a validator that checks whether a URL actually points to an RSS feed, I searched for validators that may already exist but had little luck finding a reliable one.

I just wanted to ask the community whether any of you know of a tool that validates an RSS feed by URL.

If I were to write my own, what do you suggest?

I was thinking of just checking for the first instance of a line of text and making sure it defines <?xml version="1.0" encoding="UTF-8"?> and then perhaps checking that the next item is an <rss> node.

What are your thoughts here? Could there ever be a case where a feed may not follow the syntax stated above?

Also note, one method I attempted to use was the following:

$valid = true;

try{
    $content = file_get_contents($feed);
    if (!simplexml_load_string($content)){
        $valid = false;
    }
} catch (Exception $e){
    $valid = false;
}

Unfortunately, it seems that I cannot suppress warnings (error_reporting(0) is not working), so it just spams me with warnings.
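One possible way around the warning spam is sketched below. It assumes the warnings come from libxml (which simplexml_load_string uses) and from file_get_contents, and it relies on libxml_use_internal_errors() to keep parse errors out of PHP's warning output. The function names is_well_formed_xml and url_returns_xml are hypothetical, chosen only for this illustration.

```php
<?php
// Returns true when $content parses as well-formed XML, without emitting warnings.
function is_well_formed_xml(string $content): bool
{
    if (trim($content) === '') {
        return false; // empty input is never a feed
    }
    // Collect libxml parse errors internally instead of raising PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($content);
    libxml_clear_errors();
    libxml_use_internal_errors($previous); // restore the caller's setting
    // Compare against false explicitly: a parsed but empty element is falsy.
    return $xml !== false;
}

// Hypothetical helper: fetch a URL and check it, silencing fetch warnings with @.
function url_returns_xml(string $feed): bool
{
    $content = @file_get_contents($feed);
    return $content !== false && is_well_formed_xml($content);
}
```

Note that simplexml_load_string reports failure by returning false rather than by throwing, so the try/catch in the snippet above would not catch a parse error. This sketch only tells you the content is well-formed XML, not that it is a feed.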


SOLUTION

For anyone who is interested, I used the W3C Validator API:

$url = "http://feed_url.com";
$validator = "http://validator.w3.org/feed/check.cgi";
// URL-encode the feed address so its own query string survives intact
$validator .= "?url=" . urlencode($url);
$validator .= "&output=soap12";

$response = file_get_contents($validator);
// Pull out the text between <m:validity> and </m:validity>
$a = strpos($response, '<m:validity>', 0) + 12;
$b = strpos($response, '</m:validity>', $a);
$result = substr($response, $a, $b - $a);
echo $result;

This will print true or false accordingly.
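The strpos/substr extraction above silently misbehaves when the validity element is missing, for instance if the request fails or returns an error page. A slightly more defensive sketch is shown here; it parses the response as XML and looks the element up by local name. The function name parse_validity is hypothetical, and the only assumption about the response is the one the snippet above already makes: that it contains a validity element holding true or false.

```php
<?php
// Extracts the validity flag from a SOAP 1.2 response body.
// Returns true or false, or null when no validity element can be found.
function parse_validity(string $response): ?bool
{
    if (trim($response) === '') {
        return null; // nothing to parse (e.g. the request failed)
    }
    // Keep libxml parse errors out of PHP's warning output.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $loaded = $doc->loadXML($response);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if (!$loaded) {
        return null; // response was not well-formed XML
    }
    $xpath = new DOMXPath($doc);
    // Match on the local name so the namespace prefix (m:) does not matter.
    $nodes = $xpath->query("//*[local-name()='validity']");
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return strtolower(trim($nodes->item(0)->textContent)) === 'true';
}
```

With this in place, the last four lines of the snippet above could become a single call such as parse_validity((string) @file_get_contents($validator)), with null treated as "could not determine".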

Comments (2)

狠疯拽 2024-12-13 22:49:39

The W3C Feed Validation Service offers a SOAP interface. From the About page:

Is there a Web Service with a public API for this service?

Yes, there is a SOAP interface, accessible by using the query parameter output="soap12" on top of a regular query. The SOAP 1.2 Web Service API documentation has more details.

过度放纵 2024-12-13 22:49:39

I would do this:

  1. Is it valid XML? If so, continue.

  2. Is the top-level element either rss or feed? If so, it's a feed. If not, it's not.

That covers all versions of Atom and every version of RSS except 1.0.

RSS 1.0 is more difficult since its top-level element is RDF, and that's a more generic format than RSS, so you'd have to look deeper for indications of RSS-ness. But luckily there's not much RSS 1.0 out there these days; most of it is RSS 2.0 or Atom 1.0.
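The two steps above, plus a simple RSS 1.0 check, can be sketched roughly as follows. The function name detect_feed_type and its return labels are hypothetical. For the RDF case, the sketch assumes that finding a channel element in the RSS 1.0 namespace (http://purl.org/rss/1.0/) is a good enough sign of RSS-ness. It checks the document's shape only and does not validate the feed against any specification.

```php
<?php
// Classifies an XML string as 'rss', 'atom', 'rss1', or null (not a feed).
function detect_feed_type(string $content): ?string
{
    if (trim($content) === '') {
        return null;
    }
    // Keep libxml parse errors out of PHP's warning output.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $loaded = $doc->loadXML($content);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if (!$loaded || $doc->documentElement === null) {
        return null; // step 1 failed: not well-formed XML
    }
    $root = $doc->documentElement;
    // Step 2: inspect the top-level element.
    switch ($root->localName) {
        case 'rss':
            return 'rss';
        case 'feed':
            return 'atom';
        case 'RDF':
            // RDF is generic, so look for a <channel> in the RSS 1.0 namespace.
            $channels = $root->getElementsByTagNameNS('http://purl.org/rss/1.0/', 'channel');
            return $channels->length > 0 ? 'rss1' : null;
    }
    return null;
}
```

A caller would fetch the URL first and pass the body in, for example detect_feed_type((string) @file_get_contents($feed)), treating null as "not a feed".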

Hope this helps, with the usual disclaimers, I am not a lawyer, etc.
