Any existing RSS feed URL validator?
Before writing my own validator to check whether a URL actually points to an RSS feed, I searched for existing ones but had little luck finding anything reliable.
Does anyone in the community know of a tool that validates an RSS feed given its URL?
If I were to write my own, what do you suggest?
I was thinking of just checking that the first line of text is <?xml version="1.0" encoding="UTF-8"?> and then perhaps checking that the next item is an <rss> node.
What are your thoughts here? Could there ever be a case where a feed may not follow the syntax stated above?
Also note, one method I attempted to use was the following:
$valid = true;
try {
    $content = file_get_contents($feed);
    if (!simplexml_load_string($content)) {
        $valid = false;
    }
} catch (Exception $e) {
    $valid = false;
}
Unfortunately, I cannot seem to suppress the warnings (error_reporting(0) is not working), so this just spams me with warnings.
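One way to keep those warnings out of the output is to let libxml collect its errors internally rather than emit them. The following is only a minimal sketch of that idea: the helper names parses_as_xml and url_serves_xml are hypothetical and not part of the original post. It also compares against false explicitly, because a SimpleXMLElement with no content can evaluate as falsy even when parsing succeeded.

```php
<?php
// Hypothetical helper (not from the original post): returns true when
// $content parses as well-formed XML, without emitting PHP warnings.
function parses_as_xml(string $content): bool
{
    // Ask libxml to store parse errors internally instead of raising warnings.
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($content);
    // Discard the collected errors and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    // Compare with false explicitly: an empty element is falsy but still valid.
    return $xml !== false;
}

// Hypothetical wrapper: fetch a URL (or path) quietly, then check the body.
function url_serves_xml(string $feed): bool
{
    // file_get_contents() reports failure with a warning and a false return
    // value, not an exception, so "@" silences it and the result is checked.
    $content = @file_get_contents($feed);
    return $content !== false && parses_as_xml($content);
}
```

This only tells you the document is well-formed XML, not that it is a feed, so it would still need a check on the root element.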
SOLUTION
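For anyone who is interested, I used the W3C Validator API:
$url = "http://feed_url.com";
// URL-encode the feed address so its own query string survives the request.
$validator = "http://validator.w3.org/feed/check.cgi";
$validator .= "?url=" . urlencode($url);
$validator .= "&output=soap12";
$response = file_get_contents($validator);
// Extract the text between <m:validity> and </m:validity>, if present.
$result = "";
if ($response !== false && ($a = strpos($response, '<m:validity>')) !== false) {
    $a += strlen('<m:validity>');
    $b = strpos($response, '</m:validity>', $a);
    if ($b !== false) {
        $result = substr($response, $a, $b - $a);
    }
}
echo $result;
This prints true or false depending on whether the feed validates.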
Comments (2)
The W3C Feed Validation Service offers a SOAP interface. From the About page:
I would do this:
Is it valid XML? If so, continue.
Is the top-level element either rss or feed? If so, it's a feed. If not, it's not.
That covers every version of Atom and every version of RSS except 1.0.
RSS 1.0 is more difficult because its top-level element is RDF, which is a more generic format than RSS, so you'd have to look deeper for indications of RSS-ness. Luckily, there's not much RSS 1.0 out there these days; most feeds are RSS 2.0 or Atom 1.0.
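The steps above could be sketched roughly as follows. The function name detect_feed_type and its return values are made up for illustration, and the RSS 1.0 branch assumes that a channel element in the http://purl.org/rss/1.0/ namespace is a good enough sign of "RSS-ness"; a stricter check may be needed in practice.

```php
<?php
// Hypothetical helper: classifies a document by its top-level element.
// Returns 'rss', 'atom', 'rss1', or null when it does not look like a feed.
function detect_feed_type(string $content): ?string
{
    // Parse quietly: collect libxml errors instead of emitting warnings.
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($content);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($xml === false) {
        return null; // not well-formed XML
    }

    // getName() gives the element name without any namespace prefix.
    $root = $xml->getName();
    if ($root === 'rss') {
        return 'rss';
    }
    if ($root === 'feed') {
        return 'atom';
    }
    if ($root === 'RDF') {
        // Assumption: an RSS 1.0 document has a <channel> child in the
        // RSS 1.0 namespace; other RDF documents do not.
        $channels = $xml->children('http://purl.org/rss/1.0/')->channel;
        return count($channels) > 0 ? 'rss1' : null;
    }
    return null;
}
```

Checking the parsed root element rather than the first line of text means the result does not depend on whether the XML declaration is present or which encoding it names.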
Hope this helps, with the usual disclaimers, I am not a lawyer, etc.