Need some help in creating a Yahoo Pipe that strips certain elements from an RSS feed.
To clarify: I would use the regex code on Yahoo Pipes. I presume the regex syntax is universal?
I've broken the question up into some sub-questions:
- What would be the regex for removing/stripping a specific HTML tag (one that has its own class)?
  Content
- How can I strip links from linked images but keep the image markup?
- How can I add sequential classes to all links found in a feed item?
  If there are 5 links in a single feed item, they would be given the classes link001, link002, link003, link004, link005...
Due to new-account limitations, the code examples can be found here:
Using Regex in Yahoo pipes
Regex is not exactly my forte... so any help would be greatly appreciated!
Thanks a lot!
Comments (1)
Regular expression syntax certainly isn't universal. See my regex flavor comparison. Unfortunately the Yahoo Pipes docs don't say what regex flavor they use. The examples look like Perl-style regexes, so that's what I'll use.
To remove a specific HTML tag (e.g. <span>) with a specific class attribute (e.g. someclass), search for: and replace with:
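A minimal sketch of what such a search-and-replace could look like, written in Python (whose re module is reasonably close to Perl style) rather than in Yahoo Pipes. The pattern, the function name strip_span, and the sample markup are assumptions made for illustration, and it assumes the element should be removed together with its content, i.e. replaced with an empty string.

```python
import re

# Hypothetical pattern: an opening <span> whose class attribute is exactly
# "someclass", then the shortest possible run of text up to the next </span>.
SPAN_WITH_CLASS = re.compile(
    r'<span\b[^>]*\bclass="someclass"[^>]*>.*?</span>',
    re.IGNORECASE | re.DOTALL,
)


def strip_span(html: str) -> str:
    """Remove every matching span, including its content."""
    return SPAN_WITH_CLASS.sub("", html)


sample = '<p>Keep <span class="someclass">Content</span>this</p>'
print(strip_span(sample))  # prints: <p>Keep this</p>
```

Spans with any other class value are left untouched, because the class attribute has to match literally.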
A regex like this will fail if the <span> tag you're trying to remove contains a nested <span> tag.
To delete any <a> tag that has an <img> tag as the first thing in its content, search for: and replace with:
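As a rough illustration, again in Python rather than Yahoo Pipes, the idea can be sketched with capturing groups: match the whole link, capture the image (and anything after it inside the link), and put only the captured text back. The pattern and the name unwrap_images are assumptions, not a reconstruction of any particular Pipes setting.

```python
import re

# Hypothetical pattern: an <a ...> tag whose content starts with an <img ...>
# tag. Group 1 is the image, group 2 is whatever else the link contains.
LINKED_IMAGE = re.compile(
    r'<a\b[^>]*>\s*(<img\b[^>]*>)(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)


def unwrap_images(html: str) -> str:
    """Drop the surrounding link but keep the image markup."""
    return LINKED_IMAGE.sub(r"\1\2", html)


sample = '<a href="http://example.com/page"><img src="pic.png" alt="x"></a>'
print(unwrap_images(sample))  # prints: <img src="pic.png" alt="x">
```

Links that start with plain text are not matched, so ordinary text links survive unchanged.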
The third item in your question cannot be done with regular expressions alone. You'll need a facility to increment the number in the replacement. I don't know if Yahoo Pipes supports something like that. For the search itself you don't really need a regex: simply search for the text <a and replace it with <a class="link001"
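Outside Yahoo Pipes, one facility of that kind is a replacement callback. Python's re.sub accepts a function as the replacement, so a counter can be advanced on every match. The sketch below assumes that approach; the names number_links and add_class are hypothetical, and it assumes the links do not already carry a class attribute.

```python
import itertools
import re

# Match the start of every opening <a ...> tag (but not <abbr>, <article>, ...).
OPEN_LINK = re.compile(r"<a\b", re.IGNORECASE)


def number_links(html: str) -> str:
    """Insert class="link001", class="link002", ... into successive links."""
    counter = itertools.count(1)  # restarts at 1 for every feed item

    def add_class(match: re.Match) -> str:
        # Assumption: the tag has no class attribute yet; otherwise this
        # would produce a duplicate attribute.
        return f'{match.group(0)} class="link{next(counter):03d}"'

    return OPEN_LINK.sub(add_class, html)


sample = '<a href="one">1</a> and <a href="two">2</a>'
print(number_links(sample))
# prints: <a class="link001" href="one">1</a> and <a class="link002" href="two">2</a>
```

Calling number_links once per feed item keeps the numbering local to that item.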
Of course, all the caveats about manipulating HTML/XML with regular expressions apply. The regexes work on the examples you gave, but they may not work as intended on every possible piece of HTML.