Yahoo Pipes: Filter items in a feed based on words in a text file
I have a pipe that filters an RSS feed and removes any item that contains "stopwords" that I've chosen. Currently I've manually created a filter for each stopword in the pipe editor, but the more logical way is to read these from a file. I've figured out how to read the stopwords out of the text file, but how do I apply the filter operator to the feed, once for every stopword?
The documentation states explicitly that operators can't be applied within the loop construct, but hopefully I'm missing something here.
Comments (2)
You're not missing anything - the filter operator can't go in a loop.
Your best bet might be to generate a regex out of the stopwords and filter using that, e.g. generate a string like
(word1|word2|word3|...|wordN)
You may have to escape any special regex characters in the words. Also, I'm not sure how long a regex can be, so you might have to split it across multiple filter rules.
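If you generate that string outside of Pipes, something along these lines might do it. This is only a rough sketch in Python: the helper names `build_stopword_patterns` and `is_blocked` are made up for illustration, the `max_len` value of 1000 is an arbitrary guess (I don't know the real limit), and the regex dialect Pipes uses may not escape things exactly the way Python's `re.escape` does.

```python
import re


def build_stopword_patterns(stopwords, max_len=1000):
    """Build one or more (a|b|c) alternation patterns from stopwords.

    max_len is an assumed cap on the characters inside each group
    (the parentheses are not counted); the real limit is unknown.
    """
    patterns = []
    current = []
    current_len = 0
    for word in stopwords:
        word = word.strip()
        if not word:
            continue  # skip blank lines from the stopword file
        escaped = re.escape(word)  # neutralise regex metacharacters
        added = len(escaped) + (1 if current else 0)  # +1 for the '|'
        if current and current_len + added > max_len:
            # current chunk is full: close it and start a new one
            patterns.append("(" + "|".join(current) + ")")
            current = []
            current_len = 0
            added = len(escaped)
        current.append(escaped)
        current_len += added
    if current:
        patterns.append("(" + "|".join(current) + ")")
    return patterns


def is_blocked(text, patterns):
    """Return True if any pattern matches the text (case-insensitive)."""
    return any(re.search(p, text, re.IGNORECASE) for p in patterns)
```

Each returned pattern would go into its own filter rule, and `is_blocked` is just there to check locally that the patterns behave as expected before pasting them into the pipe.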
In addition to Gavin Brock's answer, the following Yahoo Pipes pipe filters the feed items (title, description, link and author) against multiple stopwords:
Inputs