Filtering emoticons with sed
I have a grep expression that I run with Cygwin grep on Windows:
grep -a "\\,,/\|\\m/\|\\m/\\>\.</\\m/\|:u" all_fbs.txt > rockon_fbs.txt
Once I have identified the emoticon class, though, I want to strip those emoticons out of the data. The same regexp inside a sed substitution results in a syntax error. (Yes, I realize I could use /d instead of //g, but that makes no difference; I still get the error.)
sed "s/\(\\,,/\|\\m/\|\\m/\\>\.</\\m/\|:u\)*//g"
The full line is:
grep -a "\\,,/\|\\m/\|\\m/\\>\.</\\m/\|:u" all_fbs.txt | sed "s/\(\\,,/\|\\m/\|\\m/\\>\.</\\m/\|:u\)*//g" | sed "s/^/ROCKON\t/" > rockon_fbs.txt
The result is:
sed: -e expression #1, char 14: unknown option to `s'
I know it's coming from the sed regexp I'm asking about, because if I remove that portion of the full line I get no error (but, of course, the emoticons are not filtered out).
Thanks in advance,
Steve
Comments (1)
You need to escape each literal / in the pattern, otherwise it will prematurely terminate the expression. You should also use single-quoted strings instead of double-quoted strings, to prevent the backslashes from being interpreted by the shell.
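To see both effects at once, here is a small sketch that prints the script exactly as sed receives it under each quoting style. The variable names are made up for illustration. Inside double quotes the shell collapses every \\ to a single \, so sed reads the script as the first line of output shows: the second / (character 8) ends the pattern, the third / (character 13) ends the replacement, and the \ at character 14 is then taken as a flag, which is consistent with the "char 14" in the error message.

```shell
# The sed script from the question, once per quoting style.
double="s/\(\\,,/\|\\m/\|\\m/\\>\.</\\m/\|:u\)*//g"
single='s/\(\\,,/\|\\m/\|\\m/\\>\.</\\m/\|:u\)*//g'

# Print what sed would actually be handed in each case.
printf 'double-quoted: %s\n' "$double"
printf 'single-quoted: %s\n' "$single"

# Character 14 of the double-quoted form, where sed gives up.
printf '%s' "$double" | cut -c14
```

Single quotes alone do not fix the error, since the unescaped / characters still end the s command early; they only ensure that \\ reaches sed intact.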
So try this:
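A minimal sketch of that advice, assuming GNU sed (the one Cygwin ships), since \| is a GNU extension to basic regular expressions. The function names strip_rockon and strip_rockon_alt are hypothetical, and two things differ from the regexp in the question: the trailing * is dropped because the g flag already removes every occurrence, and the longest emoticon is listed first.

```shell
# Hypothetical helper: remove the "rock on" emoticons from stdin.
# Single quotes keep the shell away from the backslashes, and every
# literal / in the pattern is written \/ so it cannot end the s command.
strip_rockon() {
  sed 's/\(\\m\/\\>\.<\/\\m\/\|\\,,\/\|\\m\/\|:u\)//g'
}

# Same substitution with # as the delimiter, so / needs no escaping.
strip_rockon_alt() {
  sed 's#\(\\m/\\>\.</\\m/\|\\,,/\|\\m/\|:u\)##g'
}

# Quick check on a sample line (not taken from the real data).
printf '%s\n' 'great show \m/ tonight' | strip_rockon
```

In the full pipeline either form would take the place of the middle sed stage, and the grep pattern would want single quotes as well so that its \\ also reaches grep as a literal backslash.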