Removing the </p> tag from grep output

Posted on 2024-10-25 17:43:11

I have a bash script that finds phone numbers inside .htm or .html files in a directory (or recursively in its subdirectories, if I want that). It looks for numbers in the format (ddd)ddd-dddd or ddd-ddd-dddd, where d represents a digit.

This is my code:

find ./ -maxdepth 1 -regex ".*\(html\|htm\)$" | xargs grep '\(([0-9]\{3\})\|[0-9]\{3\}\)[-]\?[0-9]\{3\}-[0-9]\{4\}'

The output is:

./dash_only_phone.htm:800-555-1212</p>
./paren_phone.htm:(800)555-1212</p>

How can I change the grep command so that the HTML </p> tag at the end is not printed?

Thanks,

月下凄凉 2024-11-01 17:43:11

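If your grep supports Perl-compatible regular expressions (the -P option), as GNU grep does, you can combine -o with a lookahead so that only the number is printed. (OS X grep supported -P when this was written; the BSD grep in current macOS does not.)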

grep -Po '(\([0-9]{3}\)|[0-9]{3})-?[0-9]{3}-[0-9]{4}(?=</p>)'

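Note the change in escaping compared with the command in the question. With -P, much as with grep -E, grouping parentheses, |, ? and {n} are written without backslashes, and only the literal parentheses are escaped.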
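To make the escaping difference concrete, here is a minimal sketch that runs the same pattern twice, once as a basic regex (plain grep) and once as an extended regex (grep -E), without the lookahead. The sample line and the variable names are made up for illustration, and the basic-regex form assumes a grep that accepts the GNU extensions \| and \?.

```shell
# Made-up sample line resembling the HTML in the question.
sample='<p>Call (800)555-1212</p>'

# BRE (plain grep): grouping, alternation, intervals and ? need backslashes.
# \| and \? are GNU extensions to BRE (assumption: GNU grep or compatible).
bre_out=$(printf '%s\n' "$sample" |
    grep -o '\(([0-9]\{3\})\|[0-9]\{3\}\)-\?[0-9]\{3\}-[0-9]\{4\}')

# ERE (grep -E): the same operators are written bare;
# only the literal parentheses around the area code are escaped.
ere_out=$(printf '%s\n' "$sample" |
    grep -oE '(\([0-9]{3}\)|[0-9]{3})-?[0-9]{3}-[0-9]{4}')

printf 'BRE: %s\nERE: %s\n' "$bre_out" "$ere_out"
```

Both forms print (800)555-1212 for this sample. Unlike the -P version above, they also match numbers that are not followed by </p>.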

笑叹一世浮沉 2024-11-01 17:43:11

Why not just pass the output through a sed filter to remove it, as in the following transcript:

pax$ echo './dash_only_phone.htm:800-555-1212</p>' | sed 's?</p>$??'
./dash_only_phone.htm:800-555-1212

This will get rid of any </p> sequences that appear at the end of a line.
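As a small sketch of how the $ anchor limits the substitution, the snippet below wraps the same sed expression in a helper and feeds it two lines. The function name strip_trailing_p and the second input line are hypothetical, chosen only to show that a </p> in the middle of a line is left alone.

```shell
# Hypothetical helper: remove a </p> only when it ends the line.
# The ? characters are just the s-command delimiter, as in the answer above.
strip_trailing_p() {
    sed 's?</p>$??'
}

# First line is from the question's output; the second is made up.
sed_out=$(printf '%s\n' \
    './dash_only_phone.htm:800-555-1212</p>' \
    '<p>one</p><p>two</p>' | strip_trailing_p)

printf '%s\n' "$sed_out"
```

In the real pipeline the helper would sit after the grep, for example `... | xargs grep '...' | strip_trailing_p`.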

无可置疑 2024-11-01 17:43:11

You can just add the -o switch, which makes grep print only the matched text (here, the phone number) instead of the whole line:

find ./ -maxdepth 1 -regex ".*\(html\|htm\)$" | xargs grep -o '\(([0-9]\{3\})\|[0-9]\{3\}\)[-]\?[0-9]\{3\}-[0-9]\{4\}'
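The following is a self-contained sketch of that pipeline run against two throwaway files, so the effect of -o can be checked end to end. It makes a few changes that are assumptions rather than part of the answer. The -regex test is swapped for two -name tests, which also avoids matching names that merely end in "htm" without a dot. It adds -print0 with xargs -0 so file names containing spaces survive, and a final sort to make the order predictable. The temporary directory, the file contents, and the variable names are invented for the demonstration, and the pattern still relies on the GNU \| and \? extensions.

```shell
# Throwaway directory with two sample files (contents are made up).
tmpdir=$(mktemp -d)
printf '<p>800-555-1212</p>\n' > "$tmpdir/dash_only_phone.htm"
printf '<p>(800)555-1212</p>\n' > "$tmpdir/paren_phone.htm"

# Same grep pattern as above, with -o so only the match is printed.
# grep prefixes each match with the file name because it receives two files.
o_out=$(cd "$tmpdir" &&
    find . -maxdepth 1 \( -name '*.htm' -o -name '*.html' \) -print0 |
    xargs -0 grep -o '\(([0-9]\{3\})\|[0-9]\{3\}\)[-]\?[0-9]\{3\}-[0-9]\{4\}' |
    sort)

printf '%s\n' "$o_out"

# Clean up the sample files.
rm -rf "$tmpdir"
```

If the file-name prefix is not wanted either, grep's -h option suppresses it.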
