unix tr find and replace

Posted 2024-12-19 06:06:02

This is the command I'm using on a standard web page I wget from a web site.

tr '<' '\n<' < index.html

However, it gives me newlines but does not add the left bracket back in.
For example:

echo "<hello><world>" | tr '<' '\n<' | cat -e

returns

$
hello>$
world>$

instead of

$
<hello>$
<world>$

What's wrong?

陌上青苔 2024-12-26 06:06:02

That's because tr only does character-for-character substitution (or deletion).
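To make the positional mapping concrete, here is a small sketch. It assumes a tr that ignores surplus characters in the second set (GNU tr is documented to behave this way); the variable names are only for illustration.

```shell
#!/bin/sh
# tr pairs characters by position: the 1st character of SET1 maps to the
# 1st character of SET2, the 2nd to the 2nd, and so on.
mapped=$(printf '<hello><world>\n' | tr '<>' '[]')
printf '%s\n' "$mapped"

# SET1 here is the single character '<', so only the first character of
# SET2 (the newline) is used; the trailing '<' in SET2 is never emitted.
newline_only=$(printf '<hello><world>\n' | tr '<' '\n<')
printf '%s\n' "$newline_only"
```

The second command reproduces the output in the question: every '<' becomes a newline and nothing else is added.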

Try sed instead (with GNU sed; other seds may not accept \n in the replacement and need a literal newline there).

echo '<hello><world>' | sed -e 's/</\n&/g'

Or awk.

echo '<hello><world>' | awk '{gsub(/</,"\n<",$0)}1'

Or perl.

echo '<hello><world>' | perl -pe 's/</\n</g'

Or ruby.

echo '<hello><world>' | ruby -pe '$_.gsub!(/</,"\n<")'

Or python.

echo '<hello><world>' \
| python3 -c 'for l in __import__("fileinput").input(): print(l.replace("<","\n<"), end="")'

め可乐爱微笑 2024-12-26 06:06:02

If you have GNU grep, this may work for you:

grep -Po '<.*?>[^<]*' index.html

which should pass through all of the HTML, but each tag should start at the beginning of the line with possible non-tag text following on the same line.

If you want nothing but tags:

grep -Po '<.*?>' index.html

You should know, however, that it's not a good idea to parse HTML with regexes.
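As a quick illustration of why regexes are fragile here, the sketch below feeds the tags-only pattern a tag whose attribute value contains a '>'. It assumes GNU grep built with PCRE support (the -P option); the sample markup is made up for the example.

```shell
#!/bin/sh
# The lazy match '<.*?>' stops at the first '>', even when that '>' sits
# inside a quoted attribute value, so the opening tag is cut short.
tags=$(printf '%s\n' '<a title="x>y">link</a>' | grep -Po '<.*?>')
printf '%s\n' "$tags"
```

The first line printed is the truncated <a title="x> rather than the complete opening tag, which is the kind of case an HTML parser handles and a simple regex does not.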

想你的星星会说话 2024-12-26 06:06:02

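The order in which you put the newline matters, but swapping it does not help with tr. tr maps each character of the first set to the character at the same position in the second set, so in

tr '<' '<\n' < index.html

the "<" is mapped to itself, the extra \n is ignored, and the input passes through unchanged.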

明媚如初 2024-12-26 06:06:02

Does this work for you?

awk -F"><" -v OFS=">\n<" '{print $1,$2}'

[jaypal:~/Temp] echo "<hello><world>" | awk -F"><" -v OFS=">\n<" '{$1=$1}1';
<hello>
<world>

To limit this to certain lines, put a regex /.../ that matches them in front of the awk {} action.
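A minimal sketch of that regex guard, assuming you only want to split lines that begin with "<" (the pattern /^</ and the sample input are illustrative, not from the answer):

```shell
#!/bin/sh
# Only lines matching /^</ are rebuilt with the new OFS; the trailing 1
# prints every line, so non-matching lines pass through untouched.
out=$(printf '%s\n' '<a><b>' 'plain ><text' \
  | awk -F'><' -v OFS='>\n<' '/^</{$1=$1}1')
printf '%s\n' "$out"
```

Assigning $1=$1 forces awk to rejoin the fields with OFS, which is what inserts the newline between "><" pairs on the matching lines.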
