Shell script to extract and trim text from a web page
Honestly, I don't know how to trim text.
What I have so far:
wget --output-document=- http://www.geupdate.com 2>/dev/null \
| grep last
Outputs:
<li><b><img src='http://www.geupdate.com/img/arrow-tail.png' align='left'>Time since last update</b>: <br />0 day, 19 hours, 23 min, 36 sec</li><li><b><img src='http://www.geupdate.com/img/ledlightblue.png' align='left'>An Update to occur within:</b> (<a href='http://www.geupdate.com/update-prediction/'><font size='-2'>?</font></a>) <br />0 day, 21 hours, 56 min, 30 sec</li> </ul>
What I actually want to extract from this is:
0 day, 19 hours, 23 min, 36 sec
If anyone can tell me how to write something that does this, or just write it if it's that simple, that would be nice!
When I run this:
wget --output-document=- http://www.geupdate.com 2>/dev/null \
| grep last \
| grep -o '[[:digit:]]* day.* sec'
I get this:
0 day, 19 hours, 43 min, 16 sec</li><li><b><img src='http://www.geupdate.com/img/ledlightblue.png' align='left'>An Update to occur within:</b> (<a href='http://www.geupdate.com/update-prediction/'><font size='-2'>?</font></a>) <br />0 day, 21 hours, 36 min, 50 sec
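The overshoot in that last attempt comes from `.*` being greedy: it runs on to the last ` sec` on the line, so both durations and the markup between them end up in one match. A minimal sketch of one way around it follows, assuming the duration itself never contains a `<` character. The `sample` variable is just the grep output from above standing in for the live page, and `extract_elapsed` is a made-up helper name, not something from the site or from wget.

```shell
#!/bin/sh
# Sample line copied from the grep output above; it stands in for the live page.
sample="<li><b><img src='http://www.geupdate.com/img/arrow-tail.png' align='left'>Time since last update</b>: <br />0 day, 19 hours, 23 min, 36 sec</li><li><b><img src='http://www.geupdate.com/img/ledlightblue.png' align='left'>An Update to occur within:</b> (<a href='http://www.geupdate.com/update-prediction/'><font size='-2'>?</font></a>) <br />0 day, 21 hours, 56 min, 30 sec</li> </ul>"

# Hypothetical helper: reads HTML on stdin and prints the first duration found.
# [^<]* cannot cross a '<', so each match stops before the closing </li> tag.
# grep -o prints every match on its own line; head -n 1 keeps the first one.
extract_elapsed() {
    grep -o '[[:digit:]][[:digit:]]* day[^<]* sec' | head -n 1
}

elapsed=$(printf '%s\n' "$sample" | extract_elapsed)
printf '%s\n' "$elapsed"
```

Against the real page the same helper could go at the end of the existing pipeline, in place of the final grep, provided the markup still looks like the sample. Dropping `head -n 1` would print both durations, one per line.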