Matching timestamps in a WebVTT file with sed
I have the following PCRE2 regex that works to match and remove timestamp lines in a .webVTT subtitle file (the default for YouTube):
^[0-9].:[0-9].:[0-9].+$
This changes this:
00:00:00.126 --> 00:00:10.058
How are you today?
00:00:10.309 --> 00:00:19.272
Not bad, you?
00:00:19.559 --> 00:00:29.365
Been better.
To this:
How are you today?
Not bad, you?
Been better.
How would I convert this PCRE2 regex to an idiomatic (read: sane-looking) equivalent for sed's flavour of regex?
Comments (2)
You can use your regex with sed. Alternatively, print only the lines that do not end with an integer.
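
The commands for this answer did not survive on this page, so the following is only a sketch of what the two suggestions could look like, not the answerer's exact code. It assumes a sed that supports -E (extended regular expressions, available in GNU and BSD sed), and it feeds the sample cues from the question through a shell variable named vtt, which is purely illustrative.

```shell
# Sample cue data from the question, held in a variable for illustration.
vtt='00:00:00.126 --> 00:00:10.058
How are you today?
00:00:10.309 --> 00:00:19.272
Not bad, you?
00:00:19.559 --> 00:00:29.365
Been better.'

# Option 1: the question's regex unchanged, using -E so that + is a
# quantifier; the d command deletes every matching line.
result_delete=$(printf '%s\n' "$vtt" | sed -E '/^[0-9].:[0-9].:[0-9].+$/d')

# Option 2: -n suppresses automatic printing, and !p prints only the
# lines that do NOT end with a digit.
result_nodigit=$(printf '%s\n' "$vtt" | sed -n '/[0-9]$/!p')

printf '%s\n' "$result_delete"
```

The second option is shorter but looser: it would also drop any subtitle text line that happens to end in a digit.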
Your pattern is not specific to PCRE2. With sed's default basic regular expressions, you have to escape the plus as \+ to make it a quantifier for one or more times (this is a GNU extension; with -E the unescaped + works). At the positions where you use a dot to match any character, the example data has a digit as well.
You could make the pattern a bit more specific and omit the quantifier altogether. Just prevent the line from printing if the pattern matches.
-n prevents sed's default printing of every line
!p prints the line if the pattern does not match
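
The command and its output are missing from this page, so here is a hedged reconstruction rather than the answerer's exact code. It shows a more specific pattern without a quantifier, combined with -n and !p, next to the original pattern with the escaped \+ (which assumes GNU sed). The variable vtt holding the sample cues is illustrative only.

```shell
# Sample cue data from the question, held in a variable for illustration.
vtt='00:00:00.126 --> 00:00:10.058
How are you today?
00:00:10.309 --> 00:00:19.272
Not bad, you?
00:00:19.559 --> 00:00:29.365
Been better.'

# Specific pattern, no quantifier: two digits, colon, two digits, colon,
# two digits at the start of the line.
# -n turns off automatic printing; !p prints only non-matching lines.
result_specific=$(printf '%s\n' "$vtt" |
  sed -n '/^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]/!p')

# The question's pattern in basic regex syntax; \+ is a GNU sed extension.
result_escaped=$(printf '%s\n' "$vtt" | sed '/^[0-9].:[0-9].:[0-9].\+$/d')

printf '%s\n' "$result_specific"
```

On the sample data this prints the three text lines (How are you today? / Not bad, you? / Been better.). To run it against a file instead, pass the file name as the last argument to sed in place of the pipe.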