Java API for SRT subtitles

Posted 2024-10-18 11:41:50

Comments (4)

画离情绘悲伤 2024-10-25 11:41:50

The actual SRT parsing is done with regular expressions, which Java supports through java.util.regex.

The actual regex is:

protected static final String nl = "\\\n";
protected static final String sp = "[ \\t]*";
Pattern.compile("(?s)(\\d+)" + sp + nl + "(\\d{1,2}):(\\d\\d):(\\d\\d),(\\d\\d\\d)" + sp + "-->"+ sp + "(\\d\\d):(\\d\\d):(\\d\\d),(\\d\\d\\d)" + sp + "(X1:\\d.*?)??" + nl + "(.*?)" + nl + nl);

Groups 2, 3, 4, and 5 are the start time (hours, minutes, seconds, milliseconds).
Groups 6, 7, 8, and 9 are the end time, in the same order.
Group 11 is the subtitle text.
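As a rough sketch of how these groups might be consumed, the pattern can be applied with Matcher.find() and the four time groups folded into milliseconds. The class name SrtRegexDemo and the helpers toMillis and parse are hypothetical names made up for this illustration, and the sketch assumes the input uses plain \n line endings and ends each cue with a blank line:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of any library mentioned in this thread.
class SrtRegexDemo {
    // "\\n" is used here for readability; it matches a line feed.
    static final String NL = "\\n";
    static final String SP = "[ \\t]*";
    static final Pattern SRT = Pattern.compile(
            "(?s)(\\d+)" + SP + NL
            + "(\\d{1,2}):(\\d\\d):(\\d\\d),(\\d\\d\\d)" + SP + "-->" + SP
            + "(\\d\\d):(\\d\\d):(\\d\\d),(\\d\\d\\d)" + SP
            + "(X1:\\d.*?)??" + NL + "(.*?)" + NL + NL);

    // Combines four consecutive groups (h, m, s, ms) starting at 'first'.
    static long toMillis(Matcher m, int first) {
        long h = Long.parseLong(m.group(first));
        long min = Long.parseLong(m.group(first + 1));
        long s = Long.parseLong(m.group(first + 2));
        long ms = Long.parseLong(m.group(first + 3));
        return ((h * 60 + min) * 60 + s) * 1000 + ms;
    }

    // Returns one line per cue: "<number> <startMs> <endMs> <text>".
    static List<String> parse(String srt) {
        List<String> out = new ArrayList<>();
        Matcher m = SRT.matcher(srt);
        while (m.find()) {
            String text = m.group(11).replace("\n", " / ");
            out.add(m.group(1) + " " + toMillis(m, 2) + " "
                    + toMillis(m, 6) + " " + text);
        }
        return out;
    }

    public static void main(String[] args) {
        String srt = "1\n00:00:01,000 --> 00:00:04,500\nHello\nworld\n\n"
                   + "2\n00:00:05,000 --> 00:00:06,250\nBye\n\n";
        for (String line : parse(srt)) {
            System.out.println(line);
        }
    }
}
```

Files saved with Windows line endings (\r\n) would not match this pattern as written, so normalizing line endings before matching is one option.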

盗心人 2024-10-25 11:41:50

I have written a Java library that parses and reads several subtitle formats, including the popular SRT. The code is released under the MIT open source license (free to use for any purpose) and is available in my Git repository:

https://github.com/JDaren/subtitleConverter

You probably only need the basic classes and the SRTFormat class. With those you can read SRT files from an InputStream, or get the complete file as a String[] once you have finished editing it.

If you find this useful, or if I can help you with anything, please contact me.

PS: Other formats supported, partially or fully, are .ASS, .SSA, .STL, .SCC, and .XML (W3C's TTAF-DFXP, also known as TTML 1.0).

EDIT:

You can see the logic at work at www.subtitleconverter.net

烟织青萝梦 2024-10-25 11:41:50

Actually, the modified version of @Panayotis's regex that supports multi-line subtitle text looks like this:

protected static final String nl = "\\n";
protected static final String sp = "[ \\t]*";
Pattern.compile(
                    "(\\d+)" + sp + nl
                    + "(\\d{1,2}):(\\d\\d):(\\d\\d),(\\d\\d\\d)" + sp
                    + "-->" + sp + "(\\d\\d):(\\d\\d):(\\d\\d),(\\d\\d\\d)" + sp
                    + "(X1:\\d.*?)??" + nl + "([^\\|]*?)" + nl + nl);

In ([^\\|]*?), replace | with any character that is unlikely to appear in subtitle text. I currently use a negated character class on the "|" character.
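To illustrate the trade-off, here is a small sketch (the class name MultiLineSrtDemo and the helper texts are hypothetical names for this example) that runs the pattern over three cues. It assumes \n line endings. The negated class matches across line breaks without the (?s) flag, but a cue whose text contains the excluded character is skipped:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of any library mentioned in this thread.
class MultiLineSrtDemo {
    static final String NL = "\\n";
    static final String SP = "[ \\t]*";
    static final Pattern SRT = Pattern.compile(
            "(\\d+)" + SP + NL
            + "(\\d{1,2}):(\\d\\d):(\\d\\d),(\\d\\d\\d)" + SP
            + "-->" + SP + "(\\d\\d):(\\d\\d):(\\d\\d),(\\d\\d\\d)" + SP
            + "(X1:\\d.*?)??" + NL + "([^\\|]*?)" + NL + NL);

    // Collects group 11 (the subtitle text) of every cue that matches.
    static List<String> texts(String srt) {
        List<String> out = new ArrayList<>();
        Matcher m = SRT.matcher(srt);
        while (m.find()) {
            out.add(m.group(11));
        }
        return out;
    }

    public static void main(String[] args) {
        String srt = "1\n00:00:01,000 --> 00:00:02,000\nline one\nline two\n\n"
                   + "2\n00:00:03,000 --> 00:00:04,000\na|b\n\n"
                   + "3\n00:00:05,000 --> 00:00:06,000\nlast\n\n";
        // Cue 2 contains '|', so only cues 1 and 3 are collected.
        for (String t : texts(srt)) {
            System.out.println("[" + t.replace("\n", "\\n") + "]");
        }
    }
}
```

If subtitle text may legitimately contain the excluded character, keeping the (?s) flag with (.*?) from the first answer avoids dropping those cues.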

微暖i 2024-10-25 11:41:50

There is another basic, open-source API that can handle SRT and ASS subtitles.

Parsing SRT:

File file = Paths.get("subtitle.srt").toFile();
SRTSub subtitle = new SRTParser().parse(file);