How do I extract lyrics (the USLT frame) from an MP3 using FFmpeg?

Posted on 2025-01-25 15:05:35

I'm using Mp3tag's "Tools" feature to batch-run FFmpeg on Windows in order to extract the embedded lyrics (the USLT frame of the ID3v2 tag) from MP3 files. I know that with FFmpeg I can do something like:

-i "%_path%" -f ffmetadata "%_folderpath%\%_filename%.txt"

"%_path%" = full path of the MP3 file

"%_folderpath%\%_filename%.txt" = path and filename of the exported txt file.

The command above extracts all the metadata from the MP3 file and exports it to a txt file with, for example, the following content:

;FFMETADATA1
album=name of the album
artist=name of the artist
title=name of the title
lyrics-eng=[00:01.23]line1 of lyrics
\
[00:04.56]line2 of lyrics
\
[00:07.89]line3 of lyrics
\
[01:03.12]3rd last line of lyrics
\
[02:04.34]2nd last line of lyrics
\
[03:05.67]Last line of lyrics
\

date=2020
encoder=Lavf59.23.100

(The original lyrics use the Simple LRC format with a timestamp on each line; certain lines contain only the timestamp, with empty lyrics.)

(There might or might not be additional metadata, e.g. date and encoder in the example above, following the lyrics part.)

As seen above, a backslash "\" (which is not present in the original lyrics) is somehow added after each line of lyrics, between the CR (carriage return) and the LF (line feed), as seen in Notepad++ (the original lyrics use CRLF as EOL characters).

So how do I modify the given command line to export only the lyrics part (excluding all other metadata and the extra backslash "\")? An example of the expected text file content is shown below:

[00:01.23]line1 of lyrics
[00:04.56]line2 of lyrics
[00:07.89]line3 of lyrics
[01:03.12]3rd last line of lyrics
[02:04.34]2nd last line of lyrics
[03:05.67]Last line of lyrics

with the original EOL characters from the lyrics (such as CRLF) preserved.


Comments (3)

浅笑轻吟梦一曲 2025-02-01 15:05:35
  1. I suggest that you remove all the unwanted \ by searching for \s*\\\s* and replacing them with \n. (Test here: https://regex101.com/r/PEBWwm/1)
  2. Then search for (?<=lyrics-eng=)(?:[\w ]+\s)+ to capture all the lyrics without \ between them. (Test here: https://regex101.com/r/8ad6kI/1)
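
The two steps above can be sketched in Python as follows. This is only an illustration under some assumptions: the function name `extract_lyrics` and the sample text are made up for the example, the tag is assumed to be called `lyrics-eng` as in the question, and the second pattern is widened so that it also accepts lines starting with a bracketed timestamp such as `[00:01.23]`. Note that step 1 removes every backslash, so it would also break ffmetadata escapes such as `\=` inside a lyric line.

```python
import re

# Sample ffmetadata text modelled on the question: each lyric line ends
# with CR, an escaping backslash, then LF.
SAMPLE = (
    ";FFMETADATA1\n"
    "title=name of the title\n"
    "lyrics-eng=[00:01.23]line1 of lyrics\r\\\n"
    "[00:04.56]line2 of lyrics\r\\\n"
    "[03:05.67]Last line of lyrics\r\\\n"
    "\n"
    "date=2020\n"
)


def extract_lyrics(text):
    """Return the lyrics-eng value with one lyric per LF-terminated line."""
    # Step 1: replace each backslash and its surrounding whitespace with "\n".
    cleaned = re.sub(r"\s*\\\s*", "\n", text)
    # Step 2: take the consecutive lines starting with "[" after "lyrics-eng=".
    match = re.search(r"(?<=lyrics-eng=)(?:\[.*\n?)+", cleaned)
    return match.group(0) if match else ""


print(extract_lyrics(SAMPLE))
```

The result uses plain LF line endings; writing it to a file opened in text mode on Windows turns them back into CRLF.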
梦明 2025-02-01 15:05:35

This adds to @anothergatsby's answer:

AFAIK, FFmpeg itself has no way to return only a particular metadata tag, much less to modify the tag values. Your only option is to pipe the FFmpeg output to a regex-capable command (e.g., the Linux sed command, or a Python/PowerShell script that handles the regex).

For example:

ffmpeg -i "%_path%" -f ffmetadata - | sed -n {regex_expr} > "%_folderpath%\%_filename%.txt"

Based on the text output path, it appears that you are in a Windows environment. If I were you, I'd learn PowerShell scripting and its regex support.
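
As a rough illustration of the "pipe the output into a script" idea, here is a Python sketch rather than sed or PowerShell. It assumes `ffmpeg` is on the PATH, that the tag is named `lyrics-eng` as in the question's sample, and that ffmpeg writes the ffmetadata text as UTF-8 with LF line ends. The names `parse_ffmetadata` and `export_lyrics` are hypothetical helpers, not part of any library. Instead of a regex, it undoes the ffmetadata backslash escaping one character at a time, which leaves the original CRLF line endings of the lyrics intact.

```python
import subprocess
import sys


def parse_ffmetadata(text):
    """Parse the global key=value tags of ffmetadata text into a dict."""
    tags = {}
    key, buf, escaped = None, [], False
    for ch in text + "\n":
        if escaped:
            buf.append(ch)  # an escaped character is taken literally
            escaped = False
        elif ch == "\\":
            escaped = True  # the backslash itself is dropped
        elif ch == "=" and key is None:
            key, buf = "".join(buf), []  # first unescaped "=" ends the key
        elif ch == "\n":
            line = "".join(buf)
            if key is not None:
                tags[key] = line
            elif line.startswith("["):
                break  # a section such as [CHAPTER] starts; stop here
            key, buf = None, []
        else:
            buf.append(ch)
    return tags


def export_lyrics(mp3_path, txt_path, tag="lyrics-eng"):
    """Write one tag of mp3_path to txt_path; return False if it is absent."""
    result = subprocess.run(
        ["ffmpeg", "-v", "error", "-i", mp3_path, "-f", "ffmetadata", "-"],
        capture_output=True,
        check=True,
    )
    # Decode the bytes ourselves so that no newline translation happens.
    lyrics = parse_ffmetadata(result.stdout.decode("utf-8")).get(tag)
    if lyrics is None:
        return False
    # newline="" writes the CR and LF characters exactly as stored in the tag.
    with open(txt_path, "w", encoding="utf-8", newline="") as f:
        f.write(lyrics)
    return True


if __name__ == "__main__" and len(sys.argv) == 3:
    export_lyrics(sys.argv[1], sys.argv[2])
```

If this were saved under a name of your choosing, an Mp3tag tool entry could run Python with that script followed by "%_path%" "%_folderpath%\%_filename%.txt" as its parameters.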

只等公子 2025-02-01 15:05:35

The regex you're looking for is this:

(\[[0-9].*)

I have no idea how to do the editing while extracting the lyrics, or from the command prompt at all.
If you can't find a better way and know a bit of Python, you can create a Python script with the code below, put it inside a folder that contains only the files you want to edit, and run it.

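import os
import re

# Matches an LRC line: "[" followed by a digit, up to the end of the line.
LRC_LINE = re.compile(r"(\[[0-9].*)")


def main():
    # Only touch the exported .txt files, so that the script itself
    # and any subfolders in the same folder are skipped.
    for file in os.listdir():
        if not file.lower().endswith(".txt"):
            continue
        with open(file, "r+", encoding="utf-8") as f:
            # Text mode reads the lone CR before each backslash as a newline,
            # so the backslash lands on its own line and is not matched.
            lyrics = LRC_LINE.findall(f.read())
            # Overwrite the file with the lyric lines only.
            f.seek(0)
            f.truncate(0)
            for lyric in lyrics:
                f.write(lyric + "\n")


if __name__ == "__main__":
    main()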