从 torrent 中读取文件集

发布于 2024-07-10 20:38:36 字数 197 浏览 12 评论 0原文

我想(快速)将程序/脚本放在一起以从 .torrent 文件中读取文件集。 然后我想使用该集删除特定目录中不属于 torrent 的任何文件。

关于从 .torrent 文件读取此索引的便捷库有什么建议吗? 虽然我不反对它,但我不想为了这个简单的目的而深入研究 BitTorrent 规范并从头开始编写大量代码。

我对语言没有偏好。

I want to (quickly) put a program/script together to read the fileset from a .torrent file. I want to then use that set to delete any files from a specific directory that do not belong to the torrent.

Any recommendations on a handy library for reading this index from the .torrent file? Whilst I don't object to it, I don't want to be digging deep into the bittorrent spec and rolling a load of code from scratch for this simple purpose.

I have no preference on language.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(6

神经大条 2024-07-17 20:38:36

我会使用 rasterbar 的 libtorrent,它是一个小型且快速的 C++ 库。
要迭代文件,您可以使用 torrent_info 类 (begin_files ()、end_files())。

libtorrent 还有一个 python 接口

import libtorrent
info = libtorrent.torrent_info('test.torrent')
for f in info.files():
    print "%s - %s" % (f.path, f.size)

I would use rasterbar's libtorrent which is a small and fast C++ library.
To iterate over the files you could use the torrent_info class (begin_files(), end_files()).

There's also a python interface for libtorrent:

import libtorrent
info = libtorrent.torrent_info('test.torrent')
for f in info.files():
    print "%s - %s" % (f.path, f.size)
孤独岁月 2024-07-17 20:38:36

Effbot 已解答您的问题。 以下是从 .torrent 文件读取文件列表的完整代码(Python 2.4+):

import re

def tokenize(text, match=re.compile("([idel])|(\d+):|(-?\d+)").match):
    i = 0
    while i < len(text):
        m = match(text, i)
        s = m.group(m.lastindex)
        i = m.end()
        if m.lastindex == 2:
            yield "s"
            yield text[i:i+int(s)]
            i = i + int(s)
        else:
            yield s

def decode_item(next, token):
    if token == "i":
        # integer: "i" value "e"
        data = int(next())
        if next() != "e":
            raise ValueError
    elif token == "s":
        # string: "s" value (virtual tokens)
        data = next()
    elif token == "l" or token == "d":
        # container: "l" (or "d") values "e"
        data = []
        tok = next()
        while tok != "e":
            data.append(decode_item(next, tok))
            tok = next()
        if token == "d":
            data = dict(zip(data[0::2], data[1::2]))
    else:
        raise ValueError
    return data

def decode(text):
    try:
        src = tokenize(text)
        data = decode_item(src.next, src.next())
        for token in src: # look for more tokens
            raise SyntaxError("trailing junk")
    except (AttributeError, ValueError, StopIteration):
        raise SyntaxError("syntax error")
    return data

if __name__ == "__main__":
    data = open("test.torrent", "rb").read()
    torrent = decode(data)
    for file in torrent["info"]["files"]:
        print "%r - %d bytes" % ("/".join(file["path"]), file["length"])

Effbot has your question answered. Here is the complete code to read the list of files from .torrent file (Python 2.4+):

import re

def tokenize(text, match=re.compile("([idel])|(\d+):|(-?\d+)").match):
    i = 0
    while i < len(text):
        m = match(text, i)
        s = m.group(m.lastindex)
        i = m.end()
        if m.lastindex == 2:
            yield "s"
            yield text[i:i+int(s)]
            i = i + int(s)
        else:
            yield s

def decode_item(next, token):
    if token == "i":
        # integer: "i" value "e"
        data = int(next())
        if next() != "e":
            raise ValueError
    elif token == "s":
        # string: "s" value (virtual tokens)
        data = next()
    elif token == "l" or token == "d":
        # container: "l" (or "d") values "e"
        data = []
        tok = next()
        while tok != "e":
            data.append(decode_item(next, tok))
            tok = next()
        if token == "d":
            data = dict(zip(data[0::2], data[1::2]))
    else:
        raise ValueError
    return data

def decode(text):
    try:
        src = tokenize(text)
        data = decode_item(src.next, src.next())
        for token in src: # look for more tokens
            raise SyntaxError("trailing junk")
    except (AttributeError, ValueError, StopIteration):
        raise SyntaxError("syntax error")
    return data

if __name__ == "__main__":
    data = open("test.torrent", "rb").read()
    torrent = decode(data)
    for file in torrent["info"]["files"]:
        print "%r - %d bytes" % ("/".join(file["path"]), file["length"])
灯下孤影 2024-07-17 20:38:36

这是康斯坦丁上面的答案中的代码,稍作修改以处理 torrent 文件名中的 Unicode 字符和 torrent 信息中的文件集文件名:

import re

def tokenize(text, match=re.compile("([idel])|(\d+):|(-?\d+)").match):
    i = 0
    while i < len(text):
        m = match(text, i)
        s = m.group(m.lastindex)
        i = m.end()
        if m.lastindex == 2:
            yield "s"
            yield text[i:i+int(s)]
            i = i + int(s)
        else:
            yield s

def decode_item(next, token):
    if token == "i":
        # integer: "i" value "e"
        data = int(next())
        if next() != "e":
            raise ValueError
    elif token == "s":
        # string: "s" value (virtual tokens)
        data = next()
    elif token == "l" or token == "d":
        # container: "l" (or "d") values "e"
        data = []
        tok = next()
        while tok != "e":
            data.append(decode_item(next, tok))
            tok = next()
        if token == "d":
            data = dict(zip(data[0::2], data[1::2]))
    else:
        raise ValueError
    return data

def decode(text):
    try:
        src = tokenize(text)
        data = decode_item(src.next, src.next())
        for token in src: # look for more tokens
            raise SyntaxError("trailing junk")
    except (AttributeError, ValueError, StopIteration):
        raise SyntaxError("syntax error")
    return data

n = 0
if __name__ == "__main__":
    data = open("C:\\Torrents\\test.torrent", "rb").read()
    torrent = decode(data)
    for file in torrent["info"]["files"]:
        n = n + 1
        filenamepath = file["path"]     
        print str(n) + " -- " + ', '.join(map(str, filenamepath))
        fname = ', '.join(map(str, filenamepath))

        print fname + " -- " + str(file["length"])

Here's the code from Constantine's answer above, slightly modified to handle Unicode characters in torrent filenames and fileset filenames in torrent info:

import re

def tokenize(text, match=re.compile("([idel])|(\d+):|(-?\d+)").match):
    i = 0
    while i < len(text):
        m = match(text, i)
        s = m.group(m.lastindex)
        i = m.end()
        if m.lastindex == 2:
            yield "s"
            yield text[i:i+int(s)]
            i = i + int(s)
        else:
            yield s

def decode_item(next, token):
    if token == "i":
        # integer: "i" value "e"
        data = int(next())
        if next() != "e":
            raise ValueError
    elif token == "s":
        # string: "s" value (virtual tokens)
        data = next()
    elif token == "l" or token == "d":
        # container: "l" (or "d") values "e"
        data = []
        tok = next()
        while tok != "e":
            data.append(decode_item(next, tok))
            tok = next()
        if token == "d":
            data = dict(zip(data[0::2], data[1::2]))
    else:
        raise ValueError
    return data

def decode(text):
    try:
        src = tokenize(text)
        data = decode_item(src.next, src.next())
        for token in src: # look for more tokens
            raise SyntaxError("trailing junk")
    except (AttributeError, ValueError, StopIteration):
        raise SyntaxError("syntax error")
    return data

n = 0
if __name__ == "__main__":
    data = open("C:\\Torrents\\test.torrent", "rb").read()
    torrent = decode(data)
    for file in torrent["info"]["files"]:
        n = n + 1
        filenamepath = file["path"]     
        print str(n) + " -- " + ', '.join(map(str, filenamepath))
        fname = ', '.join(map(str, filenamepath))

        print fname + " -- " + str(file["length"])
一生独一 2024-07-17 20:38:36

来自原始主线 BitTorrent 5.x 客户端的 bencode.py (http:// /download.bittorrent.com/dl/BitTorrent-5.2.2.tar.gz)将为您提供几乎 Python 中的参考实现。

它对 BTL 包有导入依赖,但很容易删除。 然后您可以查看 bencode.bdecode(filecontent)['info']['files']。

bencode.py from the original Mainline BitTorrent 5.x client (http://download.bittorrent.com/dl/BitTorrent-5.2.2.tar.gz) would give you pretty much the reference implementation in Python.

It has an import dependency on the BTL package but that's trivially easy to remove. You'd then look at bencode.bdecode(filecontent)['info']['files'].

明媚殇 2024-07-17 20:38:36

扩展上述想法,我做了以下操作:

~> cd ~/bin

~/bin> ls torrent*
torrent-parse.py  torrent-parse.sh

~/bin> cat torrent-parse.py
# torrent-parse.py
import sys
import libtorrent

# get the input torrent file
if (len(sys.argv) > 1):
    torrent = sys.argv[1]
else:
    print "Missing param: torrent filename"
    sys.exit()
# get names of files in the torrent file
info = libtorrent.torrent_info(torrent);
for f in info.files():
    print "%s - %s" % (f.path, f.size)

~/bin> cat torrent-parse.sh
#!/bin/bash
if [ $# -lt 1 ]; then
  echo "Missing param: torrent filename"
  exit 0
fi

python torrent-parse.py "$*"

您需要适当地设置权限以使 shell 脚本可执行:

~/bin> chmod a+x torrent-parse.sh

希望这对某人有帮助:)

Expanding on the ideas above, I did the following:

~> cd ~/bin

~/bin> ls torrent*
torrent-parse.py  torrent-parse.sh

~/bin> cat torrent-parse.py
# torrent-parse.py
import sys
import libtorrent

# get the input torrent file
if (len(sys.argv) > 1):
    torrent = sys.argv[1]
else:
    print "Missing param: torrent filename"
    sys.exit()
# get names of files in the torrent file
info = libtorrent.torrent_info(torrent);
for f in info.files():
    print "%s - %s" % (f.path, f.size)

~/bin> cat torrent-parse.sh
#!/bin/bash
if [ $# -lt 1 ]; then
  echo "Missing param: torrent filename"
  exit 0
fi

python torrent-parse.py "$*"

You'll want to set permissions appropriately to make the shell script executable:

~/bin> chmod a+x torrent-parse.sh

Hope this helps someone :)

还在原地等你 2024-07-17 20:38:36

这是上面 Alix 的代码,修改为在 python 3.x 下运行

import re

def tokenize(text, match=re.compile(b"([idel])|(\d+):|(-?\d+)").match):
    i = 0
    while i < len(text):
        m = match(text, i)
        s = m.group(m.lastindex)
        i = m.end()
        if m.lastindex == 2:
            yield "s"
            yield text[i:i+int(s)].decode("utf-8", "ignore")
            i = i + int(s)
        else:
            yield s.decode("utf-8", "ignore")

def decode_item(torrent, token):
    if token == "i":
        # integer: "i" value "e"
        data = int(next(torrent))
        if next(torrent) != "e":
            raise ValueError
    elif token == "s":
        data = next(torrent)
    elif token == "l" or token == "d":
        # container: "l" (or "d") values "e"
        data = []
        tok = next(torrent)
        while tok != "e":
            data.append(decode_item(torrent, tok))
            tok = next(torrent)
        if token == "d":
            data = dict(zip(data[0::2], data[1::2]))
    else:
        raise ValueError
    return data

def decode(text):
    try:
        src = tokenize(text)
        token=next(src)
        data = decode_item(src, token)
        for token in src: # look for more tokens
            raise SyntaxError("trailing junk")
    except (AttributeError, ValueError, StopIteration):
        raise SyntaxError("syntax error")
    return data

n = 0
if __name__ == "__main__":
    data = open("test.torrent", "rb").read()
    torrent = decode(data)
    for file in torrent["info"]["files"]:
        n = n + 1
        filenamepath = file["path"]
        print(str(n) + " -- " + ', '.join(map(str, filenamepath)))
        fname = ', '.join(map(str, filenamepath))

        print(fname + " -- " + str(file["length"]))

Here is the code from Alix above, modified to run under python 3.x

import re

def tokenize(text, match=re.compile(b"([idel])|(\d+):|(-?\d+)").match):
    i = 0
    while i < len(text):
        m = match(text, i)
        s = m.group(m.lastindex)
        i = m.end()
        if m.lastindex == 2:
            yield "s"
            yield text[i:i+int(s)].decode("utf-8", "ignore")
            i = i + int(s)
        else:
            yield s.decode("utf-8", "ignore")

def decode_item(torrent, token):
    if token == "i":
        # integer: "i" value "e"
        data = int(next(torrent))
        if next(torrent) != "e":
            raise ValueError
    elif token == "s":
        data = next(torrent)
    elif token == "l" or token == "d":
        # container: "l" (or "d") values "e"
        data = []
        tok = next(torrent)
        while tok != "e":
            data.append(decode_item(torrent, tok))
            tok = next(torrent)
        if token == "d":
            data = dict(zip(data[0::2], data[1::2]))
    else:
        raise ValueError
    return data

def decode(text):
    try:
        src = tokenize(text)
        token=next(src)
        data = decode_item(src, token)
        for token in src: # look for more tokens
            raise SyntaxError("trailing junk")
    except (AttributeError, ValueError, StopIteration):
        raise SyntaxError("syntax error")
    return data

n = 0
if __name__ == "__main__":
    data = open("test.torrent", "rb").read()
    torrent = decode(data)
    for file in torrent["info"]["files"]:
        n = n + 1
        filenamepath = file["path"]
        print(str(n) + " -- " + ', '.join(map(str, filenamepath)))
        fname = ', '.join(map(str, filenamepath))

        print(fname + " -- " + str(file["length"]))
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文