Snakemake: specifying files obtained by glob_wildcards

Posted on 2025-02-04 22:45:03

How can I specify the file obtained by glob_wildcards?

Suppose sample1.txt, sample2.txt, sample3.txt, and sample4.txt are in the same directory.

The following code is just an example:

FILES = glob_wildcards("data/{sample}.txt")
SAMPLES = FILES.sample

rule all:
    input:
        expand("{sample}txt", sample=SAMPLES),
        "concat.txt"

rule concat:
    input:
        SAMPLES[0],
        SAMPLES[1]
    output:
        "concat.txt"
    shell:
        "cat {input[0]} {input[1]} > {output}"

When I want to concat sample1.txt and sample2.txt as shown in rule concat, how can I specify those files? Is it correct to write SAMPLES[0] and SAMPLES[1]?

Answer by 因为看清所以看轻, posted on 2025-02-11 22:45:04

You are almost correct. Keep in mind that glob_wildcards returns only the wildcard values (here sample1, sample2, and so on), not file paths, so SAMPLES[0] is the string sample1 rather than data/sample1.txt. When you reference files in a rule, you need to substitute those values back into the file path pattern.
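To make the distinction between wildcard values and file paths concrete, here is a minimal plain-Python sketch. It does not import Snakemake; the helpers wildcard_values and fill_pattern are hypothetical stand-ins that only mimic the idea behind glob_wildcards and expand for a single {sample} wildcard, and the list of paths is assumed rather than read from disk.

```python
import re

FILE_PATTERN = "data/{sample}.txt"


def wildcard_values(pattern, paths):
    """Return the text matched by {sample} for every path that fits the pattern."""
    # Escape the literal parts, then turn the {sample} placeholder into a named group.
    regex = re.escape(pattern).replace(re.escape("{sample}"), r"(?P<sample>[^/]+)")
    matches = (re.fullmatch(regex, p) for p in paths)
    return [m.group("sample") for m in matches if m]


def fill_pattern(pattern, samples):
    """Substitute each wildcard value back into the pattern to get a file path."""
    return [pattern.format(sample=s) for s in samples]


# Assumed directory contents, listed by hand for the illustration.
paths = ["data/sample1.txt", "data/sample2.txt", "data/sample3.txt", "data/sample4.txt"]

samples = wildcard_values(FILE_PATTERN, paths)
print(samples)                                 # ['sample1', 'sample2', 'sample3', 'sample4']
print(fill_pattern(FILE_PATTERN, samples[:2])) # ['data/sample1.txt', 'data/sample2.txt']
```

The first print shows bare values, which is why using SAMPLES[0] directly as an input does not point at an existing file; the second shows the paths a rule actually needs.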

For consistency, you can continue using expand():

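# One pattern, reused both to discover samples and to build file paths
file_pattern = 'data/{sample}.txt'
# The trailing comma unpacks the single-field result, so SAMPLES is a list of wildcard values
SAMPLES, = glob_wildcards(file_pattern)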

rule all:
    input:
        expand(file_pattern, sample=SAMPLES),
        "concat.txt"

rule concat:
    input:
        # Paths for the first two discovered samples
        expand(file_pattern, sample=SAMPLES[:2]),
    output:
        "concat.txt"
    shell:
        "cat {input} > {output}"
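One caveat worth checking: as far as I know, the values from glob_wildcards come back in whatever order the file system lists the files, which is not guaranteed to be sorted, so SAMPLES[:2] may not always be sample1 and sample2. The plain-Python sketch below shows two ways to pin the selection down. The list found is a made-up listing order standing in for the glob_wildcards result, and wanted is a hypothetical name introduced for the illustration.

```python
FILE_PATTERN = "data/{sample}.txt"

# Hypothetical discovery order, standing in for what glob_wildcards might return.
found = ["sample3", "sample1", "sample4", "sample2"]

# Option 1: sort the values first, then take the first two.
first_two = sorted(found)[:2]

# Option 2: name the samples explicitly and keep only those that were discovered.
wanted = ["sample1", "sample2"]
selected = [s for s in wanted if s in found]

# Either list can then be turned into input paths for the concat rule.
concat_inputs = [FILE_PATTERN.format(sample=s) for s in first_two]
print(concat_inputs)  # ['data/sample1.txt', 'data/sample2.txt']
```

In the Snakefile itself, the first option would amount to something like SAMPLES = sorted(glob_wildcards(file_pattern).sample). Note that the sort is lexicographic, so sample10 would come before sample2; with more than nine samples, naming the inputs explicitly is the safer choice.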
