Snakemake fasterq-dump wrapper AttributeError: 'Wildcards' object has no attribute 'accession'

Posted on 2025-02-07 16:23:31

I am trying to use the fasterq-dump wrapper in my Snakemake workflow to download paired-end fastq.gz files. Here is my Snakefile:

# read a .txt file including many SRR* accession number
import pandas as pd
df = pd.read_csv('SraRunTable.txt', sep=',', header=0)

# append all accession number to a list
SAMPLES = []

for i in df['Run']:
    SAMPLES.append(i)

# snakemake workflow starts here
rule all:
    input:
        expand("/data/fastq/{sample}_1.fastq.gz", sample=SAMPLES)

rule get_fastq_pe_gz:
    output:
        # the wildcard name must be accession
        "/data/fastq/{sample}_1.fastq.gz",
        "/data/fastq/{sample}_2.fastq.gz",
    log:
        "/data/logs/{sample}.log"
    params:
        extra="--skip-technical"
    threads: 20
    wrapper:
        "v1.7.0/bio/sra-tools/fasterq-dump"

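After running it with conda (snakemake -s fasterq-dump.snake --cores 20 --use-conda), I received an AttributeError that I cannot figure out. Any suggestions or solutions are appreciated!

Here is the complete log, including the error message: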

Building DAG of jobs...
Creating conda environment https://github.com/snakemake/snakemake-wrappers/raw/v1.7.0/bio/sra-tools/fasterq-dump/environment.yaml...
Downloading and installing remote packages.
Environment for https://github.com/snakemake/snakemake-wrappers/raw/v1.7.0/bio/sra-tools/fasterq-dump/environment.yaml created (location: .snakemake/conda/fab035359fa42a09dfad78160e9b8543)
Using shell: /usr/bin/bash
Provided cores: 20
Rules claiming more threads will be scaled down.
Job stats:
job                count    min threads    max threads
---------------  -------  -------------  -------------
all                    1              1              1
get_fastq_pe_gz      422             20             20
total                423              1             20

Select jobs to execute...

[Wed Jun 15 17:10:30 2022]
rule get_fastq_pe_gz:
    output: /data/scratch/yaochung/Khrameeva/fastq/SRR8750458_1.fastq.gz, /data/scratch/yaochung/Khrameeva/fastq/SRR8750458_2.fastq.gz
    log: /data/scratch/yaochung/Khrameeva/logs/SRR8750458.log
    jobid: 62
    reason: Missing output files: /data/scratch/yaochung/Khrameeva/fastq/SRR8750458_1.fastq.gz
    wildcards: sample=SRR8750458
    threads: 20
    resources: tmpdir=/tmp

Activating conda environment: .snakemake/conda/fab035359fa42a09dfad78160e9b8543
Activating conda environment: .snakemake/conda/fab035359fa42a09dfad78160e9b8543
Traceback (most recent call last):
  File "/data/scratch/yaochung/TEKRABber_thesis/pipelines/fasterq-dump/.snakemake/scripts/tmp4ip6wnot.wrapper.py", line 45, in <module>
    shell(
  File "/home/yaochung41/anaconda3/envs/snakemake/lib/python3.10/site-packages/snakemake/shell.py", line 139, in __new__
    cmd = format(cmd, *args, stepout=2, **kwargs)
  File "/home/yaochung41/anaconda3/envs/snakemake/lib/python3.10/site-packages/snakemake/utils.py", line 430, in format
    return fmt.format(_pattern, *args, **variables)
  File "/data/scratch/yaochung/TEKRABber_thesis/pipelines/fasterq-dump/.snakemake/conda/fab035359fa42a09dfad78160e9b8543/lib/python3.10/string.py", line 161, in format
    return self.vformat(format_string, args, kwargs)
  File "/data/scratch/yaochung/TEKRABber_thesis/pipelines/fasterq-dump/.snakemake/conda/fab035359fa42a09dfad78160e9b8543/lib/python3.10/string.py", line 165, in vformat
    result, _ = self._vformat(format_string, args, kwargs, used_args, 2)
  File "/data/scratch/yaochung/TEKRABber_thesis/pipelines/fasterq-dump/.snakemake/conda/fab035359fa42a09dfad78160e9b8543/lib/python3.10/string.py", line 205, in _vformat
    obj, arg_used = self.get_field(field_name, args, kwargs)
  File "/data/scratch/yaochung/TEKRABber_thesis/pipelines/fasterq-dump/.snakemake/conda/fab035359fa42a09dfad78160e9b8543/lib/python3.10/string.py", line 276, in get_field
    obj = getattr(obj, i)
AttributeError: 'Wildcards' object has no attribute 'accession'
[Wed Jun 15 17:10:34 2022]
Error in rule get_fastq_pe_gz:
    jobid: 62
    output: /data/scratch/yaochung/Khrameeva/fastq/SRR8750458_1.fastq.gz, /data/scratch/yaochung/Khrameeva/fastq/SRR8750458_2.fastq.gz
    log: /data/scratch/yaochung/Khrameeva/logs/SRR8750458.log (check log file(s) for error message)
    conda-env: /data/scratch/yaochung/TEKRABber_thesis/pipelines/fasterq-dump/.snakemake/conda/fab035359fa42a09dfad78160e9b8543

RuleException:
CalledProcessError in line 25 of /data/scratch/yaochung/TEKRABber_thesis/pipelines/fasterq-dump/fasterq-dump.snake:
Command 'source /home/yaochung41/anaconda3/bin/activate '/data/scratch/yaochung/TEKRABber_thesis/pipelines/fasterq-dump/.snakemake/conda/fab035359fa42a09dfad78160e9b8543'; set -euo pipefail;  python /data/scratch/yaochung/TEKRABber_thesis/pipelines/fasterq-dump/.snakemake/scripts/tmp4ip6wnot.wrapper.py' returned non-zero exit status 1.
  File "/data/scratch/yaochung/TEKRABber_thesis/pipelines/fasterq-dump/fasterq-dump.snake", line 25, in __rule_get_fastq_pe_gz
  File "/home/yaochung41/anaconda3/envs/snakemake/lib/python3.10/concurrent/futures/thread.py", line 58, in run
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2022-06-15T170843.109776.snakemake.log

Comments (1)

甜妞爱困 2025-02-14 16:23:32

If you look at the wrapper's code, its shell command expects the output filenames to use a wildcard named accession, not sample as in your rule. Rename sample to accession in your filenames and it should work.
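To illustrate why the wildcard name matters, here is a minimal Python sketch of the mechanism visible in the traceback: the command string is filled in with str.format, which looks up the attribute by name on the wildcards object. The Wildcards class and the TEMPLATE string below are hypothetical stand-ins for illustration, not the wrapper's actual code.

```python
class Wildcards:
    """Hypothetical stand-in for the wildcards object Snakemake passes to a wrapper."""

    def __init__(self, **values):
        # Each wildcard becomes an attribute, e.g. Wildcards(sample="X").sample
        self.__dict__.update(values)


# Assumed, simplified command template; the real wrapper's command is longer.
TEMPLATE = "fasterq-dump {wildcards.accession}"


def render(wildcards):
    """Fill in the template; str.format resolves '.accession' via getattr."""
    return TEMPLATE.format(wildcards=wildcards)


def try_render(wildcards):
    """Return the rendered command, or the error text if the attribute is missing."""
    try:
        return render(wildcards)
    except AttributeError as err:
        return f"AttributeError: {err}"


# A rule whose filenames use {sample} defines only 'sample', so the lookup fails.
print(try_render(Wildcards(sample="SRR8750458")))

# A rule whose filenames use {accession} defines the attribute the template needs.
print(try_render(Wildcards(accession="SRR8750458")))
```

Applied to the rule in the question, this would mean writing the two outputs as /data/fastq/{accession}_1.fastq.gz and /data/fastq/{accession}_2.fastq.gz, the log as /data/logs/{accession}.log, and the target in rule all as expand("/data/fastq/{accession}_1.fastq.gz", accession=SAMPLES).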
