Handling environment variables in configuration options

Posted 2025-01-16 17:07:40

I have a snakemake command line with configuration options like this:

snakemake --config \
    f1=$PWD/file1.txt \
    f2=$PWD/file2.txt \
    f3=/path/to/file3.txt \
    ...more key-value pairs \
    --directory /path/to/output/dir

file1.txt and file2.txt are expected to be in the same directory as the Snakefile; file3.txt is somewhere else. I need the paths to the files to be absolute, hence the $PWD variable, so Snakemake can find the files after moving to /path/to/output/dir.

Because I am starting to have several configuration options, I would like to move all the --config items to a separate yaml configuration file. The problem is: how do I transfer the variable $PWD to a configuration file?

I could have a dummy string in the yaml file indicating that that string is to be replaced by the directory where the Snakefile is (e.g. f1: <replace me>/file1.txt) but I feel it's awkward. Any better ideas? It may be that I should rethink how the files fileX.txt are passed to snakemake...

Comments (2)

迎风吟唱 2025-01-23 17:07:41

One option is to use an external module, intake, to handle the environment variable integration. There is a similar answer, but a more specific example for this question is as follows.

The yaml file follows the syntax expected by intake: a field called sources contains nested entries, each of which specifies at the very least a (possibly local) url at which the file can be accessed:

# config.yml
sources:
  file1:
    args:
      url: "{{env(PWD)}}/file1.txt"
  file2:
    args:
      url: "{{env(PWD)}}/file2.txt"

Inside the Snakefile, the relevant code would be:

import intake
cat = intake.open_catalog('config.yml')
f1 = cat['file1'].urlpath
f2 = cat['file2'].urlpath

Note that intake also provides a parameterization syntax for less verbose yaml files; see the docs or this example.
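
If adding a dependency is not desirable, a similar effect can probably be had with the standard library alone. The sketch below is not the intake approach above but a swapped-in alternative: it assumes the yaml file stores values such as `f1: $PWD/file1.txt` literally and that the invoking shell exports PWD. The helper name `expand_env_in_config` and the `example_config` dict are hypothetical stand-ins; in a Snakefile the dict would be Snakemake's own `config`.

```python
import os


def expand_env_in_config(config):
    """Return a copy of config with $VAR / ${VAR} expanded in string values."""
    expanded = {}
    for key, value in config.items():
        if isinstance(value, str):
            # os.path.expandvars leaves unknown variables unchanged
            expanded[key] = os.path.expandvars(value)
        else:
            # non-string values (numbers, lists, ...) are passed through
            expanded[key] = value
    return expanded


# Hypothetical stand-in for the dict loaded from a yaml config file
example_config = {
    "f1": "$PWD/file1.txt",
    "f3": "/path/to/file3.txt",
    "threads": 4,
}
```

Calling `expand_env_in_config(example_config)` near the top of the Snakefile would give expanded paths for f1 while leaving f3 and the non-string entry as they are. Keep in mind that $PWD is the directory snakemake was launched from, which matches the Snakefile's directory only if the command is run from there.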

思念满溢 2025-01-23 17:07:41

You can access the directory the Snakefile lives in with workflow.basedir. You might be able to get away with specifying relative paths in the config file and then building the absolute paths in your Snakefile, e.g.:

import pathlib

# Paths in the config are relative to the directory containing the Snakefile
file1 = pathlib.Path(workflow.basedir) / config["f1"]
file2 = pathlib.Path(workflow.basedir) / config["f2"]
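
To see how this behaves for a mix of relative and absolute entries, here is a minimal standalone sketch. The `basedir` string stands in for workflow.basedir and the `config` dict for Snakemake's config (both are only available inside a Snakefile); the helper name `resolve_against` is hypothetical. It relies on the pathlib rule that joining onto an absolute path discards the left-hand side, so an entry like f3 passes through unchanged on POSIX systems.

```python
import pathlib


def resolve_against(basedir, config, keys):
    """Join each config[key] onto basedir; absolute entries stay as they are."""
    base = pathlib.Path(basedir)
    return {key: base / config[key] for key in keys}


# Stand-ins for workflow.basedir and the loaded yaml config
basedir = "/path/to/workflow"
config = {"f1": "file1.txt", "f2": "file2.txt", "f3": "/path/to/file3.txt"}

paths = resolve_against(basedir, config, ["f1", "f2", "f3"])
```

With this, the yaml file only needs `f1: file1.txt`, and the same key can later be pointed at an absolute location without touching the Snakefile.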