setuptools: package data folder location
I use setuptools to distribute my Python package. Now I need to distribute additional data files.
From what I've gathered from the setuptools documentation, I need to have my data files inside the package directory. However, I would rather have my data files inside a subdirectory in the root directory.
What I would like to avoid:
/ #root
|- src/
| |- mypackage/
| | |- data/
| | | |- resource1
| | | |- [...]
| | |- __init__.py
| | |- [...]
|- setup.py
What I would like to have instead:
/ #root
|- data/
| |- resource1
| |- [...]
|- src/
| |- mypackage/
| | |- __init__.py
| | |- [...]
|- setup.py
I just don't feel comfortable having so many subdirectories if they are not essential. I fail to find a reason why I *have* to put the files inside the package directory. Working with so many nested subdirectories is also cumbersome, IMHO. Or is there a good reason that would justify this restriction?
Answers (4)
Option 1: Install as package data
The main advantage of placing data files inside the root of your Python package is that you don't have to worry about where the files will live on a user's system, which may be Windows, Mac, Linux, some mobile platform, or inside an Egg. You can always find the data directory relative to your Python package root, no matter where or how it is installed.
For example, given a project layout with data inside the package, you can add a function to __init__.py that returns the absolute path to a data file.
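A minimal sketch of such a helper is shown here. It is an illustration, not necessarily the answer's original code: the function name get_data and the package_file parameter are assumptions, and the directory name data follows the layout discussed in the question.

```python
import os


def get_data(path, package_file):
    """Return the absolute path of a file inside the package's data/ directory.

    package_file is expected to be the __file__ of the package's __init__.py
    (an assumption of this sketch), so the result follows the package wherever
    it happens to be installed.
    """
    package_root = os.path.abspath(os.path.dirname(package_file))
    return os.path.join(package_root, "data", path)
```

Inside mypackage/__init__.py this would be called as get_data("resource1", __file__). Note that it only helps for packages unpacked on the filesystem; a zipped Egg needs a resource API instead.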
After the project is installed as an Egg, the path to data will change, but the code doesn't need to change.
Option 2: Install to fixed location
The alternative would be to place your data outside the Python package and then tell your code where data lives, for example via a configuration file or command-line arguments.
This is far less desirable if you plan to distribute your project. If you really want to do this, you can install your data wherever you like on the target system by passing in a list of tuples (the data_files argument to setup()) that specifies the destination for each group of files.
Updated: Example of a shell function to recursively grep Python files:
I think I found a good compromise that lets you maintain the structure you want, with data/ at the project root next to src/.
You should install the data as package_data, to avoid the problems described in samplebias's answer, but in order to maintain the file structure you need to add a few lines to your setup.py.
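One possible way to do this, sketched under the assumption that a temporary symlink is acceptable on your platform, is to link the root-level data directory into the package only while setup() runs. The helper name temporary_symlink is hypothetical, and the paths follow the layout from the question.

```python
import os
from contextlib import contextmanager


@contextmanager
def temporary_symlink(target, link_path):
    """Create a symlink at link_path pointing to target, and remove it on exit."""
    os.symlink(target, link_path)
    try:
        yield link_path
    finally:
        # Remove only the link itself; the real data directory is untouched.
        os.unlink(link_path)


# Hypothetical use in setup.py (paths are relative to the project root; the
# link target is resolved relative to the directory that contains the link):
#
# with temporary_symlink("../../data", "src/mypackage/data"):
#     setup(
#         name="main_package",
#         package_dir={"": "src"},
#         packages=["mypackage"],
#         package_data={"mypackage": ["data/*"]},
#     )
```

Creating symlinks may require extra privileges on Windows, so treat this as a sketch to adapt, not a portable recipe.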
This way we create the appropriate structure "just in time" and keep our source tree organized.
To access such data files within your code, you 'simply' use:
from pkg_resources import Requirement, resource_filename
data = resource_filename(Requirement.parse("main_package"), 'mypackage/data')
I still don't like having to specify 'mypackage' in the code, as the data does not necessarily have anything to do with this module, but I guess it's a good compromise.
I could use importlib_resources or importlib.resources (depending on Python version).
https://importlib-resources.readthedocs.io/en/latest/using.html
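As a rough illustration of that approach, the sketch below reads a resource through the files() API, which is available in importlib.resources from Python 3.9 and in the importlib_resources backport for older versions. The helper name read_resource is an assumption made for this example.

```python
try:
    # Standard library, Python 3.9 and later.
    from importlib.resources import files
except ImportError:
    # Backport for older Python versions (installed separately).
    from importlib_resources import files


def read_resource(package, name):
    """Return the bytes of a resource stored inside an importable package."""
    return files(package).joinpath(name).read_bytes()
```

With the layout from the question this would be called as read_resource("mypackage", "data/resource1"), which still assumes the data directory is shipped inside the package.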
I think you can pass essentially any files to setup() through the data_files argument.
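For reference, data_files takes a list of (target directory, [source files]) tuples. The sketch below builds such a list from a root-level data directory; the helper name collect_data_files and the install directory share/mypackage are assumptions chosen for illustration.

```python
import os


def collect_data_files(source_dir, install_dir):
    """Build a data_files list that mirrors source_dir under install_dir."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(source_dir):
        if not filenames:
            continue  # skip directories that contain no files
        relative = os.path.relpath(dirpath, source_dir)
        if relative == ".":
            target = install_dir
        else:
            target = os.path.join(install_dir, relative)
        sources = [os.path.join(dirpath, name) for name in sorted(filenames)]
        entries.append((target, sources))
    return entries


# Hypothetical use in setup.py:
#
# setup(
#     name="main_package",
#     data_files=collect_data_files("data", "share/mypackage"),
# )
```

A relative target directory is interpreted relative to the installation prefix, so the code that reads the files later has to locate that prefix itself. This is the drawback of the fixed-location approach described in the first answer.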