How do we find all the extra dependencies of PySpark when deploying via pip?
I am trying to deploy PySpark locally using the instructions at
https://spark.apache.org/docs/latest/api/python/getting_started/install.html#using-pypi
I can see that extra dependencies are available, such as sql and pandas_on_spark, which can be installed with
pip install pyspark[sql,pandas_on_spark]
But how can we find all available extras?
Looking at the JSON metadata for the pyspark package at https://pypi.org/pypi/pyspark/json (based on https://wiki.python.org/moin/PyPIJSON), I could not find the possible extra dependencies (as described in "What is 'extra' in pypi dependency?"); the value of requires_dist is null.
Many thanks for your help.
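For reference, a minimal sketch of that check (assuming the requests package is available; the package name and URL are the ones mentioned above):

import requests

# Query the PyPI JSON API for the pyspark package.
resp = requests.get("https://pypi.org/pypi/pyspark/json", timeout=10)
resp.raise_for_status()
info = resp.json()["info"]

# For many packages this field lists requirements, including markers such
# as 'extra == "sql"', from which the extras could be read. For pyspark
# the value is null (None), so the extras cannot be discovered this way.
print(info["requires_dist"])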
1 Answer
As far as I know, you cannot easily get the list of extras. If the list is not clearly documented, you have to look at the code/config used for packaging. In this case, the extras are declared in pyspark's setup.py, which gives the following list: ml, mllib, sql, and pandas_on_spark.
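If pyspark is already installed, the declared extras can also be read from the installed package's metadata instead of the PyPI JSON. A minimal sketch (assuming Python 3.8+ and a local pyspark install):

from importlib.metadata import metadata

# Each extra declared in setup.py's extras_require appears as a
# separate "Provides-Extra" header in the installed metadata.
md = metadata("pyspark")
print(md.get_all("Provides-Extra"))  # e.g. ['ml', 'mllib', 'sql', 'pandas_on_spark']

Note that this only works after installation, which is why querying the PyPI JSON alone is not enough for pyspark.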