Running a bash script to download the SQL Server ODBC driver in Databricks fails
I have a bash script that looks something like this:
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list > /etc/apt/sources.list.d/mssql-release.list
sudo apt-get update
sudo ACCEPT_EULA=Y apt-get -q -y install msodbcsql17
python -m pip install --upgrade pip
pip install twine keyring artifacts-keyring
pip install -r requirements.txt
I am basically just trying to install the SQL Server ODBC driver and then run some Python commands.
I am trying to run this on a Databricks cluster.
When I do,
%sh
bash <path-to-bash-script.sh>
Or
%sh
sh <path-to-bash-script.sh>
I get an error when trying to download the driver:
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 983 100 983 0 0 12287 0 --:--:-- --:--:-- --:--:-- 12287
Warning: apt-key output should not be parsed (stdout is not a terminal)
gpg: invalid option "-
"
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 79 100 79 0 0 975 0 --:--:-- --:--:-- --:--:-- 975
E: Invalid operation update
Reading package lists...
Building dependency tree...
Reading state information...
E: Unable to locate package msodbcsql17
Note: I am creating this file locally as part of a project, and a CI/CD pipeline then copies the file into the Databricks workspace.
However, when I take the commands in this file and run them directly within a %sh cell, they run without an issue.
What exactly is the problem here?
Comments (1)
The reason behind this is not entirely clear; my best guess is that, because the file is created locally and copied over by the CI/CD pipeline, it ends up with Windows-style (CRLF) line endings. bash then executes each command with a trailing carriage return, which would explain both the gpg: invalid option "-" error and E: Invalid operation update.
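If that guess is right, it is easy to reproduce in isolation (a sketch with illustrative script contents, not the author's verified diagnosis):

```python
# A script saved by a Windows editor typically ends each line with "\r\n".
crlf_script = "sudo apt-get update\r\nsudo ACCEPT_EULA=Y apt-get -q -y install msodbcsql17\r\n"

# bash splits on "\n" only, so apt-get receives the argument "update\r"
# and rejects it with "E: Invalid operation update".
first_command = crlf_script.split("\n")[0]
print(repr(first_command))  # 'sudo apt-get update\r'

# Normalising to Unix (LF) line endings fixes the script.
fixed = crlf_script.replace("\r\n", "\n")
```

Tools such as dos2unix, or setting core.autocrlf=input in Git, apply the same normalisation at the source.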
I was able to resolve the issue by creating the file directly inside the Databricks workspace using dbutils. This runs without an issue, and it seems to be the recommended way to create any init scripts that you want to run on your clusters.
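A minimal sketch of that approach, assuming a hypothetical DBFS path (dbutils.fs.put is only available inside a Databricks notebook, so the call is shown as a comment here; the script contents mirror the ones in the question):

```python
# Build the init script contents in the notebook itself; joining with "\n"
# guarantees Unix (LF) line endings regardless of any local editor settings.
script = "\n".join([
    "#!/bin/bash",
    "curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -",
    "curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list"
    " > /etc/apt/sources.list.d/mssql-release.list",
    "apt-get update",
    "ACCEPT_EULA=Y apt-get -q -y install msodbcsql17",
]) + "\n"

# Inside a Databricks notebook, write the file to DBFS (path is illustrative):
# dbutils.fs.put("dbfs:/databricks/init-scripts/install-msodbc.sh",
#                script, overwrite=True)
```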
The downside is that you can't easily version-control these scripts, and they need to be overwritten each time a change is required.