Running a bash script to download the SQL Server ODBC driver in Databricks fails

Posted on 2025-02-13 11:19:12

I have a bash script that looks something like this:

curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list > /etc/apt/sources.list.d/mssql-release.list
sudo apt-get update
sudo ACCEPT_EULA=Y apt-get -q -y install msodbcsql17
python -m pip install --upgrade pip
pip install twine keyring artifacts-keyring
pip install -r requirements.txt

I am basically just trying to install the SQL Server ODBC driver and then run some Python commands.

I am trying to run this on a Databricks cluster.

When I do,

%sh
bash <path-to-bash-script.sh>

Or

%sh
sh <path-to-bash-script.sh>

I get an error when trying to download the driver,

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100   983  100   983    0     0  12287      0 --:--:-- --:--:-- --:--:-- 12287
Warning: apt-key output should not be parsed (stdout is not a terminal)
gpg: invalid option "-
"
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100    79  100    79    0     0    975      0 --:--:-- --:--:-- --:--:--   975
E: Invalid operation update
Reading package lists...
Building dependency tree...
Reading state information...
E: Unable to locate package msodbcsql17

Note: I am creating this file locally as part of a project, and I have a CI/CD pipeline that copies the file into a Databricks workspace.

However, when I take the commands in this file and run them directly in a cell using %sh, they run without an issue.

What exactly is the problem here?


Comments (1)

只是我以为 2025-02-20 11:19:12

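The reason behind this is not entirely clear; however, my best guesses are as follows:

  1. It has something to do with hidden characters in the file that is created locally. For instance, Windows might be saving the file with carriage returns before the newlines (CRLF line endings), and this could be affecting the execution of the file. The error output appears consistent with this: in the message gpg: invalid option "- the closing quote lands on the next line, which is what a trailing carriage return after the - would produce, and E: Invalid operation update is what apt-get would report if it received "update" with a carriage return attached.
  2. It has something to do with file permissions (upon checking the permissions on the file, however, this does not seem to be the case).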
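
The first guess can be checked directly. The following is a minimal sketch, not part of the original script: the helper name lines_with_carriage_return and the sample bytes are made up for illustration. It lists the lines of a script that end in a carriage return before the newline.

```python
from typing import List


def lines_with_carriage_return(data: bytes) -> List[int]:
    """Return the 1-based numbers of lines that end in CR before the newline."""
    return [
        number
        for number, line in enumerate(data.split(b"\n"), start=1)
        if line.endswith(b"\r")
    ]


# Hypothetical sample: two script lines saved with CRLF line endings.
sample = (
    b"curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -\r\n"
    b"apt-get update\r\n"
)
print(lines_with_carriage_return(sample))  # → [1, 2]
```

To check the real file, read it in binary mode (for example open(path, "rb").read(), where path is whatever location the CI/CD pipeline copied the script to) and pass the bytes to the helper. An empty list would suggest that line endings are not the cause.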

I was able to resolve this issue by simply creating the file inside the Databricks workspace using dbutils. For example:

dbutils.fs.put("dbfs:/scripts/install_dependencies.sh", """#!/bin/bash
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list > /etc/apt/sources.list.d/mssql-release.list
apt-get update
ACCEPT_EULA=Y apt-get -q -y install msodbcsql17""", True)

This runs without an issue and it seems to be the recommended way to create any init scripts that you want to run on your clusters.

The downside is that you can't exactly version control these scripts, and they need to be overwritten each time a change is required.
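
One possible way to soften that downside, assuming the carriage-return guess above is right, is to keep the script in the repository and normalize its line endings just before writing it to DBFS. This is only a sketch: normalize_line_endings and publish_script are hypothetical helper names, and the writer is passed in as a parameter so the same code can be tried outside a notebook.

```python
from typing import Callable


def normalize_line_endings(data: bytes) -> str:
    """Convert CRLF (and any stray CR) line endings to LF and decode as UTF-8."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n").decode("utf-8")


def publish_script(
    local_path: str,
    dbfs_path: str,
    put: Callable[[str, str, bool], object],
) -> str:
    """Read a version-controlled script, normalize it, and hand it to `put`.

    `put` is expected to accept (path, contents, overwrite), which is how
    dbutils.fs.put is called in the answer above.
    """
    with open(local_path, "rb") as handle:
        contents = normalize_line_endings(handle.read())
    put(dbfs_path, contents, True)  # True means overwrite an existing file
    return contents
```

In a notebook this might be called as publish_script(local_path, "dbfs:/scripts/install_dependencies.sh", dbutils.fs.put), where local_path is wherever the pipeline placed the script. The script itself then stays under version control and only the copy on DBFS is overwritten.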
