Getting arguments from a file into a Python script in the correct format via a shell script

Posted on 2024-09-10 15:32:34

I have the following shell script:

#! /bin/sh

while read page_section
do
  page=${page_section%%\ *}
  section=${page_section#* }     # NOTE: `#* }` is NOT a comment

  wget --quiet --no-proxy www.cs.sun.ac.za/hons/$page -O html.tmp & wait

#  echo ${page_section%%\ *} # verify correct string chopping
#  echo ${page_section#* }   # verify correct string chopping

  ./DokuWikiHtml2Latex.py html.tmp $section & wait
done < inputfile

And an input file like this:

doku.php?id=ndewet:tools:tramonitor TraMonitor
doku.php?id=ndewet:description Implementation -1
doku.php?id=ndewet:description Research\ Areas -1

The script downloads a number of web pages specified in inputfile and must then pass the rest of each line (e.g. "Implementation -1" or "Research\ Areas -1") to the Python script.

Now for the sticky bit. When the third line of this example file is processed it passes "Research\ Areas" to the python script as two separate arguments, as confirmed by:

>>> print sys.argv
['./DokuWikiHtml2Latex.py', 'html.tmp', 'Research', 'Areas', '-1']

How can I get a multi-word section, like "Research Areas", from the input file into a single argument for the Python script? I've tried escaping the '\', and also doing

./DokuWikiHtml2Latex.py html.tmp `echo ${section#* }`

among other things, but to no avail.

The number at the end of an input line is another argument, but optional.


Comments (3)

千年*琉璃梦 2024-09-17 15:32:34

Put quotes around $section:

./DokuWikiHtml2Latex.py html.tmp "$section" & wait
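To see what the quotes change, here is a minimal sketch. The `count_args` helper is hypothetical and exists only for illustration: it prints how many arguments it receives. The sketch assumes `section` holds the whole remainder of the input line, as in the question's script, so quoting also keeps the trailing `-1` inside the same single argument.

```shell
#!/bin/sh
# count_args is a hypothetical helper: it prints the number of arguments it receives.
count_args() {
  echo "$#"
}

# Assumed value of $section after the question's script has processed line three.
section='Research Areas -1'

unquoted=$(count_args $section)    # word splitting yields three arguments
quoted=$(count_args "$section")    # quoting keeps the value as one argument

echo "unquoted: $unquoted"
echo "quoted: $quoted"
```

If the number must arrive as its own argument, the line has to be split into fields before the call rather than passed as one quoted string.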

可可 2024-09-17 15:32:34

Just let read do the parsing stuff:

while read page section rest
do
    echo "Page: $page"
    echo "Section: $section"
done < inputfile

For handling the optional argument elegantly, use an array:

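# Note: read -a and arrays require bash; they are not available in plain POSIX sh.
while read -a fields
do
    # fields[0] is the page; the remaining fields are the script's arguments
    wget --quiet --no-proxy "www.cs.sun.ac.za/hons/${fields[0]}" -O html.tmp
    unset "fields[0]"
    ./DokuWikiHtml2Latex.py html.tmp "${fields[@]}"
done < inputfile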

Always quote your variables!
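Since the question's script starts with `#! /bin/sh`, a variant that avoids arrays may be useful. The following is only a sketch under stated assumptions: `show_args` is a hypothetical stand-in for `./DokuWikiHtml2Latex.py`, the `wget` step is left out, and the input lines are fed through `printf` instead of `inputfile` so the example is self-contained.

```shell
#!/bin/sh
# show_args is a hypothetical stand-in for ./DokuWikiHtml2Latex.py:
# it prints each argument it receives in square brackets.
show_args() {
  for arg in "$@"; do
    printf '[%s]' "$arg"
  done
  printf '\n'
}

# Without -r, read treats "\ " as an escaped (literal) space, so
# "Research\ Areas" lands in $section as one field; the rest goes to $rest.
process() {
  while read page section rest
  do
    # ${rest:+"$rest"} expands to nothing when rest is empty,
    # so the optional number is passed only when it is present.
    show_args html.tmp "$section" ${rest:+"$rest"}
  done
}

output=$(printf '%s\n' \
  'doku.php?id=ndewet:tools:tramonitor TraMonitor' \
  'doku.php?id=ndewet:description Implementation -1' \
  'doku.php?id=ndewet:description Research\ Areas -1' | process)

echo "$output"
```

The design relies on `read` being called without `-r`; with `-r` the backslash would be kept and the line would split at every space.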

清醇 2024-09-17 15:32:34

Normally multi-word arguments can be passed as one by using quotes, so:

doku.php?id=ndewet:description "Research Areas" -1
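Quotes in the input file only help if something interprets them, and `read` does not: it keeps the quote characters as literal text. One tool that does interpret them is `xargs`. The sketch below assumes `xargs` with the `-L` option is available, and the inline `sh -c` script is a hypothetical stand-in for the Python script.

```shell
#!/bin/sh
line='doku.php?id=ndewet:description "Research Areas" -1'

# read splits on whitespace and keeps the quote characters as literal text.
read_section=$(printf '%s\n' "$line" | { read page section rest; printf '%s' "$section"; })
echo "read:  $read_section"

# xargs interprets the double quotes, so "Research Areas" stays one argument.
# The inline sh -c script stands in for the Python script and prints its arguments.
xargs_args=$(printf '%s\n' "$line" | xargs -L 1 sh -c 'for a in "$@"; do printf "[%s]" "$a"; done' sh)
echo "xargs: $xargs_args"
```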