How to parse $QUERY_STRING from a bash CGI script?
I have a bash script that is being used in a CGI. The CGI sets the $QUERY_STRING environment variable by reading everything after the ? in the URL. For example, http://example.com?a=123&b=456&c=ok sets QUERY_STRING=a=123&b=456&c=ok.
Somewhere I found the following ugliness:
b=$(echo "$QUERY_STRING" | sed -n 's/^.*b=\([^&]*\).*$/\1/p' | sed "s/%20/ /g")
which sets $b to whatever was found in $QUERY_STRING for b. However, my script has grown to have over ten input parameters. Is there an easier way to automatically convert the parameters in $QUERY_STRING into environment variables usable by bash?
Maybe I'll just use a for loop of some sort, but it would be even better if the script were smart enough to detect each parameter automatically and maybe build an array that looks something like this:
${parm[a]}=123
${parm[b]}=456
${parm[c]}=ok
How could I write code to do that?
Comments (16)
Try this:
Now you have this:
In Bash 4, which has associative arrays, you can do this (using the array created above):
which will give you this:
Edit:
To use indirection in Bash 2 and later (using the parm array created above):
Then you will have:
You can access these directly:
or indirectly:
If possible, it's better to avoid indirection, since it can make code messy and be a source of bugs.
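As a rough illustration of the steps this answer describes, the sketch below splits the query string into an indexed array named parm and then pairs the entries up into a Bash 4 associative array. The sample QUERY_STRING value, the IFS setting of '=&', and the name array are assumptions of this sketch, not necessarily what the answer's own code used.

```shell
#!/usr/bin/env bash
# Sample input; in a real CGI the web server sets QUERY_STRING.
QUERY_STRING='a=123&b=456&c=ok'

# Split on both '=' and '&' so names and values alternate in one indexed array.
saveIFS=$IFS
IFS='=&'
set -f                      # no filename expansion while splitting
parm=($QUERY_STRING)
set +f
IFS=$saveIFS

# Bash 4+: pair the alternating entries up into an associative array.
declare -A array
for ((i = 0; i < ${#parm[@]}; i += 2)); do
    array[${parm[i]}]=${parm[i+1]}
done

echo "${parm[1]} ${array[b]} ${array[c]}"   # prints "123 456 ok"
```

Because '=' is also a separator here, a value that itself contains '=' would shift the pairing, so this form only suits simple values.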
You can break $QUERY down using IFS. For example, set it to &.
And you can save to a hash/dictionary in Bash 4+.
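A minimal sketch of that idea, assuming the query string is held in a variable named QUERY and that Bash 4 or later is available. The names pairs and hash are made up for the example.

```shell
#!/usr/bin/env bash
QUERY='a=123&b=456&c=ok'

# Split into name=value pairs on '&' only; IFS is changed just for this read.
IFS='&' read -r -a pairs <<< "$QUERY"

declare -A hash
for pair in "${pairs[@]}"; do
    key=${pair%%=*}                              # text before the first '='
    [[ -n $key && $pair == *=* ]] || continue    # skip malformed pairs
    hash[$key]=${pair#*=}                        # text after the first '='
done

echo "${hash[a]} ${hash[b]} ${hash[c]}"   # prints "123 456 ok"
```

Splitting on '&' first and on the first '=' second keeps values that contain '=' intact.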
Please don't use the evil eval junk.
Here's how you can reliably parse the string and get an associative array:
If you don't like the key check, you could do this instead:
Listing all the keys and values from the array:
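One eval-free way to do what this answer outlines is a read loop that uses '&' as the record delimiter and '=' as the field separator. This is a sketch under the assumption of Bash 4+; the array name param and the placement of the empty-key check are choices made for this example.

```shell
#!/usr/bin/env bash
QUERY_STRING='a=123&b=456&c=ok'

declare -A param
# read -d '&' consumes one name=value record at a time; IFS='=' splits it
# into key and value (everything after the first '=' lands in value).
while IFS='=' read -r -d '&' key value; do
    [[ -n $key ]] || continue        # key check: ignore empty names
    param[$key]=$value
done <<< "${QUERY_STRING}&"          # trailing '&' terminates the last record

# List all keys and values (associative arrays have no guaranteed order).
for key in "${!param[@]}"; do
    printf '%s=%s\n' "$key" "${param[$key]}"
done
```

Nothing from the query string is ever executed here; it is only stored as data.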
To convert the contents of QUERY_STRING into bash variables, use the following command:
The inner step, echo ${QUERY_STRING//&/;}, substitutes all ampersands with semicolons, producing a=123;b=456;c=ok, which the eval then evaluates in the current shell. The result can then be used as bash variables.
The assumptions are:
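Whatever the exact assumptions, eval runs the query string as shell code, so untrusted input is a real risk. The sketch below is one possible way to make the assumptions explicit: it only calls eval when the whole string matches a conservative pattern. The pattern is an assumption of this sketch (names are shell identifiers; values contain only letters, digits, '.', '_' and '-').

```shell
#!/usr/bin/env bash
QUERY_STRING='a=123&b=456&c=ok'

# One name=value pair, then zero or more '&'-separated pairs of the same shape.
pair='[A-Za-z_][A-Za-z0-9_]*=[A-Za-z0-9._-]*'
re="^${pair}(&${pair})*\$"

if [[ $QUERY_STRING =~ $re ]]; then
    # Turns a=123&b=456&c=ok into a=123;b=456;c=ok and runs the assignments.
    eval "${QUERY_STRING//&/;}"
else
    echo "rejecting unexpected query string" >&2
fi

echo "$a $b $c"   # prints "123 456 ok"
```

Even with this check, a request can still assign any variable name it likes (PATH or IFS, for example), so an explicit list of accepted names is the safer design.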
I packaged the sed command up into another script:
$ cat getvar.sh
and I call it from my main CGI as:
...etc., etc. You get the idea.
It works for me even with a very basic busybox appliance (my PVR in this case).
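For illustration, here is how the sed one-liner from the question might be wrapped for reuse. It is written as a shell function rather than a separate getvar.sh file so that the example is self-contained. The function name getvar and the leading '&' trick are assumptions of this sketch: the '&' is prepended so that a request for b cannot accidentally match a parameter named ab.

```shell
#!/bin/sh
QUERY_STRING='ab=1&b=hello%20world&c=ok'

# getvar NAME: print the value of NAME from QUERY_STRING, with %20 decoded.
# NAME is assumed to be a plain identifier with no regex metacharacters.
getvar() {
    printf '&%s\n' "$QUERY_STRING" |
        sed -n "s/^.*&$1=\([^&]*\).*\$/\1/p" |
        sed 's/%20/ /g'
}

b=$(getvar b)
c=$(getvar c)
echo "b=$b c=$c"   # prints "b=hello world c=ok"
```

Only sed and printf are needed, which fits minimal environments such as Busybox.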
While the accepted answer is probably the most beautiful one, there might be cases where security is super-important and also needs to be clearly visible in your script.
In such a case, first of all I wouldn't use bash for the task. But if it has to be done for some reason, it might be better to avoid these new array and dictionary features, because you can't be sure exactly how they are escaped.
In this case, the good old primitive solutions might work:
This iterates over the name-value pairs of the QUERY_STRING, and there is no way to circumvent it with any tricky escape sequence. The " is a very strong thing in bash: apart from a single variable name substitution, which is fully controlled by us, nothing can be tricked.
Furthermore, you can inject your own processing code at the "# ..." placeholder. This enables you to allow only your own, well-defined (and, ideally, short) list of allowed variable names. Needless to say, LD_PRELOAD shouldn't be one of them. ;-)
Furthermore, no variable will be exported, and only QS, nameval, name and val are used.
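The primitive loop described in the security-focused answer can be sketched roughly as follows, using only POSIX parameter expansion and a case statement as the allow-list. The accepted names a and b and the sample input are assumptions made for this example.

```shell
#!/bin/sh
QUERY_STRING='a=123&LD_PRELOAD=evil.so&b=456'

QS=$QUERY_STRING
a=
b=
while [ -n "$QS" ]; do
    nameval=${QS%%&*}                 # first name=value pair
    case $QS in
        *'&'*) QS=${QS#*&} ;;         # drop that pair and its '&'
        *)     QS= ;;                 # that was the last pair
    esac
    name=${nameval%%=*}
    case $nameval in
        *=*) val=${nameval#*=} ;;
        *)   val= ;;
    esac
    # Allow-list: only names listed here are ever assigned.
    case $name in
        a) a=$val ;;
        b) b=$val ;;
        *) ;;                         # anything else is ignored
    esac
done

echo "a=$a b=$b"   # prints "a=123 b=456"
```

The query string is never evaluated and never chooses a variable name itself, so LD_PRELOAD in the sample input has no effect.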
Following the correct answer, I made some changes of my own to support array variables, as in this other question. I also added a decode function whose author I cannot find to give credit.
The code looks somewhat messy, but it works. Changes and other recommendations would be greatly appreciated.
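A decode step of the kind mentioned here is often written in bash with the following idiom. This is a generic sketch, not necessarily the function the answer used, and the name urldecode is an assumption.

```shell
#!/usr/bin/env bash
# urldecode STRING: turn '+' into a space and %XX escapes into bytes.
urldecode() {
    local s=${1//+/ }              # '+' encodes a space in form data
    printf '%b' "${s//%/\\x}"      # rewrite %XX as \xXX and let printf expand it
}

decoded=$(urldecode 'hello%20world%21+ok')
echo "$decoded"   # prints "hello world! ok"
```

Note that %b also interprets backslash sequences already present in the input, so stricter validation may be wanted for untrusted data.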
I would simply replace & with ;. It will become something like:
So now you just need to evaluate it and read your vars:
One can use bash-cgi.sh, which processes:
the query string into the $QUERY_STRING_GET key and value array;
the POST request data (x-www-form-urlencoded) into the $QUERY_STRING_POST key and value array;
the cookie data into the $HTTP_COOKIES key and value array.
It requires bash version 4.0 or higher (to define the key and value arrays above).
All processing is done by bash alone (i.e. in one process), without any external dependencies or additional process invocations.
It has:
a check for the maximum length of data that can be passed to its input, as well as processed as query string and cookies;
a redirect() procedure that produces a redirect to itself with the extension changed to .html (useful for one-page sites);
an http_header_tail() procedure that outputs the last two strings of the HTTP(S) response header;
a sanitizer that protects the $REMOTE_ADDR value from possible injections;
a parser and evaluator of the escaped UTF-8 symbols embedded in the values passed to $QUERY_STRING_GET, $QUERY_STRING_POST and $HTTP_COOKIES;
a sanitizer of the $QUERY_STRING_GET, $QUERY_STRING_POST and $HTTP_COOKIES values against possible SQL injections (escaping like the mysql_real_escape_string PHP function does, plus escaping of @ and $).
It is available here:
https://github.com/VladimirBelousov/fancy_scripts
A nice way to handle CGI query strings is to use Haserl, which acts as a wrapper around your Bash CGI script and offers convenient and secure query string parsing.
This works in dash, using a for-in loop:
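A for-in loop of this kind can be sketched in plain POSIX sh, which is what dash implements. The variable names and the printf output format are assumptions of this example.

```shell
#!/bin/sh
QUERY_STRING='a=123&b=456&c=ok'

oldIFS=$IFS
IFS='&'          # split the unquoted expansion below on '&' only
set -f           # disable filename expansion during the split
for pair in $QUERY_STRING; do
    name=${pair%%=*}       # text before the first '='
    value=${pair#*=}       # text after the first '='
    printf '%s -> %s\n' "$name" "$value"
done
set +f
IFS=$oldIFS
```

No arrays are needed, so the loop body is the place to act on each name and value as it is seen.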
To bring this up to date: if you have a recent Bash version, you can achieve this with regular expressions:
declare -A params
# $q holds the query string; $re1 matches one parameter and $re2 splits it into name and value.
while [[ $q =~ $re1 ]]; do
  q=${q##*${BASH_REMATCH[0]}}
  [[ ${BASH_REMATCH[1]} =~ $re2 ]] && params+=([${BASH_REMATCH[1]}]=${BASH_REMATCH[2]})
done
If you don't want to use associative arrays, just change the penultimate line to do what you want. On each iteration of the loop the parameter is in ${BASH_REMATCH[1]} and its value is in ${BASH_REMATCH[2]}.
Here is the same thing as a function in a short test script that iterates over the array and outputs the query string's parameters and their values:
Note that the parameters end up in the array in reverse order (it's associative, so that shouldn't matter).
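For a complete, runnable variant of the regular-expression loop, the sketch below supplies its own patterns. The values of re1 and re2 here are assumptions of this sketch, not the ones from the answer. It also strips the matched prefix with a quoted ${q#...} instead of the answer's ${q##*...}, so the matched text is never treated as a glob pattern.

```shell
#!/usr/bin/env bash
q='a=123&b=456&c=ok'

re1='^([^&=]+=[^&]*)&?'    # one name=value pair at the start, optional '&'
re2='^([^=]+)=(.*)$'       # split a pair into name and value

declare -A params
while [[ $q =~ $re1 ]]; do
    q=${q#"${BASH_REMATCH[0]}"}          # drop the pair just matched
    # The second match overwrites BASH_REMATCH: [1] is the name, [2] the value.
    [[ ${BASH_REMATCH[1]} =~ $re2 ]] && params[${BASH_REMATCH[1]}]=${BASH_REMATCH[2]}
done

echo "${params[a]} ${params[b]} ${params[c]}"   # prints "123 456 ok"
```

The loop stops at the first piece of input that does not look like name=value, leaving the unparsed remainder in q.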
Why not this:
Now you have this:
@giacecco
To include a hyphen in the regex, you could change two lines in @starfry's answer.
Change these two lines:
To these two lines:
For all those who couldn't get it working with the posted answers (like me), this guy figured it out.
Unfortunately, I can't upvote his post...
Let me quickly repost the code here:
Hope this helps somebody.
Cheers
Actually, I liked bolt's answer, so I made a version that works with Busybox as well (ash in Busybox does not support here-strings).
This code accepts the key1 and key2 parameters; all others are ignored.
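One way to get that behaviour without here-strings is to feed the read loop from a here-document, which POSIX shells such as Busybox ash support. This is a sketch of the idea rather than the answer's own code; the sample input is an assumption.

```shell
#!/bin/sh
QUERY_STRING='key1=foo&key2=bar&other=ignored'

key1=
key2=
# tr puts each name=value pair on its own line; the here-document feeds the
# loop without a pipeline, so the assignments happen in the current shell.
while IFS='=' read -r name value; do
    case $name in
        key1) key1=$value ;;
        key2) key2=$value ;;
        *) ;;                  # every other parameter is ignored
    esac
done <<EOF
$(printf '%s\n' "$QUERY_STRING" | tr '&' '\n')
EOF

echo "key1=$key1 key2=$key2"   # prints "key1=foo key2=bar"
```

Piping straight into the while loop would run it in a subshell in many shells, and the variables would be lost afterwards, which is why the here-document is used.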