String comparison precedence in Bash
The following example compares every file in a directory against an input string ($string) and returns the matching filename. It is not a very elegant or efficient way of doing this. For speed, I modified the for condition so that it only compares against files that start with the first word of $string.
The problem with this script is the following. I have two files in the directory:
Foo Bar.txt
Foo Bar Foo.txt
and I compare them to the string "Foo Bar 09.20.2010". This returns both files in that directory, as both files match. But I need to return only the file that matches the string most exactly; in our example it should be Foo Bar.txt.
Also, if you have better ideas for solving this problem, please post them, as I am not that proficient in scripting yet and I am sure there are better and maybe even easier ways of doing this.
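#!/bin/bash
string="Foo Bar 09.20.2010"
# Only look at files whose names start with the first word of $string.
for file in /path/to/directory/"$(echo "$string" | awk '{print $1}')"*; do
    filename="${file##*/}"      # strip the directory part
    filename="${filename%.*}"   # strip the extension
    # Match if $string starts with the filename (case-insensitive; the
    # filename is used as a grep pattern).
    if echo "$string" | grep -qi "^$filename"; then
        result="$file"
        echo "$result"
    fi
done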
Here is a breakdown of what I want to achieve. There are two files in the directory to match against two strings; Correct/Incorrect in brackets indicates whether the result was what I expected/wanted.
Two files in the directory (extensions stripped for matching):
Foo Bar.txt
Foo Bar Foo.txt
To compare against two strings:
Foo Bar Random Additional Text
Foo Bar Foo Random Additional Text
Results:
compare "Foo Bar"(.txt) against Foo Bar Random Additional Text -> Match (Correct)
compare "Foo Bar"(.txt) against Foo Bar Foo Random Additional Text -> Match (Incorrect)
compare "Foo Bar Foo"(.txt) against Foo Bar Random Additional Text -> NOT Match (Correct)
compare "Foo Bar Foo"(.txt) against Foo Bar Foo Random Additional Text -> Match (Correct)
Thank you everyone for your answers.
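One way to read the table above is that, among the file names that are a prefix of the string, the longest one should win. The sketch below illustrates that reading only. The function name best_prefix_match and its two arguments (directory, string) are made up for this example, and it compares names literally rather than as grep patterns.

```shell
#!/bin/bash
# Hypothetical helper (not from the original post): print the file in
# directory $1 whose name, minus extension, is the longest
# case-insensitive prefix of the string $2. Prints nothing if none match.
best_prefix_match() {
    local dir=$1 string=$2
    local file name lower_name lower_string best="" best_len=0
    lower_string=$(printf '%s' "$string" | tr '[:upper:]' '[:lower:]')
    for file in "$dir"/*; do
        [ -e "$file" ] || continue     # glob matched nothing
        name=${file##*/}               # strip the directory part
        name=${name%.*}                # strip the extension
        lower_name=$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]')
        case $lower_string in
            "$lower_name"*)            # the name is a literal prefix of the string
                if [ "${#name}" -gt "$best_len" ]; then
                    best=$file
                    best_len=${#name}
                fi
                ;;
        esac
    done
    if [ -n "$best" ]; then
        printf '%s\n' "$best"
    fi
}
```

With the two example files in place, best_prefix_match /path/to/directory "Foo Bar Foo Random Additional Text" would be expected to print the path of Foo Bar Foo.txt, while the string "Foo Bar Random Additional Text" would select Foo Bar.txt. Note that this is a plain prefix test, so it does not check that the match ends on a word boundary.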
Comments (2)
Correct me if I'm wrong, but it appears that your script is equivalent to:
If you only want one file name out of it, you can use head. Since ls lists files alphabetically, you'll get the first one in alphabetical order. (Notice that when ls's output is piped to another program it prints one file name per line, making it easier to process than its normal column-based output.)
For the shortest match, try something like the following, which uses an awkward combination of awk, sort -n, and cut to order the lines from shortest to longest and then print the first one.
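A rough sketch of the two pipelines described in this answer, under stated assumptions: the function names and the prefix argument are invented here, and the exact commands in the original answer may have differed. first_match takes the alphabetically first name via ls and head. shortest_line reads names on standard input, uses awk to prepend each line's length, sort -n to order by it, and cut to strip it again.

```shell
#!/bin/bash
# Hypothetical helper: first file (in ls order) whose path starts with $1.
first_match() {
    ls -d -- "$1"* 2>/dev/null | head -n 1
}

# Hypothetical helper: read lines on stdin, print the shortest one.
shortest_line() {
    awk '{ print length($0) "\t" $0 }' |   # prefix each line with its length and a tab
        sort -n |                           # shortest first
        head -n 1 |
        cut -f2-                            # drop the length field (tab-delimited)
}
```

Something like ls -d -- /path/to/directory/"Foo Bar"* | shortest_line would then print the shortest matching path. Like any pipeline built on ls output, this assumes file names do not contain newlines.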
A lot of your echo and awk calls are superfluous. To get all the files that begin with your match string, you can simply evaluate "$string"*. For example, passing that glob to either echo or ls will generate your list. (In a pipe, echo will have the names space-separated, and ls will have them newline-separated.)
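For instance, assuming the leading part of the file names is passed in as an argument, the two forms might be wrapped like this. The function names are hypothetical, and both look in the current directory.

```shell
#!/bin/bash
# Hypothetical wrappers around the two commands; $1 is the leading part of
# the file names to look for in the current directory.
matches_with_echo() {
    echo "$1"*          # the shell expands the glob; echo joins names with spaces
}

matches_with_ls() {
    ls -d -- "$1"*      # when piped or captured, ls prints one name per line
}
```

If nothing matches, the glob is left unexpanded by default, so echo prints the literal pattern and ls reports an error. Because the file names here contain spaces, the newline-separated ls form is the easier one to feed into further commands.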
The next step is to realize that, as you have defined it, your extra constraint of "most exact match" is equivalent to the shortest matching filename.
To find the shortest string in a set of strings in bash (I'd prefer perl myself, but let's stick with the constraint of doing it in bash):
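A minimal sketch consistent with the description that follows. The function name shortest_match and its prefix argument are assumptions for illustration, not the answer's original code, and head -n 1 is used as the portable spelling of head -1.

```shell
#!/bin/bash
# Hypothetical helper: among the files whose path starts with $1, print the
# one with the shortest path.
shortest_match() {
    local file
    for file in "$1"*; do
        [ -e "$file" ] || continue      # glob matched nothing
        echo "${#file} $file"           # prepend the length of the name
    done | sort -n | head -n 1 | cut -f2- -d' '
}
```

Calling shortest_match /path/to/directory/"Foo Bar" with the two example files present would be expected to print the path ending in Foo Bar.txt, since it is the shorter of the two.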
The for loop loops over the expanded filenames. The echo prepends the length of each name to the name. We then pipe the entire output of that into sort -n and head -1 to get the shortest name, and cut -f2- -d' ' strips the length off of it (taking the second field onward, with a space as the field separator).
The key with shell programming is knowing your building blocks, and how to combine them. With clever combinations of sort, head, tail, and cut you can do a lot of pretty sophisticated processing. Throw in sed and uniq and you are already able to do some quite impressive things.
That being said, I usually only use the shell for things like this "on-the-fly" -- for anything that I might want to re-use and that is at all complex I would be much more likely to use perl.