Array intersection in Bash
How do you compare two arrays in Bash to find all intersecting values?
Let's say:
array1 contains values 1 and 2
array2 contains values 2 and 3
I should get back 2 as a result.
My own answer:
for item1 in $array1; do
    for item2 in $array2; do
        if [[ $item1 = $item2 ]]; then
            result=$result" "$item1
        fi
    done
done
I'm looking for alternate solutions as well.
Each element of list 1 is used as a regular expression and looked up in list 2 (expressed as a string: ${list2[*]}):
The result is
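As a minimal sketch of this lookup and what it yields, assuming two small Bash arrays named list1 and list2 (the sample values are illustrative): list2 is flattened to a space-padded string and each element of list1 is searched for in it. The pattern is quoted here, so Bash matches it literally rather than as a regex; that quoting is a choice of this sketch, made to avoid surprises with regex metacharacters. Elements that themselves contain spaces would not be handled correctly.

```shell
#!/usr/bin/env bash
list1=(1 2)
list2=(2 3)

# String form of list2, padded with spaces so every element is delimited on both sides.
l2=" ${list2[*]} "

result=()
for item in "${list1[@]}"; do
  # Quoted right-hand side of =~ is matched literally, not as a regex.
  if [[ $l2 =~ " $item " ]]; then
    result+=("$item")
  fi
done

echo "${result[@]}"
```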
Taking @Raihan's answer and making it work with non-files (though FDs are created).
I know it's a bit of a cheat, but it seemed like a good alternative.
A side effect is that the output array will be lexicographically sorted; I hope that's okay.
(I also don't know what type of data you have, so I only tested with numbers; additional work may be needed if you have strings with special characters, etc.)
Testing:
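A minimal sketch of this idea with a quick check, assuming two small numeric arrays named array1 and array2 (illustrative values). Process substitution feeds a sorted copy of each array to comm, so no files are written, and comm -12 prints only the lines common to both inputs. Using sort -u is a choice of this sketch; it also drops duplicates. Elements containing newlines would break it.

```shell
#!/usr/bin/env bash
array1=(1 2)
array2=(2 3)

result=()
# comm needs sorted input; -12 suppresses lines unique to either side,
# leaving only the lines present in both.
while IFS= read -r line; do
  result+=("$line")
done < <(comm -12 \
  <(printf '%s\n' "${array1[@]}" | sort -u) \
  <(printf '%s\n' "${array2[@]}" | sort -u))

# printf repeats its format for each argument, printing one value per line.
printf '%s\n' "${result[@]}"
```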
P.S. I'm sure there is a way to output the array one value per line without the for loop; I just forget it (IFS?).
Your answer won't work, for two reasons:
1. $array1 just expands to the first element of array1. (At least, in my installed version of Bash that's how it works. That doesn't seem to be a documented behavior, so it may be a version-dependent quirk.)
2. Once an element has been appended to result, result will contain a space, so the next run of result=$result" "$item1 will misbehave horribly. (Instead of appending to result, it will run the command consisting of the first two items, with the environment variable result being set to the empty string.) Correction: Turns out, I was wrong about this one: word-splitting doesn't take place inside assignments. (See comments below.)
What you want is this:
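A minimal sketch of a corrected loop, assuming the values are held in real Bash arrays (the sample values are illustrative). For what it's worth, the Bash manual describes referencing an array variable without a subscript as equivalent to referencing element 0, so the loops below expand "${array1[@]}" explicitly. The result is collected in an array rather than a space-joined string.

```shell
#!/usr/bin/env bash
array1=(1 2)
array2=(2 3)

result=()
for item1 in "${array1[@]}"; do
  for item2 in "${array2[@]}"; do
    # Quoting the right-hand side forces a literal comparison;
    # unquoted, [[ ... = ... ]] would treat it as a glob pattern.
    if [[ $item1 = "$item2" ]]; then
      result+=("$item1")
    fi
  done
done

printf '%s\n' "${result[@]}"
```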
If it was two files (instead of arrays) in which you were looking for intersecting lines, you could use the comm command.
Now that I understand what you mean by "array", I think, first of all, that you should consider using actual Bash arrays. They're much more flexible, in that (for example) array elements can contain whitespace, and you can avoid the risk that * and ? will trigger filename expansion.
But if you prefer to use your existing approach of whitespace-delimited strings, then I agree with RHT's suggestion to use Perl:
(The line breaks are just for readability; you can get rid of them if you want.)
In the above Bash command, the embedded Perl program creates a hash named %array2 containing the elements of the second array, and then it prints any elements of the first array that exist in %array2.
This will behave slightly differently from your code in how it handles duplicate values in the second array; in your code, if array1 contains x twice and array2 contains x three times, then result will contain x six times, whereas in my code, result will contain x only twice. I don't know if that matters, since I don't know your exact requirements.
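For comparison, a sketch of the same hash-lookup idea using actual Bash arrays and no Perl, assuming Bash 4 or later (for declare -A) and non-empty elements; the names seen and item are illustrative. As with the Perl version, an element that occurs twice in array1 appears twice in the result, however often it occurs in array2, and elements may contain whitespace.

```shell
#!/usr/bin/env bash
array1=(x "a b" x)
array2=(x x x "a b")

# Associative array used as a set of the values in array2 (requires Bash 4+).
declare -A seen=()
for item in "${array2[@]}"; do
  seen["$item"]=1
done

result=()
for item in "${array1[@]}"; do
  # Keep the element only if it was recorded from array2.
  if [[ -n ${seen[$item]:-} ]]; then
    result+=("$item")
  fi
done

printf '%s\n' "${result[@]}"
```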