Algorithm to match 2 lists with wildcards
I'm looking for an efficient way to match 2 lists, one which contains complete information, and one which contains wildcards. I've been able to do this with wildcards of fixed lengths, but am now trying to do it with wildcards of variable lengths.
Thus:
match( ['A', 'B', '*', 'D'], ['A', 'B', 'C', 'C', 'C', 'D'] )
would return True as long as all the elements are in the same order in both lists.
I'm working with lists of objects, but used strings above for simplicity.
Comments (6)
[edited to justify no RE after OP comment on comparing objects]
It appears you are not using strings, but rather comparing objects. I am therefore giving an explicit algorithm. Regular expressions provide a good solution tailored for strings, don't get me wrong, but from what you say in a comment on your question, an explicit, simple algorithm may make things easier for you.
It turns out that this can be solved with a much simpler algorithm than this previous answer:
The key idea is that when you encounter a wildcard, you can explore two options:
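The two options and the code that went with them are not reproduced on this page. What follows is only a minimal sketch of one plausible reading: at a wildcard, either move past the wildcard in the pattern, or stay on it and consume one more element of the full list. The function name `match`, the `WILDCARD` constant, and the choice to let a wildcard stand for zero or more elements are all assumptions made for illustration.

```python
WILDCARD = '*'  # assumption: the wildcard is the literal string '*'

def match(pattern, items):
    """Return True if `items` fits `pattern`; WILDCARD is assumed to
    stand for any run of zero or more elements."""
    if not pattern:
        # An empty pattern only matches an empty list.
        return not items
    if pattern[0] == WILDCARD:
        # Option 1: the wildcard has matched enough; advance in the pattern.
        if match(pattern[1:], items):
            return True
        # Option 2: the wildcard absorbs one more element; advance in the data.
        return bool(items) and match(pattern, items[1:])
    # Ordinary element: heads must compare equal, then advance in both lists.
    return bool(items) and pattern[0] == items[0] and match(pattern[1:], items[1:])
```

Because it only relies on `==`, this sketch works on lists of arbitrary objects, not just strings. It slices and recurses freely, so it can be slow on long lists with several wildcards; memoizing on the two positions would be the usual remedy.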
How about the following:
It uses regular expressions. Wildcards (*) are changed to .* and all other search terms are kept as-is.
One caveat is that if your search terms could contain things that have special meaning in the regex language, those would need to be properly escaped. It's pretty easy to handle this in the match function, I just wasn't sure if this was something you required.
I'd recommend converting ['A', 'B', '*', 'D'] to '^AB.*D$', ['A', 'B', 'C', 'C', 'C', 'D'] to 'ABCCCD', and then using the re module (regular expressions) to do the match.
This will be valid if the elements of your lists are only one character each, and if they're strings.
Something like:
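The snippet that followed here is not preserved. A minimal sketch of the conversion just described could look like this; the name `myMatch` is taken from the answer, but its body is an assumption, and it expects single-character string elements with no regex metacharacters other than the wildcard.

```python
import re

def myMatch(pattern, items):
    # ['A', 'B', '*', 'D'] -> '^AB.*D$'
    regex = '^' + ''.join(pattern).replace('*', '.*') + '$'
    # ['A', 'B', 'C', 'C', 'C', 'D'] -> 'ABCCCD'
    return re.match(regex, ''.join(items)) is not None
```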
If the lists contain numbers or words, choose a character that you wouldn't expect to be in either, for example #. Then ['Aa','Bs','Ce','Cc','CC','Dd'] can be converted to Aa#Bs#Ce#Cc#CC#Dd, the wildcard pattern ['Aa','Bs','*','Dd'] could be converted to ^Aa#Bs#.*#Dd$, and the match performed.
Practically speaking, this just means all the ''.join(...) become '#'.join(...) in myMatch.
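As a rough illustration of the separator variant, here is a sketch under the assumptions that every element is a string, that '#' never occurs inside an element, and that no element contains regex metacharacters. The `SEP` constant is a name introduced here, not part of the answer.

```python
import re

SEP = '#'  # assumption: this character never appears inside any element

def myMatch(pattern, items):
    # ['Aa', 'Bs', '*', 'Dd'] -> '^Aa#Bs#.*#Dd$'
    regex = '^' + SEP.join(pattern).replace('*', '.*') + '$'
    # ['Aa', 'Bs', 'Ce', 'Cc', 'CC', 'Dd'] -> 'Aa#Bs#Ce#Cc#CC#Dd'
    return re.match(regex, SEP.join(items)) is not None
```

One consequence of this construction is that the wildcard must cover at least one element: the pattern needs a separator on each side of .*, so ['Aa','Bs','Dd'] is rejected. If a wildcard should also be allowed to match nothing, the pattern would have to be built differently.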
I agree with the comment that this could be done with regular expressions. For example:
Edit: As pointed out by a comment, it might be known in advance just that some character has to be matched, but not which one. In that case, regular expressions are still useful:
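The example itself is not preserved on this page. A minimal sketch of what such a regular-expression check might look like, with the pattern and the variable names chosen here purely for illustration:

```python
import re

# '.*' plays the role of the variable-length wildcard between 'AB' and 'D'.
regex = re.compile(r'AB.*D')

# fullmatch succeeds only if the whole string fits the pattern.
matched = regex.fullmatch('ABCCCD') is not None
not_matched = regex.fullmatch('ABCCC') is not None
```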
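The code for this edit is also missing, and the sentence can be read in more than one way. One possible reading is that exactly one arbitrary element must appear at a given position, which a single . expresses. The sketch below shows only that interpretation and is an assumption, not the original snippet.

```python
import re

# '.' stands for exactly one arbitrary character (any character but a newline).
one_char = re.compile(r'AB.D')

fits = one_char.fullmatch('ABCD') is not None        # one character in the gap
too_short = one_char.fullmatch('ABD') is not None    # nothing in the gap
too_long = one_char.fullmatch('ABCCD') is not None   # two characters in the gap
```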
I agree, regular expressions are usually the way to go with this sort of thing. This algorithm works, but it just looks convoluted to me. It was fun to write though.
I had this piece of C++ code which seems to do what you are trying to do (the inputs are strings instead of arrays of characters, but you'll have to adapt things anyway).
It's not something I'm really proud of but it seems to be working so far. I hope you can find it useful.