How do I check whether an exact string exists in another string?
I'm currently running into a bit of a problem. I'm trying to write a program that highlights occurrences of a word or phrase inside another string, but only if the text it is matched against is exactly the same. The part I'm having trouble with is identifying whether the subphrase I'm matching is contained within another, larger subphrase.
A quick example which shows this problem:
>>> indicators = ["therefore", "for", "since"]
>>> phrase = "... therefore, I conclude I am awesome."
>>> indicators_in_phrase = [indicator for indicator in indicators
...                         if indicator in phrase.lower()]
>>> print(indicators_in_phrase)
['therefore', 'for']
I do not want 'for' included in that list. I know why it is being included, but I can't think of any expression that could filter out substrings like that.
I've noticed other similar questions on the site, but each involves a regex solution, which is something I'm not comfortable with yet, especially in Python. Is there a reasonably easy way to solve this problem without using a regex? If not, the corresponding regex, and how it might be applied to the example above, would be very much appreciated.
There are ways to do it without a regex, but most of those ways are so convoluted that you'll wish you had spent the time learning the simple regex sequence that you need for it.
It is one line with regex...
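The one-liner itself did not survive on this page. A minimal sketch of what such a line might look like, assuming the standard `re` module and its `\b` word-boundary anchor (not necessarily what this answer used):

```python
import re

indicators = ["therefore", "for", "since"]
phrase = "... therefore, I conclude I am awesome."

# \b only matches at a word boundary, so the "for" inside "therefore"
# is rejected. re.escape guards against indicators that happen to
# contain regex metacharacters.
indicators_in_phrase = [
    indicator for indicator in indicators
    if re.search(r"\b" + re.escape(indicator) + r"\b", phrase.lower())
]
print(indicators_in_phrase)  # ['therefore']
```

Only the `if` condition changes compared with the list comprehension in the question.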
Regex is the simplest way!
Hint:
Then you can change the word in the middle!
EDIT: I wrote this for you:
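The code from this answer was not preserved either. The hint about changing "the word in the middle" presumably refers to a pattern such as `\bfor\b`, with word-boundary anchors on both sides; that is an assumption. Under that assumption, a rough sketch of the highlighting the question asks for could look like this (`highlight` and its `left`/`right` markers are hypothetical names, not taken from the answer):

```python
import re

def highlight(phrase, indicators, left="[", right="]"):
    """Wrap each whole-word occurrence of an indicator in markers."""
    if not indicators:
        return phrase
    # Try longer indicators first so a longer one wins over a shorter
    # one that starts at the same position.
    ordered = sorted(indicators, key=len, reverse=True)
    alternatives = "|".join(re.escape(i) for i in ordered)
    pattern = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: left + m.group(0) + right, phrase)

print(highlight("... therefore, I conclude I am awesome.",
                ["therefore", "for", "since"]))
# ... [therefore], I conclude I am awesome.
```

Compiling one alternation for all indicators scans the phrase once instead of once per indicator.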
I think what you are trying to do is something more like this:
Now you'll have the words in a list like this:
Then compare the lists like so:
There are probably several ways to make this less verbose, but I prefer clarity. Also, you might need to think about removing trailing punctuation, as in "awesome." and "therefore,".
For that, use rstrip as in the other answer.
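The three code snippets this answer refers to are missing from the page. The steps it describes (split the phrase into words, trim punctuation with `rstrip`, compare the two lists) might be sketched as follows; the variable names are carried over from the question, and the rest is an assumption:

```python
import string

indicators = ["therefore", "for", "since"]
phrase = "... therefore, I conclude I am awesome."

# Split on whitespace, then trim trailing punctuation from each word.
words = [word.rstrip(string.punctuation) for word in phrase.lower().split()]
print(words)  # ['', 'therefore', 'i', 'conclude', 'i', 'am', 'awesome']

# Compare the two lists explicitly; verbose, but easy to follow.
indicators_in_phrase = []
for indicator in indicators:
    for word in words:
        if indicator == word:
            indicators_in_phrase.append(indicator)
            break  # record each indicator only once

print(indicators_in_phrase)  # ['therefore']
```

Note that `rstrip` only removes trailing characters; `strip(string.punctuation)` would also handle a leading quote or bracket.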
Code:
Cheers:)
A little lengthy but gives an idea / of course regex is there to make it simple
Is the problem with "for" that it's inside "therefore" or that it's not a word? For example, if one of your indicators was "awe", would you want it to be included in indicators_in_phrase?
How would you want the following situation to be handled?
indicators = ["abc", "cde"]
phrase = "One abcde two"
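To make the two possible readings concrete, here is a small sketch of how plain substring matching and whole-word matching each treat this case. Whole-word matching via `\b` is only one possible interpretation of "exact"; the asker may want something else.

```python
import re

indicators = ["abc", "cde"]
phrase = "One abcde two"

# Plain substring test: both indicators are found inside "abcde".
substring_hits = [i for i in indicators if i in phrase.lower()]

# Whole-word test: neither indicator stands alone as a word.
whole_word_hits = [
    i for i in indicators
    if re.search(r"\b" + re.escape(i) + r"\b", phrase.lower())
]

print(substring_hits)   # ['abc', 'cde']
print(whole_word_hits)  # []
```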
You can strip the punctuation from your phrase, then split it so that all the words are separate. Then you can do your string comparison.
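A regex-free sketch along these lines, with one addition that is not in the answer: the words are re-joined with single spaces and padded, so that a plain `in` test also works for multi-word phrases. The helper name `contains_whole` and the extra indicator `"i am"` are made up for illustration.

```python
import string

def contains_whole(phrase, indicator):
    """Return True if indicator occurs in phrase as whole word(s)."""
    # Drop every punctuation character from the lowercased phrase.
    cleaned = "".join(ch for ch in phrase.lower()
                      if ch not in string.punctuation)
    # Re-joining the split words normalises whitespace; the padding
    # spaces make the substring test require a word break on both sides.
    padded = " " + " ".join(cleaned.split()) + " "
    return " " + indicator.lower() + " " in padded

phrase = "... therefore, I conclude I am awesome."
indicators = ["therefore", "for", "since", "i am"]
print([i for i in indicators if contains_whole(phrase, i)])
# ['therefore', 'i am']
```

Removing all punctuation also removes apostrophes and hyphens inside words ("don't" becomes "dont"), so this is only suitable when that loss is acceptable.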