How to check if a string contains an element from a list in Python

Posted on 2024-11-18 08:20:21

I have something like this:

extensionsToCheck = ['.pdf', '.doc', '.xls']

for extension in extensionsToCheck:
    if extension in url_string:
        print(url_string)

I am wondering what would be the more elegant way to do this in Python (without using the for loop)? I was thinking of something like this (like from C/C++), but it didn't work:

if ('.pdf' or '.doc' or '.xls') in url_string:
    print(url_string)
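
One way to see why this second attempt fails: `or` evaluates to its first truthy operand, so the parenthesised expression collapses to '.pdf' before `in` is ever applied. A minimal sketch, assuming a made-up sample URL on example.com:

```python
# `or` returns its first truthy operand, so only '.pdf' is ever tested.
candidate = ('.pdf' or '.doc' or '.xls')
print(candidate)  # .pdf

url_string = 'http://example.com/report.doc'  # hypothetical sample URL
print(candidate in url_string)  # False, although '.doc' is present

# The explicit equivalent of the intended check:
found = '.pdf' in url_string or '.doc' in url_string or '.xls' in url_string
print(found)  # True
```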

Edit: I'm kinda forced to explain how this is different to the question below which is marked as potential duplicate (so it doesn't get closed I guess).

The difference is, I wanted to check if a string is part of some list of strings whereas the other question is checking whether a string from a list of strings is a substring of another string. Similar, but not quite the same and semantics matter when you're looking for an answer online IMHO. These two questions are actually looking to solve the opposite problem of one another. The solution for both turns out to be the same though.

Comments (8)

无妨 2024-11-25 08:20:21

Use a generator together with any, which short-circuits on the first True:

if any(ext in url_string for ext in extensionsToCheck):
    print(url_string)
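
The same check can be wrapped in a small reusable function. This is only a sketch: the helper name `contains_any` and the sample URLs are assumptions for illustration, not part of the original answer.

```python
def contains_any(text, substrings):
    """Return True if at least one of the substrings occurs in text."""
    # any() stops consuming the generator at the first True result.
    return any(sub in text for sub in substrings)


extensionsToCheck = ['.pdf', '.doc', '.xls']
print(contains_any('http://example.com/report.doc', extensionsToCheck))  # True
print(contains_any('http://example.com/index.html', extensionsToCheck))  # False
```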

EDIT: I see this answer has been accepted by the OP. Although my solution may be a "good enough" solution to their particular problem, and is a good general way to check whether any string in a list occurs in another string, keep in mind that this is all it does. It does not care WHERE the match occurs, e.g. at the end of the string. If that matters, as is often the case with URLs, you should look at the answer by @Wladimir Palant, or you risk getting false positives.
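
To make the false-positive risk concrete, here is a short sketch with two made-up URLs (both hypothetical) that do not point at a .pdf, .doc or .xls file but still pass the substring test:

```python
extensionsToCheck = ['.pdf', '.doc', '.xls']

# Hypothetical URLs: neither one ends in a listed extension.
urls = [
    'http://example.com/my.docs/readme.txt',      # '.doc' inside a directory name
    'http://example.com/get.php?name=a.pdf.bak',  # '.pdf' inside the query string
]

# The substring test accepts both, because it ignores where the match occurs.
false_positives = [u for u in urls if any(ext in u for ext in extensionsToCheck)]
print(len(false_positives))  # 2
```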

望笑 2024-11-25 08:20:21

# str.endswith accepts a tuple of suffixes and returns True if any of them matches
extensionsToCheck = ('.pdf', '.doc', '.xls')

'test.doc'.endswith(extensionsToCheck)   # returns True

'test.jpg'.endswith(extensionsToCheck)   # returns False
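
Note that `str.endswith` takes a string or a tuple of strings; passing a list raises a TypeError. The sketch below converts a list first. The helper name `has_extension` is an assumption, and the `lower()` call assumes the listed extensions are themselves lower-case.

```python
extensions_list = ['.pdf', '.doc', '.xls']


def has_extension(name, extensions):
    # endswith needs a str or a tuple of str, so convert the list;
    # lower() makes the comparison case-insensitive for lower-case extensions.
    return name.lower().endswith(tuple(extensions))


print(has_extension('REPORT.DOC', extensions_list))  # True
print(has_extension('photo.jpg', extensions_list))   # False
```
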
朕就是辣么酷 2024-11-25 08:20:21

It is better to parse the URL properly - this way you can handle http://.../file.doc?foo and http://.../foo.doc/file.exe correctly.

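import os
from urllib.parse import urlparse  # Python 3; in Python 2 this was `from urlparse import urlparse`

path = urlparse(url_string).path   # drops the query string and fragment
ext = os.path.splitext(path)[1]    # extension of the last path component
if ext in extensionsToCheck:
    print(url_string)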
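
As a self-contained sketch of the parsing approach, the function below is run against the two URL shapes mentioned in this answer. The host example.com stands in for the elided part of those URLs, and the name `url_has_extension` is hypothetical.

```python
import os
from urllib.parse import urlparse


def url_has_extension(url, extensions):
    """Return True if the path component of url ends in one of the extensions."""
    path = urlparse(url).path          # '/file.doc' for '.../file.doc?foo'
    ext = os.path.splitext(path)[1]    # '.doc'
    return ext in extensions


extensionsToCheck = ['.pdf', '.doc', '.xls']
print(url_has_extension('http://example.com/file.doc?foo', extensionsToCheck))      # True
print(url_has_extension('http://example.com/foo.doc/file.exe', extensionsToCheck))  # False
```
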
浅浅淡淡 2024-11-25 08:20:21

Use a list comprehension if you want a single-line solution. The following code returns a list containing url_string once for each of the extensions .doc, .pdf and .xls that it contains, or an empty list when it contains none of them.

print([url_string for extension in extensionsToCheck if extension in url_string])

NOTE: This only checks whether url_string contains an extension; it is not useful when you want to extract the exact word matching the extension.
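
One detail worth seeing in action: the comprehension emits url_string once per matching extension, so a string containing two of the extensions appears twice. A small sketch with made-up file names:

```python
extensionsToCheck = ['.pdf', '.doc', '.xls']


def matches(url_string):
    # One copy of url_string per extension found in it.
    return [url_string for extension in extensionsToCheck if extension in url_string]


print(matches('report.doc'))      # ['report.doc']
print(matches('report.pdf.doc'))  # ['report.pdf.doc', 'report.pdf.doc']
print(matches('report.txt'))      # []
```

Because an empty list is falsy, the result can still be used directly in an `if` statement.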

城歌 2024-11-25 08:20:21

In case anyone faces this task again, here is another solution:

extensionsToCheck = ['.pdf', '.doc', '.xls']
url_string = 'file.doc'
res = [ele for ele in extensionsToCheck if ele in url_string]  # extensions found in url_string
print(bool(res))  # an empty list is falsy, a non-empty one is truthy

# Output: True
云仙小弟 2024-11-25 08:20:21

This is a variant of the list comprehension answer given by @psun.

By switching the output value, you can actually extract the matching pattern from the list comprehension (something not possible with the any() approach by @Lauritz-v-Thaulow)

extensionsToCheck = ['.pdf', '.doc', '.xls']
url_string = 'http://.../foo.doc'

print([extension for extension in extensionsToCheck if(extension in url_string)])

['.doc']

You can furthermore insert a regular expression if you want to collect additional information once the matched pattern is known (this could be useful when the list of allowed patterns is too long to write into a single regex pattern):

import re

# re.escape makes the dot in each extension match a literal '.'
print([re.search(r'(\w+)' + re.escape(extension), url_string).group(0) for extension in extensionsToCheck if extension in url_string])

['foo.doc']
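
If only the first matching extension is needed, a generator passed to `next()` with a default keeps the short-circuit behaviour of `any()` while still returning the match. This is a sketch of an alternative, not something the answers above propose; the variable names `first_match` and `no_match` are assumptions.

```python
extensionsToCheck = ['.pdf', '.doc', '.xls']
url_string = 'http://.../foo.doc'

# next() stops at the first extension found; None is returned when nothing matches.
first_match = next((ext for ext in extensionsToCheck if ext in url_string), None)
print(first_match)  # .doc

no_match = next((ext for ext in extensionsToCheck if ext in 'photo.jpg'), None)
print(no_match)  # None
```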

倾其所爱 2024-11-25 08:20:21

Check if it matches this regex:

'(\.pdf$|\.doc$|\.xls$)'

Note: if your extensions are not at the end of the URL, remove the $ characters, but doing so weakens the check slightly.
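
A pattern of this kind can also be built from the list instead of being written by hand. The sketch below assumes the extensions should match only at the very end of the string and uses `re.escape` so each dot is treated literally; the sample URLs are made up.

```python
import re

extensionsToCheck = ['.pdf', '.doc', '.xls']

# Builds the equivalent of (?:\.pdf|\.doc|\.xls)$ from the list.
pattern = re.compile('(?:' + '|'.join(re.escape(ext) for ext in extensionsToCheck) + ')$')

print(bool(pattern.search('http://example.com/file.doc')))      # True
print(bool(pattern.search('http://example.com/file.doc?foo')))  # False, the query string follows
print(bool(pattern.search('http://example.com/filexdoc')))      # False, the dot is literal
```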

薄凉少年不暖心 2024-11-25 08:20:21

This is the easiest way I could imagine :)

list_ = ('.doc', '.txt', '.pdf')
string = 'file.txt'
func = lambda list_, string: any(filter(lambda x: x in string, list_))
func(list_, string)

# Output: True

Also, if someone needs to save elements that are in a string, they can use something like this:

list_ = ('.doc', '.txt', '.pdf')
string = 'file.txt'
func = lambda list_, string: tuple(filter(lambda x: x in string, list_))
func(list_, string)

# Output: ('.txt',)