Logic for searching strings in Python

Posted on 2024-10-19 13:19:57

filtered=[]
text="any.pdf"
if "doc" and "pdf" and "xls" and "jpg" not in text:
    filtered.append(text)
print(filtered)

This is my first post on Stack Overflow, so please excuse anything annoying in the question. The code is supposed to append text to filtered if text doesn't contain any of these words: doc, pdf, xls, jpg.
It works fine if it's written like this:

if "doc" in text:
    pass  # text contains a blocked word, so skip it
elif "jpg" in text:
    pass
elif "pdf" in text:
    pass
elif "xls" in text:
    pass
else:
    filtered.append(text)

Comments (5)

落叶缤纷 2024-10-26 13:19:57

If you open up a Python interpreter, you'll find that the expression "doc" and "pdf" and "xls" and "jpg" evaluates to 'jpg':

>>> "doc" and "pdf" and "xls" and "jpg"
'jpg'

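This is because the and operator returns its last operand when every operand is truthy, and not in binds more tightly than and. So rather than testing against all the strings, your first attempt only tests whether 'jpg' is absent from text.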
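
As a rough illustration of that evaluation order, the sketch below compares the original condition with a jpg-only test and with the fully spelled-out check. The variable names (chained, original, only_jpg, intended) are made up for this example.

```python
text = "any.pdf"

# `and` returns the first falsy operand, or the last operand if all are truthy.
chained = "doc" and "pdf" and "xls" and "jpg"
print(chained)  # jpg

# `not in` binds tighter than `and`, so the original condition is parsed as
# "doc" and "pdf" and "xls" and ("jpg" not in text).
original = "doc" and "pdf" and "xls" and "jpg" not in text
only_jpg = "jpg" not in text
print(original, only_jpg)  # True True

# What was presumably intended: every substring must be absent.
intended = ("doc" not in text and "pdf" not in text
            and "xls" not in text and "jpg" not in text)
print(intended)  # False
```

With text set to "any.pdf", the original condition is true even though "pdf" is present, which is why the file name gets appended.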

There are a number of ways to do what you want. The one below isn't the most obvious, but it's useful:

if not any(test_string in text for test_string in ["doc", "pdf", "xls", "jpg"]):
    filtered.append(text)

Another approach would be to use a for loop with an else clause, which runs only if the loop finishes without hitting break:

for test_string in ["doc", "pdf", "xls", "jpg"]:
    if test_string in text:
        break
else: 
    filtered.append(text)

Finally, you could use a pure list comprehension:

tofilter = ["one.pdf", "two.txt", "three.jpg", "four.png"]
test_strings = ["doc", "pdf", "xls", "jpg"]
filtered = [s for s in tofilter if not any(t in s for t in test_strings)]

EDIT:

If you want to filter both words and extensions, I would recommend the following:

text_list = generate_text_list() # or whatever you do to get a text sequence
extensions = ['.doc', '.pdf', '.xls', '.jpg']
words = ['some', 'words', 'to', 'filter']
text_list = [text for text in text_list if not text.endswith(tuple(extensions))]
text_list = [text for text in text_list if not any(word in text for word in words)]

This could still lead to some false matches; the word filter above also removes "Do something", "He's a wordsmith", etc., because it matches substrings rather than whole words. If that's a problem, you may need a more complex solution.
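
One possible shape for such a solution, offered only as a sketch: compare the real extension via os.path.splitext and match the words with a regular expression anchored at word boundaries. The helper name keep and the sample strings are assumptions made for this example, not part of the original answer.

```python
import os.path
import re

extensions = {".doc", ".pdf", ".xls", ".jpg"}
words = ["some", "words", "to", "filter"]

# \b anchors each word at word boundaries, so "something" does not match "some".
word_pattern = re.compile(
    r"\b(?:" + "|".join(map(re.escape, words)) + r")\b",
    re.IGNORECASE,
)

def keep(text):
    """Return True if text has no blocked extension and no blocked whole word."""
    ext = os.path.splitext(text)[1].lower()
    return ext not in extensions and word_pattern.search(text) is None

samples = ["Do something", "He's a wordsmith", "some notes.txt", "report.PDF"]
kept = [s for s in samples if keep(s)]
print(kept)  # ['Do something', "He's a wordsmith"]
```

Note that \b treats underscores as word characters, so a name like some_file.txt would not count as containing the word "some" under this sketch.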

新人笑 2024-10-26 13:19:57

If those extensions are always at the end, you can use str.endswith, which accepts a tuple of suffixes:

if not text.endswith(("doc", "pdf", "xls", "jpg")):
    filtered.append(text)
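
A small sketch of how the tuple form behaves, with made-up sample names. It also shows one caveat: without the leading dot, a name that merely ends in those letters is rejected too.

```python
suffixes = ("doc", "pdf", "xls", "jpg")
dotted = tuple("." + s for s in suffixes)

names = ["report.pdf", "whatsupdoc", "notes.txt"]

# Without the dot, any name that merely ends in those letters is rejected.
kept_plain = [n for n in names if not n.endswith(suffixes)]
# With the dot, only names carrying one of the extensions are rejected.
kept_dotted = [n for n in names if not n.endswith(dotted)]

print(kept_plain)   # ['notes.txt']
print(kept_dotted)  # ['whatsupdoc', 'notes.txt']
```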

心是晴朗的。 2024-10-26 13:19:57

import os.path

basename, ext = os.path.splitext(some_filename)
if ext not in ('.pdf', '.png'):
    filtered.append(some_filename)
# ....

糖粟与秋泊 2024-10-26 13:19:57

Try the following:

if all(substring not in text for substring in ['doc', 'pdf', 'xls', 'jpg']):
    filtered.append(text)
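
This all(...) form is logically the same test as the not any(...) form from the first answer. A quick sketch with made-up sample names checks that the two agree:

```python
test_strings = ["doc", "pdf", "xls", "jpg"]
samples = ["any.pdf", "notes.txt", "photo.jpg", "readme"]

# "Every substring is absent" versus "no substring is present".
kept_all = [t for t in samples if all(s not in t for s in test_strings)]
kept_any = [t for t in samples if not any(s in t for s in test_strings)]

print(kept_all == kept_any)  # True
print(kept_all)              # ['notes.txt', 'readme']
```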

微暖i 2024-10-26 13:19:57

The currently selected answer is very good at explaining the syntactically correct ways to do what you want. However, it's obvious that you are dealing with file extensions, which appear at the end of the name [substring tests fail on: doctor_no.py, whatsupdoc], and it's probable that you are using Windows, where file paths are case-insensitive [fail: FUBAR.DOC].

To cover those bases:

# setup
import os.path
interesting_extensions = set("." + x for x in "doc pdf xls jpg".split())

# each time around
basename, ext = os.path.splitext(text)
if ext.lower() not in interesting_extensions:
    filtered.append(text)
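
To see how this handles the failure cases mentioned above, here is a minimal sketch that wraps the same logic in a function. The name filter_names and the sample list are assumptions for illustration only.

```python
import os.path

INTERESTING_EXTENSIONS = {".doc", ".pdf", ".xls", ".jpg"}

def filter_names(names):
    """Keep names whose extension, compared case-insensitively, is not blocked."""
    kept = []
    for name in names:
        ext = os.path.splitext(name)[1]
        if ext.lower() not in INTERESTING_EXTENSIONS:
            kept.append(name)
    return kept

result = filter_names(["doctor_no.py", "whatsupdoc", "FUBAR.DOC", "any.pdf"])
print(result)  # ['doctor_no.py', 'whatsupdoc']
```

A plain substring test would have dropped doctor_no.py and whatsupdoc and kept FUBAR.DOC; comparing the lowercased extension avoids both problems.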