How can I optimize finding strings that match the characters of a substring in any order?

Posted on 2025-01-30 03:33:55 | 1067 characters | 5 views | 0 comments

Assuming a list as follows:

list_of_strings = ['foo', 'bar', 'soap', 'sseo', 'spaseo', 'oess']

and a substring:

to_find = 'seos'

I would like to find the string(s) in the list_of_strings that:

  1. Have the same length as to_find
  2. Have the same characters as to_find (irrespective of the order of the characters)

The output from list_of_strings should be ['sseo', 'oess'] (since each of them has all the letters from to_find and a length of 4).

I have:

import itertools
list_of_strings = [string for string in list_of_strings if len(string) == len(to_find)]
result = [string for string in list_of_strings if any("".join(perm) in string for perm in itertools.permutations(to_find))]

To measure how long the code takes to run, I did:

import timeit
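# The setup string must also import itertools, because timeit runs the statement in its own namespace
timeit.timeit("[string for string in list_of_strings if any(''.join(perm) in string for perm in itertools.permutations(to_find))]",
              setup='import itertools; from __main__ import list_of_strings, to_find', number=100000)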

The process takes a while to give the output. I am guessing it is because of the use of itertools.permutations.

Is there a way I can make this code more efficient?

Thanks

Comments (2)

唱一曲作罢 2025-02-06 03:33:55

If order doesn't matter, you can just sort the strings and compare the resulting lists:

list_of_strings = ['foo', 'bar', 'soap', 'sseo', 'spaseo', 'oess']
to_find = sorted('seos')
matches = [word for word in list_of_strings if sorted(word) == to_find]
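
The sorting idea above can be wrapped in a small helper. This is only a sketch: the name find_anagrams is made up for illustration, and the length check is an optional shortcut (not part of the answer) that skips sorting words that cannot match.

```python
def find_anagrams(words, target):
    """Return the words made of exactly the same characters as target."""
    # find_anagrams is a hypothetical helper name, not part of any library.
    target_key = sorted(target)  # sort the target once, outside the loop
    target_len = len(target)
    return [
        word
        for word in words
        # Cheap length check first, so words of the wrong length are never sorted.
        if len(word) == target_len and sorted(word) == target_key
    ]


list_of_strings = ['foo', 'bar', 'soap', 'sseo', 'spaseo', 'oess']
print(find_anagrams(list_of_strings, 'seos'))  # ['sseo', 'oess']
```

Sorting each word costs on the order of k log k for a word of length k, which is far less work than generating every permutation of to_find.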

月寒剑心 2025-02-06 03:33:55

This should work because Counter creates a dict-like object that counts how many times each character occurs in a string, and the aim is to match the letters and their counts irrespective of their order.

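from collections import Counter

list_of_strings = ['foo', 'bar', 'soap', 'sseo', 'spaseo', 'oess']
to_find = 'seos'

# Count the characters of to_find once, outside the loop
to_find_counter = Counter(to_find)
# Go through the list and keep the strings whose Counter equals the Counter of to_find
matches = [x for x in list_of_strings if Counter(x) == to_find_counter]
print(matches)  # ['sseo', 'oess']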
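
To check the asker's guess about itertools.permutations, the three approaches from this thread can be timed side by side. The following is a rough sketch rather than a rigorous benchmark: the three function names are invented for illustration, and the timings depend on the machine, so none are quoted here. For a 4-character to_find, permutations yields 4! = 24 candidate orderings per word, and that number grows factorially with the length, whereas sorting or counting does work that grows only slightly faster than the word length.

```python
import itertools
import timeit
from collections import Counter

list_of_strings = ['foo', 'bar', 'soap', 'sseo', 'spaseo', 'oess']
to_find = 'seos'


# The three function names below are hypothetical, chosen for this sketch.
def with_permutations(words, target):
    # Approach from the question: try every ordering of target for each word.
    same_len = [w for w in words if len(w) == len(target)]
    return [w for w in same_len
            if any(''.join(p) in w for p in itertools.permutations(target))]


def with_sorted(words, target):
    # First answer: compare sorted character lists.
    key = sorted(target)
    return [w for w in words if sorted(w) == key]


def with_counter(words, target):
    # Second answer: compare per-character counts.
    key = Counter(target)
    return [w for w in words if Counter(w) == key]


for func in (with_permutations, with_sorted, with_counter):
    # Passing a callable to timeit avoids having to list imports in a setup string.
    seconds = timeit.timeit(lambda: func(list_of_strings, to_find), number=10000)
    print(f'{func.__name__}: {seconds:.3f} s, result {func(list_of_strings, to_find)}')
```

All three functions return the same list for this input, so any difference in the printed times comes from the matching strategy alone.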