How to use lambda for an inner loop?

Posted on 2025-02-02 11:42:20, 626 characters, 1 view, 0 comments

I have list_a and string_tmp like this

list_a = ['AA', 'BB', 'CC']
string_tmp = 'Hi AA How Are You'

I want to find out whether any word in string_tmp appears in list_a. If so, type = L1; otherwise, type = L2.

# for example
type = ''
for k in string_tmp.split():
    if k in list_a:
        type = 'L1'
if len(type) == 0:
    type = 'L2'

That is the actual problem, but in my project len(list_a) = 200,000 and len(string_tmp) = 10,000, so I need this to be very fast.

# this is the output of the example 
type = 'L1'

Comments (3)

无敌元气妹 2025-02-09 11:42:20

Converting the reference list and string tokens to sets should enhance performance. Something like this:

list_a = ['AA', 'BB', 'CC']
string_tmp = 'Hi AA How Are You'

def get_type(s, r): # s is the string, r is the reference list
    s = set(s.split())
    r = set(r)
    return 'L1' if any(map(lambda x: x in r, s)) else 'L2'

print(get_type(string_tmp, list_a))

Output:

L1
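
A slightly leaner variant of the same idea, offered as a sketch rather than as part of the original answer: if list_a is reused across many strings, the set can be built once, and set.isdisjoint can replace the lambda. The names reference and get_type_fast are illustrative assumptions.

```python
list_a = ['AA', 'BB', 'CC']

# Build the lookup set once; rebuilding it on every call costs O(len(list_a)).
reference = frozenset(list_a)

def get_type_fast(s, ref):
    # isdisjoint accepts any iterable and returns True only when
    # none of the tokens is a member of ref.
    return 'L2' if ref.isdisjoint(s.split()) else 'L1'

print(get_type_fast('Hi AA How Are You', reference))  # L1
print(get_type_fast('Hi DD How Are You', reference))  # L2
```

Each membership test against a set is a hash lookup, so the cost per string should depend mainly on the number of tokens in the string, not on the size of list_a.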
梦巷 2025-02-09 11:42:20

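Using a regex together with a list comprehension, we can try the following. Note that it yields one label per item in list_a rather than a single type:

import re

list_a = ['AA', 'BB', 'CC']
string_tmp = 'Hi AA How Are You'
# re.escape guards against keys that contain regex metacharacters
output = ['L1' if re.search(r'\b' + re.escape(x) + r'\b', string_tmp) else 'L2' for x in list_a]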
print(output)  # ['L1', 'L2', 'L2']
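
To get the single L1/L2 label the question asks for, the per-item results can be collapsed with any(). This is a minimal sketch; the function name single_type is an assumption, not part of the answer above.

```python
import re

list_a = ['AA', 'BB', 'CC']
string_tmp = 'Hi AA How Are You'

def single_type(s, keys):
    # any() stops at the first key that is found as a whole word in s.
    found = any(re.search(r'\b' + re.escape(k) + r'\b', s) for k in keys)
    return 'L1' if found else 'L2'

print(single_type(string_tmp, list_a))           # L1
print(single_type('Hi DD How Are You', list_a))  # L2
```

This still scans the string once per key, so with 200,000 keys it is likely to be much slower than a set lookup.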
别想她 2025-02-09 11:42:20

Efficiency depends on which of the two inputs changes least. For instance, if list_a stays the same while you test many different strings, it may be worth turning that list into a regular expression once and reusing it for each string.

Here is a solution where you create an instance of a class for a given list. Then use this instance repeatedly for different strings:

import re

class Matcher:
    def __init__(self, lst):
        self.regex = re.compile(r"\b(" + "|".join(re.escape(key) for key in lst) + r")\b")

    def typeof(self, s):
        return "L1" if self.regex.search(s) else "L2"

# demo

list_a = ['AA', 'BB', 'CC']

matcher = Matcher(list_a)

string_tmp = 'Hi AA How Are You'
print(matcher.typeof(string_tmp))  # L1

string_tmp = 'Hi DD How Are You'
print(matcher.typeof(string_tmp))  # L2

A side effect of this regular expression is that it also matches words when they have punctuation near them. For instance, the above would still return "L1" when the string is 'Hi AA, How Are You' (with the additional comma).
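
The difference can be seen by comparing the regex with a plain split() lookup on the comma example. This is an illustrative sketch; the variable names keys, split_hit and regex_hit are assumptions.

```python
import re

list_a = ['AA', 'BB', 'CC']
keys = set(list_a)
regex = re.compile(r"\b(" + "|".join(re.escape(k) for k in list_a) + r")\b")

s = 'Hi AA, How Are You'

# split() keeps the comma attached, so the token is 'AA,' and is not in keys.
split_hit = any(tok in keys for tok in s.split())

# \b matches between 'A' and ',', so the regex still finds 'AA'.
regex_hit = regex.search(s) is not None

print(split_hit)  # False
print(regex_hit)  # True
```

Which behaviour is preferable depends on whether punctuation next to a word should count as part of that word in your data.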
