Reducing string matching time in a list of strings
I have a list of strings with around 100k entries, and it may grow in the future. For every input, I have to search this list for an exact match.

    usr_input = "find_word"
    check_list = ["first_word", "second_word"]  # around 100k entries

    # What I am doing right now
    if usr_input in check_list:
        print("Found word in list")

This works fine for a smaller dataset. But as the size grew to 100k entries, it started taking a toll on my application: response time sometimes rises to ~1 min when we have a lot of entries to process.
Is there any way to optimize this operation?
Comments (1)
Is using a `set` instead of a `list` an option, i.e. does it matter whether strings appear only once or multiple times? Since a `set` uses hashing, a membership test takes roughly constant time on average, whereas `in` on a `list` scans the entries one by one, so the lookup is a lot more efficient.
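A minimal sketch of that suggestion, assuming the list can be converted once and the resulting set reused for every incoming input. The generated `word_N` entries and the `is_known` helper are hypothetical stand-ins, not part of the original question.

```python
# Hypothetical data standing in for the ~100k-entry list from the question.
check_list = [f"word_{i}" for i in range(100_000)]

# Build the set once (a single pass over the list), then reuse it for
# every lookup instead of rebuilding it per input.
check_set = set(check_list)


def is_known(usr_input: str) -> bool:
    """Exact-match lookup; hashing makes this constant time on average."""
    return usr_input in check_set


if is_known("word_99999"):
    print("Found word in list")
```

If entries are added later, `check_set.add(new_word)` keeps the set up to date. Note that a set collapses duplicates, so if the number of occurrences matters, something like `collections.Counter` may be a better fit.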