Setting a threshold for fuzzywuzzy process.extractOne

Posted on 2025-01-09 20:25:25

I'm currently doing some string product similarity matches between two different retailers and I'm using the fuzzywuzzy process.extractOne function to find the best match.

However, I want to be able to set a scoring threshold so that the product will only match if the score is above a certain threshold, because currently it is just matching every single product based on the closest string.

The following code gives me the best match: (currently getting errors)

title, index, score = process.extractOne(text, choices_dict)

I then tried the following code to try to set a threshold:

title, index, score = process.extractOne(text, choices_dict, score_cutoff=80)

Which results in the following TypeError:

TypeError: cannot unpack non-iterable NoneType object

Finally, I also tried the following code:

title, index, scorer, score = process.extractOne(text, choices_dict, scorer=fuzz.token_sort_ratio, score_cutoff=80)

Which results in the following error:

ValueError: not enough values to unpack (expected 4, got 3)

Comments (1)

弄潮 2025-01-16 20:25:25

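process.extractOne returns None when the best score is below score_cutoff, which is why unpacking the result raises the TypeError. For dict-like choices a successful result is a 3-tuple of (value, score, key); passing scorer does not add a fourth element, which is why unpacking into four names fails. So you either have to check for None or catch the exception: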

best_match = process.extractOne(text, choices_dict, score_cutoff=80)
if best_match:
    value, score, key = best_match
    print(f"best match is {key}:{value} with the similarity {score}")
else:
    print("no match found")

or

try:
    value, score, key = process.extractOne(text, choices_dict, score_cutoff=80)
    print(f"best match is {key}:{value} with the similarity {score}")
except TypeError:
    print("no match found")
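
If you need to apply the same check to a whole list of products, the None handling can be wrapped in a small helper. The following is only a sketch: extract_one is a hypothetical stand-in built on the standard library's difflib so that the example runs without fuzzywuzzy installed. It mimics the (value, score, key) return shape described above but does not reproduce fuzzywuzzy's scoring. match_products and the sample data are likewise made up for illustration.

```python
from difflib import SequenceMatcher


def extract_one(query, choices, score_cutoff=0):
    """Hypothetical stand-in for process.extractOne with dict choices.

    Returns (value, score, key) for the best-scoring choice, or None when
    no choice reaches score_cutoff. Scoring uses difflib, NOT fuzzywuzzy.
    """
    best = None
    for key, value in choices.items():
        ratio = SequenceMatcher(None, query.lower(), value.lower()).ratio()
        score = round(100 * ratio)
        if score >= score_cutoff and (best is None or score > best[1]):
            best = (value, score, key)
    return best


def match_products(queries, choices, score_cutoff=80, extractor=extract_one):
    """Map each query to (key, value, score), or to None if below the cutoff."""
    matches = {}
    for query in queries:
        result = extractor(query, choices, score_cutoff=score_cutoff)
        if result is None:
            # Nothing scored high enough, so record the miss instead of unpacking.
            matches[query] = None
            continue
        value, score, key = result
        matches[query] = (key, value, score)
    return matches


# Made-up sample data for illustration only.
choices_dict = {101: "Apple iPhone 13 128GB", 102: "Samsung Galaxy S21"}
matches = match_products(["apple iphone 13 128gb", "qqq"], choices_dict)
for query, match in matches.items():
    print(query, "->", match)
```

With fuzzywuzzy installed, passing extractor=process.extractOne should work the same way, since it accepts score_cutoff as a keyword argument and returns either None or the 3-tuple for dict choices.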