Android: fuzzy matching, n-grams and Levenshtein distance

Posted on 2024-10-19 09:28:27

I am building an Android app that takes a string input and returns a ranked list of books using the Google API.

I am looking for a way to compare the open-ended string the user enters with the first item in the list, to see whether what they entered is 'likely' to refer to that one book. I have plenty of information about the book (title, author, description, etc.), so I can search in any of those fields.

An example is:

'eyre affair fforde', 'fforde eyre affair', 'the eyre affair'
----> 
'Likely' to be 'The Eyre Affair by Jasper Fforde'

What would be the best way to go about this? I have looked at Levenshtein distance, but I don't think it would work with such open-ended input, since the words can arrive in any order and some may be missing. N-grams seem like a good way to go, or some other form of fuzzy matching.
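
To make the n-gram idea concrete, here is a minimal sketch of what I have in mind. It is an assumption-laden illustration, not a tested solution. It splits both strings into words, takes the character trigrams of each word, and reports the fraction of the query's trigrams that also occur in the candidate text (a "containment" score). The class name `NGramMatch`, the method names, and the choice of trigrams are all hypothetical, and the normalization only handles ASCII letters and digits.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of any Google API.
class NGramMatch {

    // Collects the character trigrams of every word in the text.
    // Each word is padded with one space on both sides, so word
    // order in the input does not affect the resulting set.
    static Set<String> trigrams(String text) {
        Set<String> grams = new HashSet<>();
        // Assumption: ASCII-only normalization; other characters become separators.
        String normalized = text.toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z0-9]+", " ")
                .trim();
        if (normalized.isEmpty()) {
            return grams;
        }
        for (String word : normalized.split(" ")) {
            String padded = " " + word + " ";
            for (int i = 0; i + 3 <= padded.length(); i++) {
                grams.add(padded.substring(i, i + 3));
            }
        }
        return grams;
    }

    // Fraction of the query's trigrams that also appear in the candidate.
    // Returns a value between 0.0 (nothing shared) and 1.0 (all shared).
    static double containment(String query, String candidate) {
        Set<String> queryGrams = trigrams(query);
        if (queryGrams.isEmpty()) {
            return 0.0;
        }
        Set<String> candidateGrams = trigrams(candidate);
        int shared = 0;
        for (String gram : queryGrams) {
            if (candidateGrams.contains(gram)) {
                shared++;
            }
        }
        return (double) shared / queryGrams.size();
    }

    public static void main(String[] args) {
        // Candidate text built from the title and author of the first result.
        String book = "The Eyre Affair by Jasper Fforde";
        String[] queries = {
            "eyre affair fforde", "fforde eyre affair", "the eyre affair", "war and peace"
        };
        for (String query : queries) {
            System.out.println(query + " -> " + containment(query, book));
        }
    }
}
```

The score would then be compared against a cut-off to decide whether the input is 'likely' to be that book. The cut-off itself would need tuning on real queries. Containment is used here rather than a symmetric measure because the user's input is usually a subset of the title and author text.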

Any other ideas?
