MySQL function to check the similarity percentage between two texts

Posted 2024-12-09 15:27:41

I need MySQL code that computes the similarity percentage between a text submitted via a form and a number of texts stored in a MySQL database.

I am looking for a MySQL stored procedure that works like PHP's similar_text() function. A MySQL Levenshtein distance procedure already exists, but it is not sufficient.

When the user submits the text, the algorithm should return every entry in the database whose similarity to the submitted text exceeds a given percentage (it will compare only one column in the database), e.g. return all entries that have a similarity > 40% with the text submitted by the user.

For example, given the table:

TABLE - Articles
id, article_body, article_title

The code should return all rows whose article_body has a similarity percentage > 40% (or another given value) with the text the user has submitted.


病毒体 2024-12-16 15:27:41

I'd do it in the application.

Maybe the result of the SOUNDEX function will help you:

SELECT SOUNDEX('Hello'), SOUNDEX('Hello world'), SOUNDEX('hellboy');
+------------------+------------------------+--------------------+
| SOUNDEX('Hello') | SOUNDEX('Hello world') | SOUNDEX('hellboy') |
+------------------+------------------------+--------------------+
| H400             | H4643                  | H410               |
+------------------+------------------------+--------------------+
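
If the comparison is done in the application, as suggested above, something like the following might be a starting point. It is only a rough Python sketch, assuming the application can fetch (id, article_body) pairs from the Articles table; the names similar_chars, similarity_percent and filter_similar are made up for illustration. The matching logic is an approximate port of the idea behind PHP's similar_text() (longest common substring, then recurse on the left and right remainders), not a verified reimplementation.

```python
from typing import Iterable, List, Tuple


def similar_chars(a: str, b: str) -> int:
    """Count matching characters: take the longest common substring,
    then recurse on what is left of it and what is right of it."""
    if not a or not b:
        return 0
    best_len = best_a = best_b = 0
    for i in range(len(a)):
        for j in range(len(b)):
            k = 0
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            if k > best_len:  # keep the first longest match found
                best_len, best_a, best_b = k, i, j
    if best_len == 0:
        return 0
    return (
        best_len
        + similar_chars(a[:best_a], b[:best_b])
        + similar_chars(a[best_a + best_len:], b[best_b + best_len:])
    )


def similarity_percent(a: str, b: str) -> float:
    """Similarity as a percentage: 2 * matches / (len(a) + len(b)) * 100."""
    total = len(a) + len(b)
    if total == 0:
        return 0.0
    return similar_chars(a, b) * 2 * 100 / total


def filter_similar(
    rows: Iterable[Tuple[int, str]], submitted: str, threshold: float = 40.0
) -> List[Tuple[int, float]]:
    """Return (id, percent) for rows whose text scores above the threshold.
    `rows` is assumed to be (id, article_body) pairs already read from MySQL."""
    result = []
    for row_id, body in rows:
        percent = similarity_percent(submitted, body)
        if percent > threshold:
            result.append((row_id, percent))
    return result
```

For instance, similarity_percent("Hello", "Hello world") gives 62.5, so that row would pass a 40% threshold. Note that this brute-force search is slow on long texts, so for large article bodies it would probably need a cheaper pre-filter in SQL first.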
