Converting words with apostrophes to proper text?

Posted on 2024-10-11 20:54:01

Goal: I need to be able to expand words containing apostrophes into their properly formed equivalents, at least for the most common ones. Ideally I'd want a list of such words and their expanded counterparts (e.g. "don't" and "do not").

Issue: I'm creating a search algorithm based on natural language processing, but when users create content (or search) using an apostrophe, it causes problems for us. If we simply removed the apostrophe we would get (don't -> dont) (doesn't -> doesnt), which are not English words and can't be interpreted by the NLP system.

The ideal solution is simply a one-to-one mapping from each of these words to what it should be converted to, but I'm unaware of such a list.

Please let me know if you know of one, and where I might be able to find it.

thx

Comments (2)

别闹i 2024-10-18 20:54:01

This looks like a pretty good list:
http://www.textfixer.com/resources/english-contractions-list.php

Depends on how good you want to make your system. Is it going to understand that "gonna" is "going to" and "gotta" is ... well, that's a tough one. It could mean "got to" ("have to", "must"), or "got a" ("have a").

Oh, the things we learn when we try to teach our computers to communicate.
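
A minimal sketch of the one-to-one mapping approach, in Python. The dictionary below is only a small hand-picked subset for illustration (a real system would load a fuller list such as the one linked above), and the names CONTRACTIONS and expand_contractions are made up for this example. "gotta" is deliberately left out of the map because, as noted, it is ambiguous. Expansions come out lowercase, on the assumption that the search pipeline lowercases text anyway.

```python
import re

# Illustrative subset only; a fuller list would be loaded from a resource
# such as the contractions list linked above.
# "gotta" is intentionally absent because its expansion is ambiguous.
CONTRACTIONS = {
    "don't": "do not",
    "doesn't": "does not",
    "isn't": "is not",
    "can't": "cannot",
    "won't": "will not",
    "i'm": "i am",
    "gonna": "going to",
}

# A word, optionally followed by apostrophe-joined parts (don't, o'clock).
_WORD = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")


def expand_contractions(text):
    """Replace known contractions with their expanded form.

    Words not in the map (including possessives such as "John's")
    are returned unchanged.
    """
    # Normalise the typographic apostrophe so lookups match.
    text = text.replace("\u2019", "'")

    def _replace(match):
        word = match.group(0)
        return CONTRACTIONS.get(word.lower(), word)

    return _WORD.sub(_replace, text)


if __name__ == "__main__":
    print(expand_contractions("She doesn't know and I don't care"))
```

Doing the lookup before any apostrophe stripping means unknown words pass through untouched, so ambiguous or unlisted forms can be handled by a later step instead of being silently mangled.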

羞稚 2024-10-18 20:54:01

These words are called "contractions" and you can find a list on the web, e.g. http://en.wikipedia.org/wiki/Contraction_(grammar)
