Detecting whether a string contains a "real sentence"?

Posted on 2024-09-19 18:23:34

Is there a library that can figure out whether a given string of characters contains a "real sentence" in English, meaning that it contains English words? (The sentence need not make sense, but it should contain real English words.)

For example, the following is not a sentence (at least in English):

hsgdhjf asdf dsusdf udfhpiew


Comments (2)

南风起 2024-09-26 18:23:34

This is an unsolved problem, as computers have no idea what "makes sense". Even if a program tries to parse a sentence by detecting nouns, verbs, etc., phrases like "Colorless green ideas sleep furiously" or "Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo" would still get through. I doubt many people would say those are sentences.

There are also multiple ways of parsing sentences, for example "Time flies like an arrow; fruit flies like a banana" can be parsed as:

  • adjective noun verb article noun; noun verb preposition article noun
  • noun verb preposition article noun; adjective noun verb article noun

to take just two of the possible readings.
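The ambiguity in the "Time flies" example can be illustrated with a toy sketch. The `LEXICON` table and the `tag_sequences` helper below are made up for illustration and are not taken from any library; the code only enumerates every combination of part-of-speech tags the lexicon allows, without checking which combinations are grammatical, so it is not a parser.

```python
from itertools import product

# Toy lexicon (an assumption for illustration): each word maps to the
# parts of speech it could plausibly take.
LEXICON = {
    "time": ["noun", "adjective"],
    "fruit": ["noun", "adjective"],
    "flies": ["noun", "verb"],
    "like": ["verb", "preposition"],
    "an": ["article"],
    "a": ["article"],
    "arrow": ["noun"],
    "banana": ["noun"],
}


def tag_sequences(sentence):
    """Return every tag sequence the toy lexicon permits for the sentence.

    Raises KeyError if a word is missing from LEXICON.
    """
    words = sentence.lower().split()
    options = [LEXICON[word] for word in words]
    return [tuple(combo) for combo in product(*options)]


if __name__ == "__main__":
    for tags in tag_sequences("time flies like an arrow"):
        print(" ".join(tags))
```

Even with this tiny lexicon, five words yield several candidate taggings, including both readings listed above. A real parser has to choose among such candidates, which is where the difficulty lies.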

The bottom line: parsing natural language is hard, and making sense of it is even harder.

蓬勃野心 2024-09-26 18:23:34

You can check that every word is spelled correctly using a spelling checker (there are a number of libraries for this, none of which I have used), but that still won't tell you whether the sentence is grammatical. Furthermore, an English speaker would probably consider a sentence "real" even if it had some errors, and some real words aren't in any dictionary.
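A minimal sketch of the word-lookup idea, assuming a plain set of known words rather than any particular spell-checking library. `SAMPLE_WORDS`, `known_word_ratio`, `looks_like_english`, and the 0.8 threshold are all hypothetical choices for illustration; a real check would load a full word list. Using a ratio instead of requiring every word to match leaves room for the occasional typo or out-of-dictionary word.

```python
import re

# Hypothetical stand-in vocabulary; a real check would load a full
# English word list instead of this handful of words.
SAMPLE_WORDS = {
    "the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog",
    "colorless", "green", "ideas", "sleep", "furiously",
}


def known_word_ratio(text, vocabulary):
    """Return the fraction of alphabetic tokens in text found in vocabulary."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in vocabulary)
    return known / len(tokens)


def looks_like_english(text, vocabulary, threshold=0.8):
    """Heuristic: True if at least `threshold` of the tokens are known words."""
    return known_word_ratio(text, vocabulary) >= threshold


if __name__ == "__main__":
    print(looks_like_english("The quick brown fox jumps over the lazy dog", SAMPLE_WORDS))
    print(looks_like_english("hsgdhjf asdf dsusdf udfhpiew", SAMPLE_WORDS))
```

As the answer notes, this says nothing about grammar: "Colorless green ideas sleep furiously" passes a check like this just as easily as an ordinary sentence.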

The best way to do this remains to have your program show the alleged sentence to a human being who speaks English, and ask them if it is a "real sentence."
