Detecting whether a string contains a "real sentence"?
Is there a library that can figure out whether a given string of characters contains a "real sentence" in English, meaning that it is made up of English words? (The sentence need not make sense, but it should contain real English words.)
For example, the following is not a sentence (at least not in English):
hsgdhjf asdf dsusdf udfhpiew
Comments (2)
This is an unsolved problem, as computers have no idea of what "makes sense". Even if a program tries to parse a sentence by detecting nouns, verbs, etc., there are still phrases like "colorless green ideas sleep furiously" or "Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo" that would get through. I doubt many people would say those are sentences.
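There are also multiple ways of parsing a sentence. For example, in "Time flies like an arrow; fruit flies like a banana", "flies" can be read as a verb or as part of the noun phrase "fruit flies", and "like" as a preposition or as a verb, to take just two ways.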
The bottom line: parsing natural language is hard, and making sense of it is even harder.
You can ensure that every word is spelled correctly using a spelling checker (there are a number of libraries for this, none of which I have used), but that still won't tell you whether the sentence is grammatical. Furthermore, an English speaker would probably consider a sentence "real" even if it had some errors, and some words aren't in the dictionary.
The best way to do this remains to have your program show the alleged sentence to a human being who speaks English, and ask them if it is a "real sentence."
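For the narrower goal in the question (real English words, not necessarily a meaningful sentence), the dictionary-lookup idea can be sketched roughly as follows. This is only an illustration under assumptions: `SAMPLE_WORDS` is a tiny hypothetical stand-in for a real word list, and the function names and the 0.8 threshold are made up for this example, not taken from any library.

```python
import re

# Hypothetical stand-in vocabulary. A real check would load a full English
# word list (tens of thousands of entries) instead of this handful of words.
SAMPLE_WORDS = frozenset({
    "colorless", "green", "ideas", "sleep", "furiously",
    "time", "flies", "like", "an", "arrow", "fruit", "a", "banana",
})


def real_word_ratio(text, vocabulary=SAMPLE_WORDS):
    """Return the fraction of tokens in `text` that appear in `vocabulary`."""
    # Lowercase, then pull out alphabetic tokens (allowing one inner apostrophe).
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in vocabulary)
    return known / len(tokens)


def looks_like_english(text, vocabulary=SAMPLE_WORDS, threshold=0.8):
    """Guess whether `text` is mostly made of known words.

    The threshold is an arbitrary assumption: it tolerates a few typos or
    out-of-dictionary words, as the answer above suggests a human would.
    """
    return real_word_ratio(text, vocabulary) >= threshold


if __name__ == "__main__":
    print(looks_like_english("hsgdhjf asdf dsusdf udfhpiew"))
    print(looks_like_english("Colorless green ideas sleep furiously"))
```

Note that this says nothing about grammar or meaning: "colorless green ideas sleep furiously" passes simply because each word is in the list, which is exactly the limitation both answers point out.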