What are some strategies for adding spell checking to a Google App Engine application?
I'm working on a Google App Engine application that will need some basic spell-checking features. Normally iSpell or one of its cousins would be an option, but I'm not sure that will work on GAE. Are there other strategies or tools that would work in that environment?
Comments (2)
A very minimal, pure-Python spell checker can be found here: http://norvig.com/spell-correct.html
The big.txt file Norvig uses to train his spell checker is too large to upload to App Engine at 6.2 megabytes, but the NWORDS dict that results from training is only ~650K when pickled. So one solution might be to pre-train the spell checker, pickle the results, and include the pickled training data in your application.
This spell checker might not be good enough for your needs, and the way I've proposed you integrate it into your app might be an absolutely terrible idea. I'm really not sure. Might be interesting to try, though.
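
For anyone who wants to try the pre-train-and-pickle idea, here is a minimal sketch of what it could look like. The helper names save_model and load_model and the file name nwords.pickle are made up for illustration, and words/train are simplified stand-ins rather than Norvig's exact code. The counts are converted to a plain dict before pickling so the file does not depend on how they were built.

```python
import collections
import pickle
import re


def words(text):
    """Split text into lowercase alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def train(features):
    """Count how often each word occurs in the training text."""
    return collections.Counter(features)


def save_model(counts, path):
    """Pickle the word counts as a plain dict (hypothetical helper)."""
    with open(path, "wb") as f:
        # Protocol 2 is an older binary format that a wide range of
        # Python versions can read back.
        pickle.dump(dict(counts), f, protocol=2)


def load_model(path):
    """Load the pickled word counts back into memory (hypothetical helper)."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

The idea would be to run train(words(...)) over big.txt and call save_model(NWORDS, "nwords.pickle") once on your own machine, upload only the pickle with the app, and call load_model("nwords.pickle") at startup instead of retraining.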
I personally would try to go down the route of using Google's API for spellcheck. I'm trying to find it now, but I believe their exposed web service includes a spell checker.
It's always tough finding good Python libraries that are actually being maintained. On the other hand, I imagine Google's service should be around and dependable for a while.
Not sure in what format the results come back, but on your side, you could implement your own Levenshtein distance function to see how close the results are to your word in question.
Mark
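
As a rough illustration of the Levenshtein suggestion above, here is a small pure-Python sketch using the standard dynamic-programming approach. The function names levenshtein and rank_suggestions are made up for this example, and it assumes the suggestions are already available as a list of plain strings, whatever format the service actually returns.

```python
def levenshtein(a, b):
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    # Keep the shorter string in the inner loop to use less memory.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            insert_cost = current[j - 1] + 1
            delete_cost = previous[j] + 1
            substitute_cost = previous[j - 1] + (ca != cb)
            current.append(min(insert_cost, delete_cost, substitute_cost))
        previous = current
    return previous[-1]


def rank_suggestions(word, candidates):
    """Sort candidate corrections from closest to farthest (hypothetical helper)."""
    return sorted(candidates, key=lambda c: levenshtein(word, c))
```

For example, rank_suggestions("speling", ["spoiling", "spelling", "sapling"]) puts "spelling" first, since it is a single insertion away from the input.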