Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 10 years ago.
See what you think of the version in Apache Tika. This assumes that you want to find out what language text is in, as opposed to wanting to build a parser for a programming language.
Textcat http://textcat.sourceforge.net/ doesn't have Russian but it does handle the following:
There is a Language Detection API that accepts text via HTTP POST and returns JSON with the detected languages and their scores. It can be used from Java or any other programming language.
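For what it's worth, calling a service like that from Java could look roughly like the sketch below (Java 11+, standard library only). The endpoint URL and the `q` form field are placeholders I made up, not the real API's values, so check the service's documentation for the actual URL, parameter names, and authentication. The sketch only builds the request; the send call is left as a comment because the placeholder host does not resolve.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

class LanguageApiSketch {
    // Hypothetical endpoint: replace with the URL from the service's documentation.
    static final String ENDPOINT = "https://api.example.invalid/detect";

    // Builds a form-encoded POST carrying the text to classify.
    // The field name "q" is an assumption, not taken from any real API.
    static HttpRequest buildRequest(String text) {
        String body = "q=" + URLEncoder.encode(text, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest("Hello, world");
        System.out.println(request.method() + " " + request.uri());
        // With a real endpoint, the JSON response body could be fetched with:
        // java.net.http.HttpClient.newHttpClient()
        //     .send(request, java.net.http.HttpResponse.BodyHandlers.ofString())
        //     .body();
    }
}
```

The response would then need a JSON parser (Jackson, Gson, or similar) to pull out the languages and scores; the exact field names depend on the service.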
I think ANTLR is pretty much the standard.