Generating a grammatical analysis of hundreds of paragraphs

Posted on 2025-01-06 01:07:47

Say I have 1,000 (for example) logged entries of customer support notes.

Each of these notes (anything from 25 to 500 characters long) has been entered into a system by a user (a user will typically have created multiple notes). I'd like to be able to generate the equivalent of a 'grammatical KPI' by analysing the text.

I want to refrain from running a spell-check against them, and instead look at the consistency of basic grammar, such as capital letters and punctuation (correcting punctuation if possible). Factoring the verbosity of each note into the output of said 'KPI' would be an interesting twist too.
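As a rough illustration of what such a heuristic could look like, here is a minimal Python sketch. The specific rules (sentence-initial capitals, terminal punctuation) and the 50/50 weighting are arbitrary assumptions for demonstration, not a standard metric:

```python
import re

def grammar_kpi(note: str) -> float:
    """Score a note from 0.0 to 1.0 on two basic grammar heuristics.

    Illustrative sketch only, not a rigorous grammar checker: it checks
    sentence capitalisation and whether the note ends with punctuation.
    """
    # Split on sentence-ending punctuation; keep non-empty fragments.
    sentences = [s.strip() for s in re.split(r"[.!?]+", note) if s.strip()]
    if not sentences:
        return 0.0

    # Heuristic 1: each sentence should start with a capital letter.
    capitalised = sum(1 for s in sentences if s[0].isupper())

    # Heuristic 2: the note should end with terminal punctuation.
    ends_punctuated = 1.0 if note.rstrip().endswith((".", "!", "?")) else 0.0

    # Equal weighting of the two heuristics (an arbitrary choice).
    return 0.5 * (capitalised / len(sentences)) + 0.5 * ends_punctuated
```

A verbosity factor could be bolted on by, say, penalising notes far from the median length, but that is a design decision for you to tune.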

Without getting too deep into programming languages, what would be the most efficient way/method to create a representation that isn't 100% accurate, but is good enough to spot grammatical outliers in the notes submitted by these users?
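Once each note has been reduced to a numeric score (by whatever heuristic you settle on), spotting outlier users is straightforward statistics. A hedged sketch, assuming a dict of per-user score lists and using a z-score cut-off (the threshold of 2.0 is an assumed default, not a recommendation):

```python
from statistics import mean, stdev

def flag_outlier_users(scores_by_user, threshold=2.0):
    """Flag users whose average note score falls well below the group.

    scores_by_user: dict mapping user -> list of per-note scores.
    Returns the set of users whose average sits more than `threshold`
    standard deviations below the group mean. Illustrative sketch only.
    """
    averages = {u: mean(s) for u, s in scores_by_user.items() if s}
    if len(averages) < 2:
        return set()  # not enough users to compare against
    mu = mean(averages.values())
    sigma = stdev(averages.values())
    if sigma == 0:
        return set()  # everyone scores identically; no outliers
    return {u for u, a in averages.items() if (mu - a) / sigma > threshold}
```

For example, with nine users averaging 0.9 and one averaging 0.1, only the low scorer is flagged.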

I have no experience with anything like this.

Thanks

Comments (1)

小耗子 2025-01-13 01:07:47

This presentation by the Director of the Python Software Foundation is actually about extracting semantics out of formal documents (patent licenses):

http://vimeo.com/53058803

This paper describes techniques for extracting the sentiment out of written text:

http://goo.gl/wY9sW
