Changing words while keeping their meaning intact
We have a requirement to change the words or phrases in a sentence while keeping its meaning intact. The application will provide suggestions to users who are involved in copywriting.
I don't know where to start... We have not yet settled on a technology, but we would like to do it in Python or .NET.
Comments (4)
Just for laughs:
Running
yields
Use NLTK in Python. It gives you access to part-of-speech tagging and WordNet, both of which you will need to make reasonable substitutions.
http://www.nltk.org/
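As a rough illustration of how part-of-speech tagging and WordNet can work together, here is a minimal sketch. It assumes NLTK is installed and that its tokenizer, tagger and WordNet data have already been fetched with `nltk.download(...)` (the exact resource names differ between NLTK versions). The function names `penn_to_wordnet_pos` and `suggest_synonyms` are made up for this example. The tag is used to restrict the WordNet lookup to the right word class, so a noun is not replaced by synonyms of the verb with the same spelling.

```python
# Map Penn Treebank tag prefixes (as produced by nltk.pos_tag) to WordNet
# POS codes. The codes equal wordnet.NOUN, wordnet.VERB, wordnet.ADJ, wordnet.ADV.
_PENN_TO_WORDNET = {"NN": "n", "VB": "v", "JJ": "a", "RB": "r"}


def penn_to_wordnet_pos(tag):
    """Return the WordNet POS code for a Penn Treebank tag, or None."""
    return _PENN_TO_WORDNET.get(tag[:2])


def suggest_synonyms(sentence):
    """Return {word: [synonyms]} for nouns, verbs, adjectives and adverbs."""
    # Imported here so the helper above stays usable without NLTK installed.
    import nltk
    from nltk.corpus import wordnet

    suggestions = {}
    for word, tag in nltk.pos_tag(nltk.word_tokenize(sentence)):
        pos = penn_to_wordnet_pos(tag)
        if pos is None:
            continue  # skip determiners, prepositions, punctuation, etc.
        synonyms = []
        for synset in wordnet.synsets(word, pos=pos):
            for lemma in synset.lemmas():
                # WordNet joins multi-word lemmas with underscores.
                name = lemma.name().replace("_", " ")
                if name.lower() != word.lower() and name not in synonyms:
                    synonyms.append(name)
        if synonyms:
            suggestions[word] = synonyms
    return suggestions
```

Calling `suggest_synonyms("We sell cheap laptops.")` returns a dictionary of candidate replacements per word. The candidates cover every WordNet sense of the word, so a human (or a further ranking step) still has to pick the ones that fit the context.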
Some combination of synonyms and Markov chains might work, but you'll always get strange results. Don't expect a program to make better phrases than humans.
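One possible reading of "synonyms plus Markov chains" is to let a first-order (bigram) model rank synonym candidates by how well they fit next to the neighbouring words. The sketch below is only that interpretation, not necessarily what the answer had in mind. The corpus, the synonym table and the function names are all hypothetical, and a usable model would need far more text.

```python
from collections import Counter

# Hypothetical toy corpus and synonym table, for illustration only.
CORPUS = [
    "we offer a quick response to every request",
    "a quick response keeps customers happy",
    "the rapid growth of the market",
    "she is a fast runner",
]
SYNONYMS = {"fast": ["quick", "rapid", "speedy"]}


def bigram_counts(sentences):
    """Count adjacent word pairs; these are the Markov chain's transitions."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        counts.update(zip(tokens, tokens[1:]))
    return counts


def rank_substitutes(tokens, index, counts):
    """Rank synonyms of tokens[index] by co-occurrence with its neighbours."""
    prev_word = tokens[index - 1] if index > 0 else None
    next_word = tokens[index + 1] if index + 1 < len(tokens) else None
    scored = []
    for candidate in SYNONYMS.get(tokens[index], []):
        # A Counter returns 0 for pairs it has never seen.
        score = counts[(prev_word, candidate)] + counts[(candidate, next_word)]
        scored.append((score, candidate))
    scored.sort(key=lambda pair: -pair[0])  # stable: ties keep table order
    return [candidate for _, candidate in scored]


counts = bigram_counts(CORPUS)
print(rank_substitutes("a fast response".split(), 1, counts))
```

With this toy corpus "quick" ranks first because "a quick" and "quick response" both occur in it. Candidates the corpus has never seen next to the neighbours score zero, which is where the strange results mentioned above tend to come from.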
If you are looking for computer-aided writing, where the software suggests full or partial solutions, I think automated thesaurus lookup for the content words in each sentence would be a good start. Just use a stop-word list to filter out uninteresting words. Translation memory is a related concept, where NLP is used to aid translation; I'm sure you can get ideas for the user interface and so on from it. There are several open-source solutions available.
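A minimal sketch of that suggestion loop, using only the standard library. The stop-word set and thesaurus here are tiny hypothetical stand-ins; a real tool would load a full stop-word list and a real thesaurus such as WordNet.

```python
import re

# Hypothetical, deliberately tiny resources for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "in", "and", "our", "for"}
THESAURUS = {
    "fast": ["quick", "rapid", "speedy"],
    "cheap": ["affordable", "inexpensive"],
    "buy": ["purchase", "get"],
}


def suggest(sentence):
    """Return (word, synonyms) pairs for content words, in sentence order."""
    suggestions = []
    for word in re.findall(r"[A-Za-z']+", sentence):
        key = word.lower()
        if key in STOP_WORDS:
            continue  # function words are not worth suggesting for
        synonyms = THESAURUS.get(key)
        if synonyms:
            suggestions.append((word, synonyms))
    return suggestions


for word, synonyms in suggest("Buy the fast and cheap laptop"):
    print(word, "->", ", ".join(synonyms))
```

The function only proposes alternatives and leaves the choice to the writer, which matches the computer-aided workflow described above.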
If you want a totally unsupervised process, I think the theoretically cleanest approach is to parse the sentence into some semantic representation, change some content words (based on WordNet, for example), and then generate text from that representation. If grammatical restructuring alone is acceptable, skip the word changes. The quality, however, will most probably be low. If you only need this for a narrow domain, a lot of tailoring is possible, which can give quite good results.