Creating an intelligent text generator
I'm doing this for fun (or, as 4chan says, "for teh lolz"), and if I learn something along the way, all the better. I took an AI course almost two years ago and really enjoyed it, but I have managed to forget everything, so this is a way to refresh that.
Anyway, I want to be able to generate text given a set of inputs. Basically, this will read forum posts (or maybe tweets) and then generate a comment based on what it has learned.
Now, the simplest way would be to use a Markov chain text generator, but I want something a little more complex than that, as a Markov chain basically only learns word order (which word is more likely to appear after word x, given the input text). I'm trying to see if there's something I can do to make it a little smarter.
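For reference, the word-order baseline described above can be sketched in a few lines. This is only a minimal illustration, assuming whitespace tokenization and a first-order (one word of context) chain; the function names `train` and `generate` and the toy corpus are made up for the example.

```python
import random
from collections import defaultdict, Counter

def train(text):
    """Count, for every word, which words follow it in the text."""
    words = text.split()
    table = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        table[current][following] += 1
    return table

def generate(table, start, length=10, rng=None):
    """Walk the chain from `start`, sampling each next word by frequency."""
    rng = rng or random.Random()
    word, output = start, [start]
    for _ in range(length - 1):
        followers = table.get(word)
        if not followers:  # dead end: this word was never followed by anything
            break
        choices, counts = zip(*followers.items())
        word = rng.choices(choices, weights=counts, k=1)[0]
        output.append(word)
    return " ".join(output)

# Toy corpus (an assumption for the example, not real forum data).
corpus = "the cat sat on the mat and the cat ate the fish"
table = train(corpus)
print(generate(table, "the", length=8, rng=random.Random(0)))
```

The table only ever records "what came after word x", which is exactly the limitation I'd like to get past.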
For example, I want it to do something like this:
- Learn from a large selection of posts on a message board, but don't weight them too heavily
- For each post:
  - Learn from the other comments in that post and weight these inputs higher
  - Generate a comment and post it
  - See what other users' reactions to your post were. If good, weight it positively so you make more posts similar to that one, and vice versa if negative.
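One way the steps above could be sketched is a Markov chain whose transition counts are floats, so that each source adds a different amount and feedback scales the transitions a comment used. This is a rough sketch under stated assumptions, not a known-good algorithm: the class name `WeightedChain`, the source weights 0.2 and 1.0, the reward range of -1 to 1, and the multiplicative update are all invented for illustration.

```python
import random
from collections import defaultdict

class WeightedChain:
    """First-order Markov chain with per-source weights and feedback (hypothetical design)."""

    def __init__(self, seed=None):
        self.table = defaultdict(lambda: defaultdict(float))
        self.rng = random.Random(seed)

    def learn(self, text, weight=1.0):
        """Add `weight` (instead of 1) to every word transition seen in `text`."""
        words = text.split()
        for current, following in zip(words, words[1:]):
            self.table[current][following] += weight

    def generate(self, start, length=8):
        """Return a list of words sampled in proportion to the transition weights."""
        output = [start]
        while len(output) < length:
            followers = self.table.get(output[-1])
            if not followers:
                break
            choices = list(followers)
            weights = [followers[w] for w in choices]
            output.append(self.rng.choices(choices, weights=weights, k=1)[0])
        return output

    def feedback(self, words, reward, rate=0.5):
        """Scale every transition used in `words`; `reward` is assumed to lie in [-1, 1]."""
        factor = 1.0 + rate * reward  # stays positive while rate < 1
        for current, following in zip(words, words[1:]):
            if following in self.table.get(current, {}):
                self.table[current][following] *= factor

chain = WeightedChain(seed=1)
chain.learn("the board likes long posts about cats", weight=0.2)    # whole board, low weight
chain.learn("the thread likes short posts about dogs", weight=1.0)  # current thread, higher weight
comment = chain.generate("the")
chain.feedback(comment, reward=+1)  # e.g. the comment got a positive reaction
print(" ".join(comment))
```

Whether a multiplicative update like this actually converges on "good" comments is an open question; it only shows where the weighting and the feedback would plug in.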
It's the weighting and learning-from-mistakes part that I'm not sure how to implement. I thought about artificial neural networks (mainly because I remember enjoying that chapter), but as far as I can tell they are mainly used to classify things (i.e. given a finite set of choices [x1...xn], which x is this given input?), not really to generate anything.
I'm not even sure if this is possible or, if it is, what I should go about learning or figuring out. What algorithm is best suited for this?
To those worried that I will use this as a bot to spam or provide bad answers on SO: I promise that I will not use this to provide (bad) advice or to spam for profit. I definitely will not post its nonsensical thoughts on SO. I plan to use it for my own amusement.
Thanks!
Comments (1)
I was thinking about something like this, too. I think using a grammatical analyzer together with a Markov chain generator could be a significant improvement. The Markov chain could then be trained on phrases (for example, the verb "drive" often appears together with the object "car") and produce grammatically correct sentences.
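A toy sketch of that idea, with heavy assumptions: the hand-written `LEXICON` below stands in for a real grammatical analyzer, the tag names are made up, and "object" is approximated as the first noun after a verb. The chain runs over part-of-speech tags rather than words, and nouns are picked from the objects seen with the current verb.

```python
import random
from collections import defaultdict, Counter

# Hypothetical hand-written lexicon standing in for a real part-of-speech tagger.
LEXICON = {
    "i": "PRON", "you": "PRON", "we": "PRON",
    "drive": "VERB", "read": "VERB", "eat": "VERB",
    "a": "DET", "the": "DET",
    "car": "NOUN", "truck": "NOUN", "book": "NOUN", "pizza": "NOUN",
}

def tag(sentence):
    """Return (word, tag) pairs, silently skipping words missing from the lexicon."""
    return [(w, LEXICON[w]) for w in sentence.lower().split() if w in LEXICON]

def train(sentences):
    """Learn tag-to-tag transitions, words per tag, and which objects each verb takes."""
    tag_chain = defaultdict(Counter)
    words_by_tag = defaultdict(Counter)
    verb_objects = defaultdict(Counter)
    for sentence in sentences:
        tagged = tag(sentence)
        tags = ["START"] + [t for _, t in tagged] + ["END"]
        for a, b in zip(tags, tags[1:]):
            tag_chain[a][b] += 1
        last_verb = None
        for word, t in tagged:
            words_by_tag[t][word] += 1
            if t == "VERB":
                last_verb = word
            elif t == "NOUN" and last_verb:
                verb_objects[last_verb][word] += 1
                last_verb = None
    return tag_chain, words_by_tag, verb_objects

def pick(counter, rng):
    """Sample one key from a Counter in proportion to its count."""
    items = list(counter)
    return rng.choices(items, weights=[counter[i] for i in items], k=1)[0]

def generate(model, rng, max_len=8):
    """Walk the tag chain; choose nouns from the objects seen with the current verb."""
    tag_chain, words_by_tag, verb_objects = model
    current, verb, out = "START", None, []
    while len(out) < max_len:
        current = pick(tag_chain[current], rng)
        if current == "END":
            break
        if current == "NOUN" and verb in verb_objects:
            word = pick(verb_objects[verb], rng)
        else:
            word = pick(words_by_tag[current], rng)
        if current == "VERB":
            verb = word
        out.append(word)
    return " ".join(out)

SENTENCES = ["I drive a car", "you drive the truck", "we read a book", "I eat the pizza"]
model = train(SENTENCES)
print(generate(model, random.Random(0)))
```

On this toy data a plain word-level chain could produce "drive the pizza", since "pizza" follows "the" in the input; here "drive" is only ever paired with "car" or "truck". The output is only as grammatical as the tag sequences in the training sentences.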