The first step in taking this game to the next level is...
...to have a very clear view of prior art!
(And pardon me for saying so, but the question doesn't suggest that you have such extensive insight into the matter [and you're not alone, count me in ;-)])
Even, and maybe in particular, if your intention is to apply completely novel techniques and models, it seems important to review the literature on current and past practices. Aside from possibly identifying elements that may be adapted or reused in a new implementation, a survey of the domain will provide a keen understanding of the nature of the problem[s].
I've personally tried -on various and multiple occasions!- both the naive approach and the sophomoric approach to tackling broadly defined problems. With the naive approach, one has but a very slight idea of the true nature and scope of the problem. The sophomoric approach sees us better equipped with domain knowledge and with related tools, but this can also be misleading: without a deeper understanding, we tend to misread or misunderstand new material offered to us, and to misuse some of the tools (a bit like the fellow who's "good with a hammer", for whom many things look like a nail...)
It is particularly easy to make these mistakes in the field of NLP. That's because:
Common sense seems to be all that is required: after all, a child whose native tongue is English understands subtleties like
"He's not really an expert"
"He's really not an expert" (small wink at the OP's reference to the ordering of words in the English language)
We live in such exciting times, technology- and knowledge-wise: processing power, programming languages and tools, mathematical techniques, availability of affordable corpora... to name a few of the things that make this moment in time so special.
Far be it from me to discourage you in your chat-bot endeavor; I just hope that this long and generic exposé will encourage you to look before you leap, as this will truly save you time in the long run, I think in two ways:
it will provide you with some frames of reference (again, even if your intention is to "think outside these boxes")
it may entice you to redefine the problem, for example by limiting it to particular domains of conversation (sports, or health, or life at a particular university campus...) or by focusing on a particular aspect of the problem (semantic awareness, smooth, natural-sounding grammar, use of colloquial forms...)
Check out MegaHAL's implementation for some ideas. We've used a variant of this bot for ages in an IRC channel of ours, and he does on occasion appear to be the intelligent mixture of many of our dominant personalities.
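For a rough feel of why such a bot ends up sounding like a mixture of the people it listens to: MegaHAL is built around Markov models of word sequences. The sketch below is not MegaHAL's code, just a much simplified word-level Markov chain in Python; the function names, the `<s>`/`</s>` markers and the sample lines are all made up for illustration.

```python
import random
from collections import defaultdict

def train(model, sentence, order=2):
    """Record which word follows each run of `order` words."""
    words = ["<s>"] * order + sentence.split() + ["</s>"]
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        model[state].append(words[i + order])

def generate(model, order=2, max_words=30, rng=random):
    """Walk the chain from the start state until the end marker."""
    state = ("<s>",) * order
    out = []
    while len(out) < max_words:
        nxt = rng.choice(model[state])   # pick one of the observed successors
        if nxt == "</s>":
            break
        out.append(nxt)
        state = state[1:] + (nxt,)
    return " ".join(out)

# Hypothetical "channel log": each line is something somebody said.
model = defaultdict(list)
for line in ["the bot talks like us",
             "the bot learns from the channel",
             "we talk in the channel"]:
    train(model, line)

print(generate(model))
```

Every reply is stitched together from fragments the bot has actually seen, which is probably why it can appear to echo the channel's regulars.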
You "train" the bot:
each time the bot answers, you (or the tester) rank the answer: if the answer is good/logical, give it a high rank; if the answer is bad, give it a low/negative rank.
The bot then uses the accumulated rankings to choose its future answers, and this is how it learns...
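The ranking idea above could be sketched roughly like this in Python. It is only one possible reading of the scheme: the `RankedBot` class, the fixed table of candidate answers and the small `explore` probability (so that unranked answers get tried now and then) are my own assumptions, not something the answer spells out.

```python
import random
from collections import defaultdict

class RankedBot:
    def __init__(self, candidates, explore=0.1, rng=None):
        self.candidates = candidates        # prompt -> list of possible answers
        self.scores = defaultdict(float)    # (prompt, answer) -> accumulated rank
        self.explore = explore              # chance of trying a random answer
        self.rng = rng or random.Random()

    def answer(self, prompt):
        options = self.candidates[prompt]
        if self.rng.random() < self.explore:
            return self.rng.choice(options)
        # Otherwise pick the answer with the best accumulated rank so far.
        return max(options, key=lambda a: self.scores[(prompt, a)])

    def rank(self, prompt, answer, rank):
        """Feedback from the user or tester: positive is good, negative is bad."""
        self.scores[(prompt, answer)] += rank

bot = RankedBot({"hello": ["hi there", "banana", "go away"]}, explore=0.0)
bot.rank("hello", "banana", -1)
bot.rank("hello", "hi there", +2)
print(bot.answer("hello"))   # hi there
```

With `explore=0.0` the bot always picks the best-ranked answer; a small positive value lets it occasionally try answers nobody has ranked yet.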
Good luck ;-)
There's a great description of Eliza in Paradigms of AI Programming. You should be able to implement a simple Eliza bot in a few days of work.
This isn't a learning algorithm, but it's surprising how realistic the answers from something so simple can be.
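To give a flavour of how little is needed, here is a tiny Eliza-style responder in Python. It is not the program from Paradigms of AI Programming (that one is written in Common Lisp); the rules, the pronoun table and the function names are invented for this sketch.

```python
import re

# Swap first and second person so the echoed fragment reads naturally.
REFLECTIONS = {"i": "you", "my": "your", "am": "are", "me": "you",
               "you": "me", "your": "my"}

# (pattern, response template) pairs, tried in order.
RULES = [
    (re.compile(r"i need (.*)", re.I), "Why do you need {0}?"),
    (re.compile(r"i am (.*)", re.I), "How long have you been {0}?"),
    (re.compile(r".*\bmother\b.*", re.I), "Tell me more about your family."),
]
DEFAULT = "Please go on."

def reflect(fragment):
    return " ".join(REFLECTIONS.get(w.lower(), w) for w in fragment.split())

def respond(text):
    text = text.strip().rstrip(".!?")
    for pattern, template in RULES:
        m = pattern.match(text)
        if m:
            return template.format(*(reflect(g) for g in m.groups()))
    return DEFAULT

print(respond("I need my coffee."))    # Why do you need your coffee?
print(respond("My mother hates me"))   # Tell me more about your family.
```

There is no state and no learning here, only pattern matching and pronoun swapping, yet the replies can feel oddly attentive.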
You can create your own chat bot on BOT libre, http://www.botlibre.com.
The bots learn, can be trained, can be scripted, and you can program them, or let them program themselves.
The site supports embedding your bot on your own site, and offers REST API access as well as Android, IRC and Twitter support. Hosting is free, even for commercial bots.
AIML from the AliceBot project may help you out. It's a whole XML schema (if that doesn't put you off) for the branch of AI it's concerned with.
Wikipedia's AIML article includes an example.
RebeccaAIML is one quite well documented implementation.
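As a rough idea of what AIML looks like in use: its basic unit is a category, pairing a pattern with a template. The Python sketch below loads a small hand-written AIML fragment (my own illustration, not the Wikipedia example) and answers from it. It only understands the `*` wildcard and the `<star/>` tag, so treat it as a toy and not as a conforming AIML interpreter; the function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hand-written illustrative fragment; real AIML sets are far larger.
AIML = """
<aiml version="1.0.1">
  <category>
    <pattern>HELLO</pattern>
    <template>Hi there!</template>
  </category>
  <category>
    <pattern>MY NAME IS *</pattern>
    <template>Nice to meet you, <star/>.</template>
  </category>
</aiml>
"""

def load_categories(source):
    categories = []
    for cat in ET.fromstring(source.strip()).iter("category"):
        pattern = cat.findtext("pattern").strip()
        # Turn the AIML wildcard into a regex group; everything else is literal.
        regex = "^" + "(.+)".join(re.escape(p) for p in pattern.split("*")) + "$"
        categories.append((re.compile(regex, re.I), cat.find("template")))
    return categories

def render(template, star):
    """Flatten the template, substituting the wildcard match for <star/>."""
    parts = [template.text or ""]
    for child in template:
        if child.tag == "star":
            parts.append(star)
        parts.append(child.tail or "")
    return "".join(parts).strip()

def respond(categories, text):
    text = text.strip().rstrip(".!?")
    for regex, template in categories:
        m = regex.match(text)
        if m:
            return render(template, m.group(1) if m.groups() else "")
    return None   # no category matched

cats = load_categories(AIML)
print(respond(cats, "My name is Alice."))   # Nice to meet you, Alice.
```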