Cleaning up this PHP Markov chain output?
This is my first time working with Markov chains.
I want to combine two text sources and generate readable Markov chain output. The implementation I'm using is here; the source texts have already been stripped of markup, etc.
I was first exposed to Markov chains through the Ruby Rbot IRC bot. Its Markov plugin source is here.
I'm finding that the output I get from the PHP Markov algorithm is messy. One thing I can see is that the rbot implementation chains two words together to start. Is there a clear way to make this happen with the PHP implementation I've linked? If not, is there a PHP implementation that can do this?
Comments (1)
Do you want to do word chaining or letter chaining? The PHP implementation you have above does letter chaining, which at low order values tends toward gibberish, not just words that seem out of place. It looks like rbot does word chaining, which implicitly generates more 'readable' text.
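To make the difference concrete, here is a small sketch of how the same text breaks into states under each approach. The sample sentence and variable names are made up for illustration and are not taken from either implementation mentioned above.

```php
<?php
// Hypothetical sample text, not from the original post.
$text = "the cat sat on the mat";

// Letter-level tokens: every character (spaces included) is a token.
// str_split works on bytes; multibyte text would need mb_str_split (PHP 7.4+).
$letters = str_split($text);

// Word-level tokens: split on runs of whitespace, dropping empty pieces.
$words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

// An order-2 state is two consecutive tokens.
$letterState = $letters[0] . $letters[1];    // "th"
$wordState   = $words[0] . ' ' . $words[1];  // "the cat"

echo $letterState, PHP_EOL;
echo $wordState, PHP_EOL;
```

With letters, an order-2 state only knows about two characters, so the chain can wander into non-words very quickly. With words, every emitted token is at least a real word from the source.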
Markov chaining is pretty simple to implement. I don't think it would be too hard to adapt the PHP source to split and chain by word instead of by letter. I've been thinking of making a pure SQL stored procedure which can take a table and generate a string.
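As a rough idea of what a word-based version might look like, here is a minimal sketch that keys the table on pairs of words (order 2), similar in spirit to what rbot appears to do. It is not an adaptation of the linked PHP source; the function names `buildWordChain` and `generateText` and the sample sentences are hypothetical.

```php
<?php
/**
 * Build a word-level Markov table: each key is $order consecutive words,
 * each value is the list of words seen immediately after that key.
 */
function buildWordChain(string $text, int $order = 2): array
{
    $words = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    $table = [];
    for ($i = 0; $i + $order < count($words); $i++) {
        $key = implode(' ', array_slice($words, $i, $order));
        $table[$key][] = $words[$i + $order];
    }
    return $table;
}

/**
 * Walk the table from a random starting key until $maxWords words have
 * been produced or the current key has no recorded follower.
 */
function generateText(array $table, int $maxWords = 30): string
{
    if ($table === []) {
        return '';
    }
    $key = array_rand($table);            // random starting group of words
    $output = explode(' ', $key);
    $order = count($output);
    while (count($output) < $maxWords) {
        if (!isset($table[$key])) {
            break;                         // dead end: nothing ever followed this key
        }
        $followers = $table[$key];
        $output[] = $followers[array_rand($followers)];
        // Slide the window: the next key is the last $order words emitted.
        $key = implode(' ', array_slice($output, -$order));
    }
    return implode(' ', $output);
}

// Hypothetical sources. Concatenating them is the simplest way to combine
// two texts, at the cost of one artificial transition at the seam.
$sourceA = "the quick brown fox jumps over the lazy dog";
$sourceB = "the quick red fox sleeps under the lazy cat";
$table = buildWordChain($sourceA . ' ' . $sourceB, 2);
echo generateText($table, 12), PHP_EOL;
```

Repeated followers are stored as duplicates in the list, so picking a random element already weights the choice by frequency. Raising `$order` makes the output more coherent but closer to the source text; very small sources will mostly reproduce themselves.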