Explain the Markov chain algorithm in layman's terms
I don't quite understand this Markov chain program. Does it take two words as a prefix, save up a list of the suffixes that follow them, and then generate random words?
/* Copyright (C) 1999 Lucent Technologies */
/* Excerpted from 'The Practice of Programming' */
/* by Brian W. Kernighan and Rob Pike */
#include <stdlib.h> // srand, rand
#include <time.h>
#include <iostream>
#include <string>
#include <deque>
#include <map>
#include <vector>
using namespace std;
const int NPREF = 2;
const char NONWORD[] = "\n"; // cannot appear as real line: we remove newlines
const int MAXGEN = 10000; // maximum words generated
typedef deque<string> Prefix;
map<Prefix, vector<string> > statetab; // prefix -> suffixes
void build(Prefix&, istream&);
void generate(int nwords);
void add(Prefix&, const string&);
// markov main: markov-chain random text generation
int main(void)
{
    int nwords = MAXGEN;
    Prefix prefix; // current input prefix
    srand(time(NULL));
    for (int i = 0; i < NPREF; i++)
        add(prefix, NONWORD);
    build(prefix, cin);
    add(prefix, NONWORD);
    generate(nwords);
    return 0;
}
// build: read input words, build state table
void build(Prefix& prefix, istream& in)
{
    string buf;
    while (in >> buf)
        add(prefix, buf);
}
// add: add word to suffix list, update prefix
void add(Prefix& prefix, const string& s)
{
    if (prefix.size() == NPREF) {
        statetab[prefix].push_back(s);
        prefix.pop_front();
    }
    prefix.push_back(s);
}
// generate: produce output, one word per line
void generate(int nwords)
{
    Prefix prefix;
    int i;
    for (i = 0; i < NPREF; i++)
        add(prefix, NONWORD);
    for (i = 0; i < nwords; i++) {
        vector<string>& suf = statetab[prefix];
        const string& w = suf[rand() % suf.size()];
        if (w == NONWORD)
            break;
        cout << w << "\n";
        prefix.pop_front(); // advance
        prefix.push_back(w);
    }
}
Comments (2)
According to Wikipedia, a Markov chain is a random process in which the next state depends on the previous state. That is a little difficult to understand, so I'll try to explain it better:
What you're looking at seems to be a program that generates a text-based Markov chain. Essentially, the algorithm for that is as follows:
For example, if you look at the very first sentence of this solution, you can come up with the following frequency table:
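A frequency table of this kind could be built along the following lines. This is only a minimal sketch, assuming words are separated by whitespace and punctuation stays attached to its word; the names `FrequencyTable` and `buildFrequencyTable` are made up for illustration and do not come from the answer or from the program in the question.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical table type: word -> (following word -> number of times seen).
using FrequencyTable = std::map<std::string, std::map<std::string, int>>;

// Count, for every word in `text`, how often each word immediately follows it.
// Assumption: tokens are whitespace-separated, punctuation is not stripped.
FrequencyTable buildFrequencyTable(const std::string& text)
{
    FrequencyTable table;
    std::istringstream in(text);
    std::string prev, word;
    if (!(in >> prev))
        return table;          // empty input: nothing to count
    while (in >> word) {
        ++table[prev][word];   // one more sighting of the pair "prev word"
        prev = word;
    }
    return table;
}
```

For a made-up input such as "the cat saw the dog and the cat", the entry for "the" would hold "cat" with a count of 2 and "dog" with a count of 1.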
Essentially, the transition from one state to the next is probability-based. In a text-based Markov chain, the transition probability is based on how often each word follows the selected word. So the selected word represents the previous state, and its frequency table of following words represents the possible successive states. You can only find the successive state if you know the previous state (that's the only way to get the right frequency table), so this fits the definition in which the successive state depends on the previous state.
Shameless plug: I wrote a program to do just this in Perl some time ago. You can read about it here.
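To make the idea of walking from state to state concrete, here is a rough sketch of one way to do it. It assumes each word maps to a plain list of the words seen after it, with repeats kept, so that a uniform pick from the list is already weighted by frequency (the program in the question stores its suffixes the same way). `Followers` and `walkChain` are hypothetical names chosen for this example.

```cpp
#include <cstddef>
#include <map>
#include <random>
#include <string>
#include <vector>

// Hypothetical table type: each word maps to every word seen right after it.
// A follower that occurred three times is stored three times.
using Followers = std::map<std::string, std::vector<std::string>>;

// Walk the chain from `start`, returning at most `maxWords` words.
// Stops early when the current word has no recorded followers.
std::vector<std::string> walkChain(const Followers& table,
                                   const std::string& start,
                                   std::size_t maxWords,
                                   std::mt19937& rng)
{
    std::vector<std::string> out;
    std::string current = start;
    while (out.size() < maxWords) {
        out.push_back(current);
        auto it = table.find(current);
        if (it == table.end() || it->second.empty())
            break;  // dead end: nothing ever followed this word
        std::uniform_int_distribution<std::size_t> pick(0, it->second.size() - 1);
        current = it->second[pick(rng)];  // next state depends only on current
    }
    return out;
}
```

Note that the loop never looks further back than `current`; that is the property that makes it a Markov chain.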
Markov chains are state machines whose state transitions are probabilities.
Word: Chicken;
Possible next words: 10% - is; 30% - was; 50% - legs; 10% - runs;
Then you simply choose the next word at random according to those probabilities, for example by roulette wheel selection. You get these probabilities from some input text.
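Roulette wheel selection can be sketched as below, using the "Chicken" percentages as weights. This is an illustrative sketch rather than code from the answer: the function name `rouletteSelect` and its signature are assumptions, and the random number (`spin`) is passed in by the caller so the selection logic stays easy to follow.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Roulette wheel selection: each candidate owns a slice of the wheel whose
// width is its weight. `spin` is where the ball lands and is expected to lie
// in [0, sum of all weights).
const std::string& rouletteSelect(
    const std::vector<std::pair<std::string, double>>& candidates,
    double spin)
{
    assert(!candidates.empty());
    double cumulative = 0.0;
    for (const auto& c : candidates) {
        cumulative += c.second;      // right edge of this candidate's slice
        if (spin < cumulative)
            return c.first;
    }
    return candidates.back().first;  // guards against rounding at the top edge
}
```

With the weights is = 10, was = 30, legs = 50 and runs = 10, a spin of 5 lands on "is", 25 on "was", 60 on "legs" and 95 on "runs". In a real program the spin would typically come from `std::uniform_real_distribution<double>(0.0, 100.0)`; the standard library's `std::discrete_distribution` packages the same idea.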