带有 Redis 的马尔可夫链

发布于 2024-12-12 08:07:36 字数 689 浏览 0 评论 0原文

出于自学目的,我想实现一个马尔可夫链生成器,使用尽可能多的 Redis,并尽可能少地使用应用程序级逻辑。

假设我想基于历史深度为 N(比如 2)的频率表构建一个单词生成器。

举一个不太有趣的例子,对于两个单词 barbaz 的字典,频率表如下(“.”是终止符,数字是权重)

. . -> b x2
. b -> a x2
b a -> r x1
b a -> z x1
a r -> . x1
a z -> . x1

:生成这个词,我从两个终止符 的历史开始。 .

前两个字母只有一种可能的结果,b a

第三个字母可能是 rz,概率相等,因为它们的权重相等。

第四个字母始终是终止符。

(字典中单词越长,事情会更有趣。)

无论如何,如何使用 Redis 优雅地做到这一点?

Redis 集具有 SRANDMEMBER,但没有权重。

Redis 排序集有权重,但没有随机成员检索。

Redis 列表允许将权重表示为条目副本,但如何与它们进行集合交集?

看起来应用程序代码注定要进行一些数据处理......

For self-education purposes, I want to implement a Markov chain generator, using as much Redis, and as little application-level logic as possible.

Let's say I want to build a word generator, based on frequency table with history depth N (say, 2).

As a not very interesting example, for dictionary of two words bar and baz, the frequency table is as follows ("." is terminator, numbers are weights):

. . -> b x2
. b -> a x2
b a -> r x1
b a -> z x1
a r -> . x1
a z -> . x1

When I generate the word, I start with history of two terminators . .

There is only one possible outcome for the first two letters, b a.

Third letter may be either r or z, with equal probabilities, since their weights are equal.

Fourth letter is always a terminator.

(Things would be more interesting with longer words in dictionary.)

Anyway, how to do this with Redis elegantly?

Redis sets have SRANDMEMBER, but do not have weights.

Redis sorted sets have weights, but do not have random member retrieval.

Redis lists allow to represent weights as entry copies, but how to make set intersections with them?

Looks like application code is doomed to do some data processing...

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

野侃 2024-12-19 08:07:36

您可以根据迄今为止所考虑的集合成员的累积概率,通过为每个成员分配 0 到 1 之间的分数来完成对 Redis 排序集的加权随机选择,包括 当前成员。
您使用的顺序无关紧要;您可以选择任何您方便的订单。然后通过生成一个均匀分布在 0 和 1 之间的随机浮点数 r 并调用

ZRANGEBYSCORE zset r 1 LIMIT 0 1,

这将返回分数大于或等于r的第一个元素。
稍加推理就可以让您相信选择成员的概率是正确加权的。

不幸的是,分配给元素的分数需要与累积概率成正比,这一事实似乎使得很难以保留以下重要性的方式使用排序集并集或交集运算:随机选择元素的分数。这部分似乎需要一些重要的应用程序逻辑。

You can accomplish a weighted random selection with a redis sorted set, by assigning each member a score between zero and one, according to the cumulative probability of the members of the set considered thus far, including the current member.
The ordering you use is irrelevant; you may choose any order which is convenient for you. The random selection is then accomplished by generating a random floating point number r uniformly distributed between zero and one, and calling

ZRANGEBYSCORE zset r 1 LIMIT 0 1,

which will return the first element with a score greater than or equal to r.
A little bit of reasoning should convince you that the probability of choosing a member is thus weighted correctly.

Unfortunately, the fact that the scores assigned to the elements needs to be proportional to the cumulative probability would seem to make it difficult to use the sorted set union or intersection operations in a way which would preserve the significance of the scores for random selection of elements. That part would seem to require some significant application logic.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文