生成随机确定性有限自动机的算法是什么?
DFA 必须具有以下四个属性:
DFA 有 N 个节点
每个节点有 2 个传出转换。
每个节点都可以从其他每个节点访问。
从所有可能性中以完全均匀的随机性选择 DFA
这就是我到目前为止所做的:
- 从 N 个节点的集合开始。
- 选择一个尚未选择的节点。
- 将其输出连接到另外 2 个随机选择的节点,
- 将一个转换标记为 1,将另一个转换标记为 0。
- 除非已选择所有节点,否则转到 2。
- 确定是否存在没有传入连接的节点。
- 如果是这样,则从具有超过 1 个传入连接的节点窃取传入连接。
- 转到6,除非没有没有传入连接的节点。
但是,这是算法不正确的。考虑该图,其中节点 1 有两个连接到节点 2(反之亦然),而节点 3 有两个连接到节点 4(反之亦然)。类似于:
1 <==> 2
3 <==> 4
其中,通过 <==>我的意思是双向两个传出连接(因此总共 4 个连接)。这似乎形成了 2 个派系,这意味着并非每个状态都可以从其他状态到达。
有谁知道如何完成算法吗?或者,有人知道另一种算法吗?我似乎隐约记得二叉树可以用来构造这个,但我不确定。
The DFA must have the following four properties:
The DFA has N nodes
Each node has 2 outgoing transitions.
Each node is reachable from every other node.
The DFA is chosen with perfectly uniform randomness from all possibilities
This is what I have so far:
- Start with a collection of N nodes.
- Choose a node that has not already been chosen.
- Connect its output to 2 other randomly selected nodes
- Label one transition 1 and the other transition 0.
- Go to 2, unless all nodes have been chosen.
- Determine if there is a node with no incoming connections.
- If so, steal an incoming connection from a node with more than 1 incoming connection.
- Go to 6, unless there are no nodes with no incoming connections
However, this is algorithm is not correct. Consider the graph where node 1 has its two connections going to node 2 (and vice versa), while node 3 has its two connection going to node 4 (and vice versa). That is something like:
1 <==> 2
3 <==> 4
Where, by <==> I mean two outgoing connections both ways (so a total of 4 connections). This seems to form 2 cliques, which means that not every state is reachable from every other state.
Does anyone know how to complete the algorithm? Or, does anyone know another algorithm? I seem to vaguely recall that a binary tree can be used to construct this, but I am not sure about that.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(7)
强连通性是一个困难的约束。让我们生成均匀的随机满射转换函数,然后使用 Tarjan 的线性时间 SCC 算法对其进行测试,直到得到强连接的函数。这个过程有正确的分布,但不清楚它是否有效;我的研究人员的直觉是,强连通性的极限概率小于 1 但大于 0,这意味着预期只需 O(1) 次迭代。
生成满射过渡函数本身就很重要。不幸的是,如果没有这个约束,每个状态都有传入转换的可能性呈指数级增长。使用此问题的答案中描述的算法对包含 N 个部分的 {(1, a), (1, b), (2, a), (2, b), …, (N, a), (N, b)} 的均匀随机分区进行采样。随机排列节点并将它们分配给部分。
例如,令 N = 3 并假设随机划分为
我们选择随机排列 2, 3, 1 并导出转换函数
Strong connectivity is a difficult constraint. Let's generate uniform random surjective transition functions and then test them with e.g. Tarjan's linear-time SCC algorithm until we get one that's strongly connected. This process has the right distribution, but it's not clear that it's efficient; my researcher's intuition is that the limiting probability of strong connectivity is less than 1 but greater than 0, which would imply only O(1) iterations are necessary in expectation.
Generating surjective transition functions is itself nontrivial. Unfortunately, without that constraint it is exponentially unlikely that every state has an incoming transition. Use the algorithm described in the answers to this question to sample a uniform random partition of {(1, a), (1, b), (2, a), (2, b), …, (N, a), (N, b)} with N parts. Permute the nodes randomly and assign them to parts.
For example, let N = 3 and suppose that the random partition is
We choose a random permutation 2, 3, 1 and derive a transition function
接下来,我将使用图论的基本术语。
您可以:
结果将满足所有三个要求。
In what follows I'll use the basic terminology of graph theory.
You could:
The result will satisfy all three requirements.
有一个预期运行时间为 O(n^{3/2}) 的算法。
如果生成具有 m 个顶点的均匀随机有向图,使得每个顶点都有 k 个标记的外弧(k 出有向图),则该有向图中最大的 SCC(强连通分量)的大小很有可能约为 c_k m,其中 c_k 是取决于 k 的常数。实际上,该 SCC 的大小恰好为 c_k m(四舍五入为整数)的概率约为 1/\sqrt{m}。
因此,您可以生成大小为 n/c_k 的均匀随机 2-out 有向图,并检查最大 SCC 的大小。如果它的大小不正好是n,就再试一次,直到成功。所需的预期试验次数为 \sqrt{n}。生成每个有向图应该在 O(n) 时间内完成。因此,算法总共预计运行时间为 O(n^{3/2})。有关更多详细信息,请参阅本文。
There is a expected running time O(n^{3/2}) algorithm.
If you generate a uniform random digraph with m vertices such that each vertex has k labelled out-arcs (a k-out digraph), then with high probability the largest SCC (strongly connected component) in this digraph is of size around c_k m, where c_k is a constant depending on k. Actually, there is about 1/\sqrt{m} probability that the size of this SCC is exactly c_k m (rounded to an integer).
So you can generate a uniform random 2-out digraph of size n/c_k, and check the size of the largest SCC. If its size is not exactly n, just try again until success. The expected number of trials needed is \sqrt{n}. And generating each digraph should be done in O(n) time. So in total the algorithm has expected running time O(n^{3/2}). See this paper for more details.
只需不断增长一组全部可达的节点即可。一旦都可以到达,请填写空白。
集合B中的每个节点都可以到达集合B中的每个节点。只要从集合B中的节点可以到达某个节点并且该节点可以到达集合B中的节点,则可以将其添加到集合中。
Just keep growing a set of nodes which are all reachable. Once they're all reachable, fill in the blanks.
Every node in set B can reach every node in set B. As long as a node can be reached from a node in set B and that node can reach a node in set B, it can be added to the set.
我能想到的最简单的方法是(统一)使用
n
节点和每个节点两个传出的边缘生成一个随机的DFA,忽略其他约束,然后丢弃任何没有强烈连接的(使用牢固连接的组件算法易于测试)。生成均匀的DFA应该是直接的,而无需限制。在性能方面,可能是有问题的一件事是,在找到具有可及性属性的DFA之前,您需要跳过多少个DFA。但是,您应该首先尝试此算法,并查看生成可接受的DFA的最终需要多长时间。The simplest way that I can think of is to (uniformly) generate a random DFA with
N
nodes and two outgoing edges per node, ignoring the other constraints, and then throw away any that are not strongly connected (which is easy to test using a strongly connected components algorithm). Generating uniform DFAs should be straightforward without the reachability constraint. The one thing that could be problematic performance-wise is how many DFAs you would need to skip before you found one with the reachability property. You should try this algorithm first, though, and see how long it ends up taking to generate an acceptable DFA.我们可以从 N 到 2N 之间的随机状态数 N1 开始。
假设初始状态为状态号1。
对于每个状态,对于输入字母表中的每个字符,我们生成一个随机转换(1 到 N1 之间)。
我们从初始状态开始获取连接自动机。我们检查状态的数量,经过几次尝试,我们得到了一个具有 N 个状态的状态。
如果我们也希望有一个最小自动机,则仅保留最终状态的分配,但是随机分配也很有可能获得最小自动机。
We can start with a random number of states N1 between N and 2N.
Assume the initial state the as the state number 1.
For each state, for each character in the input alphabet we generate a random transition (between 1 and N1).
We take the connex automaton starting from the initial state. We check the number of states, and after few tries we get one with N states.
If we wish a minimal automaton too, remains only the assignment of final states, however there are great chances that a random assignment gets a minimal automaton as well.
以下参考文献似乎与您的问题相关:
F. Bassino、J. David 和 C. Nicaud,可能不完整的确定性自动机的枚举和随机生成,纯数学和应用 19(2-3)(2009)1-16。
F. Bassino 和 C. Nicaud。可访问自动机的枚举和随机生成。 理论。比较。 Sc.。 381 (2007) 86-104。
The following references seem to be relevant to your question:
F. Bassino, J. David and C. Nicaud, Enumeration and random generation of possibly incomplete deterministic automata, Pure Mathematics and Applications 19 (2-3) (2009) 1-16.
F. Bassino and C. Nicaud. Enumeration and Random Generation of Accessible Automata. Theor. Comp. Sc.. 381 (2007) 86-104.