当前位置：文江博客话题详情

DFA 最小化测试套件？

发布于 2024-12-27 08:46:22 字数 195 浏览 7 评论 0原文

我正在寻找确定性有限自动机的测试套件，用于测试 DFA 最小化算法的正确性。你能给我一些指点吗？或者是否有可用的算法/实现来生成这样的自动机？

要赢得赏金，您需要提交包含 400 个或更多不同大小和复杂程度的非最小自动机的测试套件，其中至少 20 个包含超过 2000 个节点。

如果这不是问这个问题的正确地方，请引导我到一些更好的地方。谢谢。

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

时光礼记 2025-01-03 08:46:22

要测试正确性，您可以尝试将最小 DFA 转换为 OpenFst 格式，并使用等价操作。

回复收藏 0 原文

怪异←思 2025-01-03 08:46:22

测试最多 n 个状态和 m 个字母符号的“所有”DFA 是不可行的。您可以使用已知的最小 DFA 来测试 DFA；要获得（DFA，最小DFA）对，您可以生成随机RE，使用Kleene定理的算法获得NFA-lambda，使用子集构造获得DFA，然后使用已知的正确DFA最小化算法进行最小化（我假设您接受规范算法是正确的）。

编辑：

为了扩展我所说的内容，以下是我尝试生成非最小有限自动机测试套件的方法：

使用 N 操作（串联、并集、Kleene 闭包）生成正则表达式。
使用 Kleene 定理中的算法获得其中包含 O(n) 状态的 NFA-lambda。
使用子集/幂集构造来获取其中包含 O(2^n) 状态的 DFA。
重复此操作，直到找到足够数量且足够复杂的自动机。

生成正则表达式更容易。有一些规则：

a 是 RE 如果 a 是字母符号
(rs) 是 RE 如果 r, s 是 RE
(r+s) 是 RE 如果 r, s 是 RE
(r*) 是 RE 如果r 是 RE
没有其他东西是 RE

要通过 n 次操作获得 RE，可以使用递归方法。

GetRE(ops)
 1. if ops = 0 then return RandomAlphabetSymbol()
 2. select(Rand() % 3)
 3. case 0 then
 4.  ops1 = Rand() % (ops - 1)
 5.  ops2 = (ops - 1) - ops1
 6.  return "(" + GetRE(ops1) + "+" + GetRE(ops2) + ")"
 7. case 1 then
 8.  ops1 = Rand() % (ops - 1)
 9.  ops2 = (ops - 1) - ops1
10.  return "(" + GetRE(ops1) + "." + GetRE(ops2) + ")"
11. case 2 then
12.  return "(" + GetRE(ops - 1) + "*)"

您可能会发现非字符串表示（即分层链接结构，本质上是解析树本身）是应用 Kleene 算法获取 NFA-lambda 的更方便的选择。

Testing "all" DFAs up to n states and m alphabet symbols is infeasible. You could test DFAs with known minimal DFAs; to get (DFA, minimal DFA) pairs, you could generate random REs, get the NFA-lambda usin the algorithm from Kleene's theorem, get a DFA using the subset construction, then minimize with a known correct algorithm for DFA minimization (I assume you accept that the canonical algorithm is correct).

EDIT:

To expand on what I said, here's how I would try to generate a test suite of non-minimal finite automata:

Generate a regular expression using N operations (concatenation, union, Kleene closure).
Use the algorithm from Kleene's theorem to get a NFA-lambda with O(n) states in it.
Use the subset/powerset construction to get a DFA with O(2^n) states in it.
Repeat until you have found a sufficient number of sufficiently complex automata.

Generating the regular expressions is easier. There are a few rules:

a is an RE if a is an alphabet symbol
(rs) is an RE if r, s are REs
(r+s) is an RE if r, s are REs
(r*) is an RE if r is an RE
Nothing else is an RE

To get an RE with n operations, a recursive approach works.

GetRE(ops)
 1. if ops = 0 then return RandomAlphabetSymbol()
 2. select(Rand() % 3)
 3. case 0 then
 4.  ops1 = Rand() % (ops - 1)
 5.  ops2 = (ops - 1) - ops1
 6.  return "(" + GetRE(ops1) + "+" + GetRE(ops2) + ")"
 7. case 1 then
 8.  ops1 = Rand() % (ops - 1)
 9.  ops2 = (ops - 1) - ops1
10.  return "(" + GetRE(ops1) + "." + GetRE(ops2) + ")"
11. case 2 then
12.  return "(" + GetRE(ops - 1) + "*)"

You might find a non-string representation (I.e., a hierarchical linked structure, essentially the parse tree itself) is a more convenient option for applying Kleene's algorithm to get the NFA-lambda.

回复收藏 0 原文

~没有更多了~