测试大型状态机有哪些策略?

发布于 2024-08-30 01:19:58 字数 1311 浏览 13 评论 0 原文

我继承了一个庞大且相当复杂的状态机。它有 31 种可能的状态,所有这些都是真正需要的(大型业务流程)。它具有以下输入:

  • Enum:当前状态(因此 0 -> 30)
  • Enum:源(当前只有 2 个条目)
  • Boolean:请求
  • Boolean:类型
  • Enum:状态(3 个状态)
  • Enum:处理(3 个状态)
  • Boolean:已完成

将其分解为单独的状态机似乎并不可行,因为每个状态都是不同的。我为最常见的输入编写了测试,每个输入一个测试,所有输入不变,除了状态。

[Subject("Application Process States")]
public class When_state_is_meeting2Requested : AppProcessBase
{
    Establish context = () =>
    {
        //Setup....
    };

    Because of = () => process.Load(jas, vac);

    It Current_node_should_be_meeting2Requested = () => process.CurrentNode.ShouldBeOfType<meetingRequestedNode>();
    It Can_move_to_clientDeclined = () => Check(process, process.clientDeclined);
    It Can_move_to_meeting1Arranged = () => Check(process, process.meeting1Arranged);
    It Can_move_to_meeting2Arranged = () => Check(process, process.meeting2Arranged);
    It Can_move_to_Reject = () => Check(process, process.Reject);
    It Cannot_move_to_any_other_state = () => AllOthersFalse(process);
}

没有人完全确定每种状态和一组输入的输出应该是什么。我已经开始为它编写测试。但是,我需要编写类似 4320 测试(30 * 2 * 2 * 2 * 3 * 3 * 2)的内容。

您对测试状态机有什么建议?


编辑:我正在研究所有建议,当我找到最有效的建议时,我会标记一个答案。

I inherited a large and fairly complex state machine. It has 31 possible states, all are really needed (big business process). It has the following inputs:

  • Enum: Current State (so 0 -> 30)
  • Enum: source (currently only 2 entries)
  • Boolean: Request
  • Boolean: Type
  • Enum: Status (3 states)
  • Enum: Handling (3 states)
  • Boolean: Completed

Breaking it into separate state machines doesn't seem feasible, as each state is distinct. I wrote tests for the most common inputs, with one test per input, all inputs constant, except for the State.

[Subject("Application Process States")]
public class When_state_is_meeting2Requested : AppProcessBase
{
    Establish context = () =>
    {
        //Setup....
    };

    Because of = () => process.Load(jas, vac);

    It Current_node_should_be_meeting2Requested = () => process.CurrentNode.ShouldBeOfType<meetingRequestedNode>();
    It Can_move_to_clientDeclined = () => Check(process, process.clientDeclined);
    It Can_move_to_meeting1Arranged = () => Check(process, process.meeting1Arranged);
    It Can_move_to_meeting2Arranged = () => Check(process, process.meeting2Arranged);
    It Can_move_to_Reject = () => Check(process, process.Reject);
    It Cannot_move_to_any_other_state = () => AllOthersFalse(process);
}

No one is entirely sure what the output should be for each state and set of inputs. I have started to write tests for it. However, I'll need to write something like 4320 tests (30 * 2 * 2 * 2 * 3 * 3 * 2).

What suggestions do you have for testing state machines?


Edit: I am playing with all of the suggestions, and will mark an answer when I find one that works best.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(9

北城挽邺 2024-09-06 01:19:58

我看到了问题所在,但我肯定会尝试将逻辑分开。

我眼中最大的问题是:

  • 它有 31 种可能的状态。
  • 它具有以下输入:
    • 枚举:当前状态(因此 0 -> 30)
    • 枚举:来源(当前只有 2 个条目)
    • 布尔值:请求
    • 布尔值:类型
    • 枚举:状态(3 个状态)
    • 枚举:处理(3 种状态)
    • 布尔值:已完成

有发生的事情太多了。输入使代码难以测试。您说过将其分成更易于管理的区域会很痛苦,但是在运行中测试这么多逻辑同样甚至更痛苦。就您而言,每个单元测试涵盖的内容太多了。

这个我问的有关测试大型方法的问题本质上是相似的,我发现我的单位是实在是太大了。您最终仍会进行许多测试,但它们会更小、更易于管理,覆盖的范围也更少。但这只能是一件好事。

测试遗留代码

查看Pex。您声称您继承了此代码,因此这实际上不是测试驱动开发。您只需要单元测试来涵盖每个方面。这是一件好事,因为任何进一步的工作都将得到验证。我个人还没有正确使用 Pex,但是我看到的视频让我惊叹不已。本质上,它将根据输入生成单元测试,在本例中,输入是有限状态机本身。它将生成您没有足够考虑的测试用例。当然这不是 TDD,但在这种情况下,测试遗留代码,它应该是理想的。

一旦获得测试覆盖率,您就可以开始重构或添加新功能,并确保良好的测试覆盖率的安全性,以确保不会破坏任何现有功能。

I see the problem, but I'd definitely try splitting the logic out.

The big problem area in my eyes is:

  • It has 31 possible states to be in.
  • It has the following inputs:
    • Enum: Current State (so 0 -> 30)
    • Enum: source (currently only 2 entries)
    • Boolean: Request
    • Boolean: type
    • Enum: Status (3 states)
    • Enum: Handling (3 states)
    • Boolean: Completed

There is just far too much going on. The input is making the code hard to test. You've said it would be painful to split this up into more manageable areas, but it's equally if not more painful to test this much logic in on go. In your case, each unit test covers far too much ground.

This question I asked about testing large methods is similar in nature, I found my units were simply too big. You'll still end up with many tests, but they'll be smaller and more manageable, covering less ground. This can only be a good thing though.

Testing Legacy Code

Check out Pex. You claim you inherited this code, so this is not actually Test-Driven-Development. You simply want unit tests to cover each aspect. This is a good thing, as any further work will be validated. I've personally not used Pex properly yet, however I was wowed by the video I saw. Essentially it will generate unit tests based on the input, which in this case would be the finite state machine itself. It will generate test cases you will not have enough thought of. Granted this is not TDD, but in this scenario, testing legacy code, it should be ideal.

Once you have your test coverage, you can begin refactoring, or adding new features with the safety of good test coverage to ensure you don't break any existing functionality.

许久 2024-09-06 01:19:58

全对测试

为了限制要测试的组合数量并合理地确保您覆盖了最重要的组合,您应该考虑一下全对测试。

全对测试背后的推理
是这样的:a中最简单的错误
程序通常由一个
单个输入参数。下一个
最简单的错误类别包括
那些依赖于相互作用的
参数对之间,可以
被全对测试发现。1
涉及交互的错误
三个或更多参数是
逐渐不常见2,同时
同时逐渐更多
通过详尽的查找很昂贵
测试,其限制是
对所有可能的情况进行详尽的测试
输入。

另请参阅此处的先前答案(无耻插件)了解更多信息以及全对和全对的链接。 pict 作为工具。

Pict 模型文件示例

给定模型生成 93 个测试用例,涵盖所有输入参数对。

#
# This is a PICT  model for testing a complex state machine at work 
#

CurrentState  :0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
Source        :1,2
Request       :True, False
Type          :True, False
Status        :State1, State2, State3
Handling      :State1, State2, State3
Completed     :True,False

#
# One can add constraints to the model to exclude impossible 
# combinations if needed.
#
# For example:
# IF [Completed]="True" THEN CurrentState>15;
#

#
# This is the PICT output of "pict ComplexStateMachine.pict /s /r1"
#
# Combinations:    515
# Generated tests: 93

All-Pair Testing

To constraint the amount of combinations to test and to be reasonable assured you have most important combinations covered, you should take a look at all-pair testing.

the reasoning behind all-pairs testing
is this: the simplest bugs in a
program are generally triggered by a
single input parameter. The next
simplest category of bugs consists of
those dependent on interactions
between pairs of parameters, which can
be caught with all-pairs testing.1
Bugs involving interactions between
three or more parameters are
progressively less common2, whilst at
the same time being progressively more
expensive to find by exhaustive
testing, which has as its limit the
exhaustive testing of all possible
inputs.

Also take a look at a previous answer here (shameless plug) for additional information and links to both all-pair & pict as tool.

Example Pict model file

Given model generates 93 testcases, covering all pairs of input parameters.

#
# This is a PICT  model for testing a complex state machine at work 
#

CurrentState  :0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
Source        :1,2
Request       :True, False
Type          :True, False
Status        :State1, State2, State3
Handling      :State1, State2, State3
Completed     :True,False

#
# One can add constraints to the model to exclude impossible 
# combinations if needed.
#
# For example:
# IF [Completed]="True" THEN CurrentState>15;
#

#
# This is the PICT output of "pict ComplexStateMachine.pict /s /r1"
#
# Combinations:    515
# Generated tests: 93
千纸鹤带着心事 2024-09-06 01:19:58

我想不出任何简单的方法来测试这样的 FSM,而不需要变得非常迂腐并使用证明、使用机器学习技术或蛮力。

蛮力:
编写一个程序,以某种声明性方式生成所有 4320 个测试用例,其中大部分数据都是不正确的。我建议将其放入 CSV 文件中,然后使用 NUnits 参数测试之类的东西来加载所有测试用例。
现在,大多数测试用例都会失败,因此您必须手动更新声明性文件以使其正确,并随机抽取测试用例样本进行修复。

机器学习技术:
您可以使用一些向量机或 MDA 算法/启发式方法来尝试从我们上面提到的示例中学习,并教您的 ML 程序您的 FSM。然后对所有 4320 个输入运行该算法,看看两者不一致的地方。

I can't think of any easy way to do test an FSM like this with out getting really pedantic and employing proofs, using machine learning techniques, or brute force.

Brute force:
Write a something that will generate all the 4320 test cases in some declarative manner with mostly incorrect data. I would recommend putting this in a CSV file and then use something like NUnits parameteric testing to load all the test cases.
Now most of these test cases will fail so you will have to update the declarative file manually to be correct and take just a sample of the test cases randomly to fix.

Machine Learning technique:
You could employ some Vector machines or MDA algorithms/heuristics to try to learn on the sample you took from what we mentioned above and teach your ML program your FSM. Then run the algorithm on all the 4320 inputs and see where the two disagree.

梦在深巷 2024-09-06 01:19:58

您认为需要多少次测试才能“完全”测试函数 sum(int a, int b) ?在 c# 中,它会类似于 18446744056529682436 测试......比你的情况更糟糕。

我建议如下:

  1. 测试最可能的情况,
    边界条件。
  2. 测试 SUT 的一些关键部分
    分别地。
  3. 当 QA 发现错误时添加测试用例
    或在生产中。

在这种特殊情况下,最好的方法是测试系统如何从一种状态切换到另一种状态。创建 DSL 来测试状态机并使用它实现最常见的用例。例如:

Start
  .UploadDocument("name1")
  .SendDocumentOnReviewTo("user1")
  .Review()
  .RejectWithComment("Not enough details")
  .AssertIsCompleted()

为流程创建简单测试的示例如下: http://slmoloch.blogspot.com/2009/12/design-of-selenium-tests-for-aspnet_09.html

How many test do you think is needed to "completely" test function sum(int a, int b)? In c# it would be something like 18446744056529682436 tests... Much worse than in your case.

I would suggest following:

  1. Test most possible situations,
    boundary conditions.
  2. Test some critical parts of your SUT
    separately.
  3. Add test cases when bugs found by QA
    or in production.

In this particular case the best way is to test how system switches from one state to onother. Create DSL to test state machine and implement most frequent use cases using it. For Example:

Start
  .UploadDocument("name1")
  .SendDocumentOnReviewTo("user1")
  .Review()
  .RejectWithComment("Not enough details")
  .AssertIsCompleted()

The example of creating simple tests for flows is here: http://slmoloch.blogspot.com/2009/12/design-of-selenium-tests-for-aspnet_09.html

江湖彼岸 2024-09-06 01:19:58

我为一台医疗设备构建了一个有限状态机。
FSM 可以通过我定义的 XML 格式进行配置。

要定义状态机,必须依赖于使用状态图的数字电路设计经验,

您必须使用我所说的收费公路转换图。在美国东海岸,大多数高速公路都被称为收费公路。收费公路当局发布收费公路收费地图。如果收费路段有 50 个出口,则定价地图将具有 50 行 x 50 列的表,以行和列的形式详尽地列出出口。要找出进入 20 号出口和离开 30 号出口的通行费,您只需查找第 20 行和第 30 列的交点。

对于 30 个状态的状态机,收费公路转换图将是一个 30 x 30 矩阵,列出了所有30 种可能的行和列状态。让我们决定行为当前状态,列为下一个状态。

每个相交的单元格都会列出从当前状态(行)转换到下一个状态(列)的“价格”。然而,单元格将引用输入表中的一行,而不是单个 $ 值,我们可以将其称为转换 id。

在我开发的医疗设备 FSM 中,输入有字符串、枚举、整数等。输入表按列列出了这些输入刺激。

要构建输入表,您需要编写一个简单的例程来列出所有可能的输入组合。但桌子会很大。在您的情况下,该表将有 4320 行,因此有 4320 个转换 id。但它并不是一个乏味的表,因为您是通过编程生成该表的。就我而言,我编写了一个简单的 JSP 来在浏览器上列出转换输入表(和收费公路表),或者下载为 csv 以在 MS Excel 中显示。

构建这两个表有两个方向。

  1. 设计方向,在其中构建收费公路表中所有可能的过渡,将不可到达的过渡灰显。然后仅针对每个可达转换构建所有预期输入的输入表,并将行号作为转换 ID。每个转换 ID 都被转录到收费公路转换地图的相应单元格上。然而,由于 FSM 是稀疏矩阵,因此并非所有转换 id 都会在收费公路转换图的单元格中使用。此外,一个转换 ID 可以多次使用,因为相同的转换条件可以应用于多对状态更改。

  2. 测试方向是相反的,您可以在其中构建输入表。
    您必须为详尽的转换测试编写通用例程。
    该例程将首先读取转换排序表以使状态机进入入口点状态以开始测试循环。在每个当前状态,它都准备好运行所有 4320 个转换 ID。在 Turnpike 转换图中的每一行 CURRENT 状态上,有效 NEXT 状态的列数有限。

您可能希望例程循环遍历从输入表中读取的所有 4320 行输入,以确保未使用的转换 id 对当前状态没有影响。您想要测试所有有效的转换 ID 是否都是有效的转换。

但你不能 - 因为一旦注入有效的转换,它会将机器的状态更改为 NEXT 状态,并阻止你完成对先前 CURRENT 状态的其余转换 id 的测试。一旦机器改变状态,你必须再次从转换 ID 0 开始测试。

过渡路径可以是循环的或不可逆的,或者沿着路径具有循环和不可逆部分的组合。

在测试例程中,每个状态都需要一个寄存器来记住注入该状态的最后一个转换 ID。每当测试达到有效的转换 ID 时,该转换 ID 就会保留在该寄存器中。这样,当您完成一个循环并返回到已遍历的状态时,您将开始迭代大于寄存器中存储的转换 ID 的下一个转换 ID。

您的例程必须处理转换路径的不可逆部分,当机器进入最终状态时,它会从入口点状态重新启动测试,并重申下一个转换 ID 大于存储的输入的 4320 个输入对于一个国家。这样,你就能够详尽地发现机器所有可能的转移路径。

幸运的是,FSM 是有效转换的稀疏矩阵,因为详尽的测试不会消耗转换 ID 数 x 可能状态数平方的完整组合。然而,如果您正在处理旧版 FSM,其中视觉或温度状态无法反馈到测试系统中,而您必须以视觉方式监控每个状态,则会出现困难。这会很丑陋,但我们仍然花了两周的时间对设备进行了视觉上的额外测试,仅进行了有效的转换。

如果您的 FSM 允许您通过简单的重置到达入口点并应用转换 ID 就可以了,那么您可能不需要转换排序表(对于要读取的测试例程的每个入口点状态,以将机器带到所需的入口点)到入口点状态。但是,让您的例程能够读取转换排序表是很有用的,因为您经常需要进入状态网络的中间并从那里开始测试。

您应该熟悉转换和状态映射的使用,因为它对于检测机器的所有可能和未记录的状态非常有用,并且如果用户确实希望它们变灰(转换无效且状态无法访问),则采访用户。

我的优势是它是一个新设备,我可以选择设计状态机控制器来读取 xml 文件,这意味着我可以按照我想要的方式更改状态机的行为,实际上是客户想要的方式,并且我确信未使用的转换 ID 确实没有效果。

对于有限状态机控制器的 java 列表 http://code.google.com/p/synthfuljava/source/browse/#svn/trunk/xml/org/synthful。不包括测试例程。

I had constructed a finite state machine for a piece of medical equipment.
The FSM was configurable through an XML format I had defined.

To define a state-machine, one has to rely on experience on digital circuit designs of using state maps,

You have to use what I term as a turnpike transition map. In the United States East Coast, most highways are nicknamed turnpikes. Turnpike authorities issue a turnpike toll pricing map. If a toll section had 50 exits, the pricing map would have a 50rows x 50cols table, listing the exits exhaustively as both rows and columns. To find out the toll charge for entering exit 20 and exiting exit 30, you simply look for the intersect of row 20 and column 30.

For a state machine of 30 states, the turnpike transition map would be a 30 x 30 matrix listing all the 30 possible states row and column wise. Let us decide the rows to be CURRENT states and columns to be NEXT states.

Each intersecting cell would list the "price" of transitioning from a CURRENT state(row) to a NEXT state(col). However instead of a single $ value, the cell would refer to a row in the Inputs table, which we could term as the transition id.

In the medical equipment FSM I developed, there were inputs that are String, enums, int, etc. The Inputs table listed these input stimulus column-wise.

To construct the Inputs table, you would write a simple routine to list all possible combinations of inputs. But the table would be huge. In your case, the table would have 4320 rows and hence 4320 transition ids. But its not a tedious table because you generated the table programmatically. In my case, I wrote a simple JSP to list the transitions input table (and the turnpike table) on browser or download as csv to be displayed in MS Excel.

There are two directions in constructing these two tables.

  1. the design direction, where you construct the turnpike table all possible transitions, graying out non-reachable transitions. Then consturct the Inputs table of all expected inputs for each reachable transition only, with the row number as transition id. Each transition id is transcribed onto the respective cell of the turnpike transition map. However, since the FSM is a sparse matrix, not all transition ids will be used in the cells of the turnpike transition map. Also, a transition id can be used many multiple times because the same transition conditions can apply to more than one pair of state change.

  2. the test direction is reverse, where you construct the Inputs table.
    You have to write a general routine for the exhaustive transition test.
    The routine would first read a transition sequencing table to bring the state-machine it to an entrypoint state to start a test cycle. At each CURRENT state, it is poised to run through all 4320 transition ids. On each row of CURRENT states in the Turnpike transition map, there would be a limited number of columns valid NEXT states.

You would want the routine to cycle thro all 4320 rows of inputs that it reads from the Inputs table, to ensure unused transition ids have no effect on a CURRENT state. You want to test that all effectual transition ids are valid transitions.

But you cannot - because once an effectual transition is pumped in, it would change the state of the machine into a NEXT state and prevent you from completing testing the rest of the transition ids on that previous CURRENT state. Once the machine changes state, you have to start testing from transition id 0 again.

Transition paths can be cyclical or irreversible or having combination of cyclical and irreversible sections along the path.

Within your test routine, you need a register for each state to memorise the last transition id pumped into that state. Everytime the test reaches an effectual transition id, that transition id is left in that register. So that when you complete a cycle and return to an already traversed state, you start iterating on next transition id greater than the one stored in the register.

Your routine would have to take care of the irreversible sections of a transition path, wheb a machine is brought to a final state, it restarts the test from the entry point state, reiterating the 4320 inputs from the next transition id greater than the one stored for a state. In this way, you would be able to exhaustively discover all the possible transition paths of the machine.

Fortunately, FSMs are sparse matrices of effectual transitions because exhaustive testing would not consume the complete combination of number of transition ids x number of possible states squared. However, the difficulty occurs if you are dealing with a legacy FSM where visual or temperature states cannot be fed back into the test system, where you have to monitor each state visually. That would be ugly, but still we spent two weeks additionally testing the equipment visually going through only the effectual transitions.

You may not need a transition sequencing table (for each entry point state for the test routine to read to bring the machine to a desired entrypoint) if your FSM allows you to reach an entrypoint with a simple reset and applying a transition id would simply it to an entrypoint state. But having your routine capable of reading a transition sequencing table is useful because frequently, you would need to go into the midst of the state network and start your testing from there.

You should acquaint yourself with the use of transition and state maps because it is very useful to detect all the possible and undocumented states of a machine and interview users if they actually wanted them grayed out (transitions made ineffectual and states made unreachable).

The advantage I had was that it was a new piece of equipment and I had the choice to design the state-machine controller to read xml files which means I could change the behaviour of the state machine anyway I wanted, actually anyway the customer wanted and I was able to assure that unused transition ids were really ineffectual.

For the java listing of the finite state machine controller http://code.google.com/p/synthfuljava/source/browse/#svn/trunk/xml/org/synthful. Test routines not included.

↙温凉少女 2024-09-06 01:19:58

根据要求进行测试。如果每当 Completed 为真时,需要某个状态移动到某个其他状态,则编写一个测试,自动循环其他输入的所有组合(这应该只是几个 for 循环),以证明其他输入是正确地忽略了。您最终应该对每个过渡弧进行一次测试,我估计大约是 100 或 150 次测试,而不是 4000 次。

Test based on the requirements. If a certain state is required to move to a certain other state whenever Completed is true, then write a test that automatically cycles through all the combinations of the other inputs (this should just be a couple for loops) to prove that the other inputs are correctly ignored. You should end up with one test for each transition arc, which I'd estimate would be somewhere on the order of 100 or 150 tests, not 4000.

烟沫凡尘 2024-09-06 01:19:58

您可以考虑研究基于模型的测试。在这种情况下,有一些工具可以帮助生成测试。我通常推荐MBT

You might consider investigating Model Based Testing. There are a few tools available to help with test generation in situations like this. I usually recommend MBT.

枕花眠 2024-09-06 01:19:58

暴力覆盖测试似乎只是一个开始。

Brute force with coverage tests seems to be a very beginning.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文