How do you test an interpreter or compiler?

Posted on 2024-07-16 03:43:54

I've been experimenting with creating an interpreter for Brainfuck, and while quite simple to make and get up and running, part of me wants to be able to run tests against it. I can't seem to fathom how many tests one might have to write to test all the possible instruction combinations to ensure that the implementation is proper.

Obviously, with Brainfuck, the instruction set is small, but I can't help but think that as more instructions are added, your test code would grow exponentially. More so than your typical tests at any rate.

Now, I'm about as newbie as you can get in terms of writing compilers and interpreters, so my assumptions could very well be way off base.

Basically, where do you even begin with testing on something like this?

Comments (6)

雨夜星沙 2024-07-23 03:43:54

Testing a compiler is a little different from testing some other kinds of apps, because it's OK for the compiler to produce different assembly-code versions of a program as long as they all do the right thing. However, if you're just testing an interpreter, it's pretty much the same as any other text-based application. Here is a Unix-centric view:

  1. You will want to build up a regression test suite. Each test should have
    • Source code you will interpret, say test001.bf
    • Standard input to the program you will interpret, say test001.0
    • What you expect the interpreter to produce on standard output, say test001.1
    • What you expect the interpreter to produce on standard error, say test001.2 (you care about standard error because you want to test your interpreter's error messages)
  2. You will need a "run test" script that does something like the following

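    # Report which stream differed, show the diff, and stop.
    fail() {
      echo "Unexpected differences on $1:"
      diff "$2" "$3"
      exit 1
    }
    
    # Test names are given as command-line arguments.
    for testname
    do
      # mktemp stands in for the Debian-specific tempfile command
      tmp1=$(mktemp)
      tmp2=$(mktemp)
      brainfuck "$testname.bf" < "$testname.0" > "$tmp1" 2> "$tmp2"
      # cmp is run directly; wrapping it in [ ] would not execute it
      cmp -s "$testname.1" "$tmp1" || fail "stdout" "$testname.1" "$tmp1"
      cmp -s "$testname.2" "$tmp2" || fail "stderr" "$testname.2" "$tmp2"
      rm -f "$tmp1" "$tmp2"
    done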
    
  3. You will find it helpful to have a "create test" script that does something like

    brainfuck "$testname.bf" < "$testname.0" > "$testname.1" 2> "$testname.2"
    

    You run this only when you're totally confident that the interpreter works for that case.

  4. You keep your test suite under source control.

  5. It's convenient to embellish your test script so you can leave out files that are expected to be empty.

  6. Any time anything changes, you re-run all the tests. You probably also re-run them all nightly via a cron job.

  7. Finally, you want to add enough tests to get good test coverage of your compiler's source code. The quality of coverage tools varies widely, but GNU Gcov is an adequate coverage tool.
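
Point 5 (leaving out expected files that would be empty) could be handled roughly as below. This is only a sketch: the helper names `expected_or_empty` and `matches_expected` are made up for illustration, and it assumes a missing `.1` or `.2` file means "this stream should produce nothing".

```shell
#!/bin/sh
# Hypothetical helper: print the path of an expected-output file,
# or /dev/null when that file was deliberately left out.
expected_or_empty() {
  if [ -f "$1" ]; then
    printf '%s\n' "$1"
  else
    printf '%s\n' /dev/null
  fi
}

# Hypothetical helper: succeed when the actual output ($2) matches the
# expectation ($1). A missing expectation file is treated as "empty".
matches_expected() {
  cmp -s "$(expected_or_empty "$1")" "$2"
}
```

In a run-test loop, a check such as `cmp -s "$testname.2" "$tmp2"` would then become `matches_expected "$testname.2" "$tmp2"`, so a test that expects no error output needs no `.2` file at all.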

Good luck with your interpreter! If you want to see a lovingly crafted but not very well documented testing infrastructure, go look at the test2 directory for the Quick C-- compiler.

空气里的味道 2024-07-23 03:43:54

I don't think there's anything 'special' about testing a compiler; in a sense it's almost easier than testing some programs, since a compiler has such a basic high-level summary - you hand in source, it gives you back (possibly) compiled code and (possibly) a set of diagnostic messages.

Like any complex software entity, there will be many code paths, but since it's all very data-oriented (text in, text and bytes out) it's straightforward to author tests.

凉宸 2024-07-23 03:43:54

I’ve written an article on compiler testing, the original conclusion of which (slightly toned down for publication) was: It’s morally wrong to reinvent the wheel. Unless you already know all about the preexisting solutions and have a very good reason for ignoring them, you should start by looking at the tools that already exist. The easiest place to start is Gnu C Torture, but bear in mind that it’s based on Deja Gnu, which has, shall we say, issues. (It took me six attempts even to get the maintainer to allow a critical bug report about the Hello World example onto the mailing list.)

I’ll immodestly suggest that you look at the following as a starting place for tools to investigate:

  1. Software: Practice and Experience, April 2007. (Payware, not available to the general public; free preprint at http://pobox.com/~flash/Practical_Testing_of_C99.pdf.)

  2. http://en.wikipedia.org/wiki/Compiler_correctness#Testing (Largely written by me.)

  3. Compiler testing bibliography (Please let me know of any updates I’ve missed.)

和我恋爱吧 2024-07-23 03:43:54

In the case of Brainfuck, I think testing it should be done with Brainfuck scripts. I would test the following, though:

1: Are all the cells initialized to 0?

2: What happens when you decrement the data pointer while it's pointing to the first cell? Does it wrap? Does it point to invalid memory?

3: What happens when you increment the data pointer while it's pointing at the last cell? Does it wrap? Does it point to invalid memory?

4: Does output function correctly?

5: Does input function correctly?

6: Does the [ ] stuff work correctly?

7: What happens when you increment a byte more than 255 times? Does it wrap to 0 properly, or is it incorrectly treated as an integer or other value?

More tests are possible too, but this is probably where I'd start. I wrote a BF compiler a few years ago, and that had a few extra tests. In particular, I tested the [ ] stuff heavily by putting a lot of code inside the block, since an early version of my code generator had issues there (on x86 using a jxx, I had issues when the block produced more than 128 bytes or so of code, resulting in invalid x86 asm).
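
To make checks 1 and 7 concrete, here is a rough sketch of a script that writes fixture files using the test001.bf / .0 / .1 / .2 naming from the first answer. The function names and the fixture names `init_zero` and `cell_wrap` are invented for illustration, and the expected outputs assume 8-bit cells that wrap from 255 to 0; an interpreter with different cell semantics would need different expectations.

```shell
#!/bin/sh
# Print N '+' characters (N is a positive integer).
plus() {
  printf "%${1}s" '' | tr ' ' '+'
}

# Write one fixture set: base name, program text, expected stdout.
make_fixture() {
  printf '%s\n' "$2" > "$1.bf"   # trailing newline is ignored by Brainfuck
  : > "$1.0"                     # no standard input
  printf '%s' "$3" > "$1.1"      # expected standard output
  : > "$1.2"                     # expected standard error: empty
}

# Both programs end with the same probe: if the current cell is
# non-zero, clear it and set the next cell to 1; then move right,
# add 48 (ASCII '0') and print. Output is "0" when the probed cell
# was zero and "1" otherwise.
write_fixtures() {
  probe="[[-]>+<]>$(plus 48)."
  # Check 1: the first cell should start at zero.
  make_fixture "$1/init_zero" "$probe" "0"
  # Check 7: 256 increments should wrap back to zero
  # (assumption: 8-bit wrapping cells).
  make_fixture "$1/cell_wrap" "$(plus 256)$probe" "0"
}
```

Calling `write_fixtures tests` (with an existing `tests` directory) would produce eight files that a run-test script can pick up. The probe uses a loop rather than printing the cell directly, because printing a 16-bit cell holding 256 would typically truncate to the same byte as a wrapped cell.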

你的笑 2024-07-23 03:43:54

You can test with some already written apps.

痴梦一场 2024-07-23 03:43:54

The secret is to:

  • Separate the concerns
  • Observe the law of Demeter
  • Inject your dependencies

Well, software that is hard to test is a sign that the developer wrote it like it's 1985. Sorry to say that, but using the three principles I presented here, even line-numbered BASIC would be unit testable (it IS possible to inject dependencies into BASIC, because you can do "goto variable").
