Standard (simple?) benchmark code/tests?

Posted on 2024-11-15 06:54:01

Is there some kind of standard benchmarking system or outline? I am looking at Go, LLVM, D and other languages, and I want to know how they fare in execution time, memory usage, etc.

I found https://benchmarksgame-team.pages.debian.net/benchmarksgame/ but the code is NOT THE SAME across languages. For example, one C++ source is under 100 lines while the corresponding C source is over 650. I can hardly call that fair. Another test has the stupid mistake of putting a lock inside the loop, while the versions in other languages put it outside.
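To make the lock complaint concrete, here is a minimal sketch in Python (not code from the benchmarks game; the function names are made up for illustration). Both versions compute the same total, but one acquires the lock once per item and the other once for the whole loop. It runs single-threaded and only counts acquisitions; it makes no claim about how large the timing difference is in any particular language.

```python
import threading


def sum_lock_inside(values, lock):
    """Acquire the lock once per item (the pattern criticised above)."""
    total = 0
    acquisitions = 0
    for v in values:
        with lock:
            total += v
            acquisitions += 1
    return total, acquisitions


def sum_lock_outside(values, lock):
    """Acquire the lock once around the whole loop."""
    total = 0
    with lock:
        acquisitions = 1
        for v in values:
            total += v
    return total, acquisitions


if __name__ == "__main__":
    lock = threading.Lock()
    print(sum_lock_inside(range(1000), lock))   # (499500, 1000)
    print(sum_lock_outside(range(1000), lock))  # (499500, 1)
```

The two are only interchangeable when nothing else needs the lock while the loop runs, so whether the "inside" version is really a mistake depends on what the benchmark is meant to exercise.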

So I would like to know of some tests I might consider looking at or running that use no nonstandard or even complex libraries, ideally implemented completely inside a single source file. Something fair.

Comments (2)

唠甜嗑 2024-11-22 06:54:02

Benchmarking is not entirely about being fair - it's about choosing something for your own workload, within your constraints.

If you want to use the alioth shootout site, you can still get interesting information if you exclude solutions that are too verbose or too slow (the exact balance depends on what you want to do - do you write code that runs for five seconds, or code that will occupy a dozen computers for five months?). Look at the most concise examples for one particular problem to see the general problem structure - then see what typical optimizations people applied to make the code run faster.

Having a benchmark with THE SAME code misses the point, because different languages need different things to perform well. Java has GC, which means it will do well on the trees test, whereas in C/C++ you need custom memory allocation to compete with that (and that particular benchmark is structured so that standard malloc does really poorly). For the spectral-norm one, you need non-boxed double arrays...
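As a rough illustration of the boxed versus non-boxed point, here is a sketch in Python (chosen only for brevity; the answer itself is talking about other languages). A plain list holds pointers to separate float objects, while `array('d')` from the standard library stores raw 8-byte doubles contiguously. The helper names are assumptions for this example, and the byte counts depend on the interpreter.

```python
import sys
from array import array


def boxed_doubles(n):
    """A list of n separate float objects (boxed)."""
    return [float(i) for i in range(n)]


def unboxed_doubles(n):
    """A contiguous array of n raw C doubles (non-boxed)."""
    return array("d", (float(i) for i in range(n)))


def boxed_size(values):
    """Bytes used by the list's pointer storage plus every float object."""
    return sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)


if __name__ == "__main__":
    n = 1000
    boxed = boxed_doubles(n)
    unboxed = unboxed_doubles(n)
    print("boxed bytes:  ", boxed_size(boxed))
    print("unboxed bytes:", sys.getsizeof(unboxed))
```

Both containers hold the same values; only the representation differs, which is exactly the kind of detail a "same code everywhere" rule would hide.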

If you want to come up with your own solutions, have a go at Project Euler - there are a lot of problems that do not depend on complex libraries, yet are challenging to optimize. Otherwise, try to come up with scoring criteria that you consider adequate to filter or rank the existing contributions in the shootout (or outside it - for example, there are ShedSkin and Cython solutions to some of the problems, which are "unofficial" because these languages are not included).
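One way such scoring criteria could look, as a sketch only: drop entries whose source is longer than a limit or whose run time is more than some multiple of the fastest entry, then rank what is left by time. The `Entry` fields, the `rank` function, the thresholds and the sample entries are all invented placeholders, not measurements from the shootout.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    name: str          # hypothetical label for a submitted program
    seconds: float     # measured run time
    source_lines: int  # size of the source file


def rank(entries, max_lines, max_slowdown):
    """Filter out too-verbose and too-slow entries, then sort by run time."""
    if not entries:
        return []
    fastest = min(e.seconds for e in entries)
    kept = [
        e for e in entries
        if e.source_lines <= max_lines and e.seconds <= fastest * max_slowdown
    ]
    return sorted(kept, key=lambda e: e.seconds)


if __name__ == "__main__":
    # Placeholder numbers, made up purely to exercise the function.
    sample = [
        Entry("solution_a", 1.0, 700),
        Entry("solution_b", 2.0, 90),
        Entry("solution_c", 30.0, 40),
    ]
    for e in rank(sample, max_lines=200, max_slowdown=10):
        print(e.name, e.seconds, e.source_lines)
```

With these limits only `solution_b` survives; picking the limits is the actual judgement call, and it depends on your own workload.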

蝶舞 2024-11-22 06:54:01

For several years the benchmarks game website featured this on the Help page -

What does "not fair" mean? (A fable)

They raced up, and down, and around and around and around, and forwards and backwards and sideways and upside-down.

Cheetah's friends said "it's not fair" - everyone knows Cheetah is the fastest creature but the races are too long and Cheetah gets tired!

Falcon's friends said "it's not fair" - everyone knows Falcon is the fastest creature but Falcon doesn't walk very well, he soars across the sky!

Horse's friends said "it's not fair" - everyone knows Horse is the fastest creature but this is only a yearling, you must stop the races until a stallion takes part!

Man's friends said "it's not fair" - everyone knows that in the "real world" Man would use a motorbike, you must wait until Man has fueled and warmed up the engine!

Snail's friends said "it's not fair" - everyone knows that a creature should leave a slime trail, all those other creatures are cheating!

Dalmatian's tail was banging on the ground. Dalmatian panted and between breaths said "Look at that beautiful mountain, let's race to the top!"


At that time "it's not fair" comments were mostly special pleading intended to gain an advantage for programming language X to the disadvantage of programming language Y.

But the issues your question raises are a little different.

  1. Firstly, look at the n-body programs on the benchmarks game website. Even though the programs are written in different languages there's very little difference in the way the programs are coded.

    So far no one has found an effective way to make use of quad-core for this small n-body problem - so there are no special multi-core programs. The programs do not use non-standard or complex libraries. The programs are completely implemented inside a single source file.

  2. I said there's very little difference in the way the n-body programs are coded, but does that really mean the programs are the same? Soon after the project had been revived, 6 or 7 years ago, I remember an Ada programmer half-joking about comparing apples to oranges because the assembly language from the Ada programs wasn't the same as the assembly language from the C programs - so obviously like wasn't being compared to like :-)

    • On the one hand, the Ada source code would have to be written in a different way than the C source code was written, to make the Ada compiler produce the same assembly language as the C compiler produced.

    • On the other hand, if the assembly language produced by both compilers really was line-by-line the same, why would there be a performance difference?

    When there's very little difference in the way the programs are coded then at first glance the comparison appears to be fair, but forcing different languages to be coded like language X may favour language X.

  3. As Yannick Versley noted, the point of using a different language is the different approaches that language provides. In other words, there's more than one way to do the same thing.

    Look at the mandelbrot programs on the benchmarks game website - the simplest C program is half the size of the fastest C program; the simplest C program is sequential and uses doubles, while the fastest C program uses all 4 cores through OMP and GCC intrinsics.

    • Other languages take different approaches to use all 4 cores - does that mean we should only compare sequential programs and ignore the reality of multi-core computing?

    • Other language implementations may not provide an equivalent to GCC intrinsics - does that mean we should only compare programs that use doubles? But other language implementations take different approaches in the way they represent doubles - does that mean we should ignore all floating point programs?
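For a sense of what the "simplest, sequential, uses doubles" style of mandelbrot program looks like, here is a small sketch in Python. It is not one of the benchmarks game programs and does not follow that site's output format; the function names, the 50-iteration limit and the text rendering are assumptions made for illustration.

```python
def escape_iterations(cr, ci, max_iter=50):
    """Iterate z = z*z + c from z = 0 and return how many steps were taken
    before |z| exceeded 2; max_iter means the point never escaped."""
    zr = zi = 0.0
    for i in range(max_iter):
        if zr * zr + zi * zi > 4.0:
            return i
        zr, zi = zr * zr - zi * zi + cr, 2.0 * zr * zi + ci
    return max_iter


def render(width, height, max_iter=50):
    """Return rows of text covering real [-1.5, 0.5) and imaginary [-1, 1)."""
    rows = []
    for y in range(height):
        ci = 2.0 * y / height - 1.0
        row = ""
        for x in range(width):
            cr = 2.0 * x / width - 1.5
            inside = escape_iterations(cr, ci, max_iter) == max_iter
            row += "#" if inside else "."
        rows.append(row)
    return rows


if __name__ == "__main__":
    for line in render(60, 24):
        print(line)
```

Everything here is one plain loop over floating point values. Splitting the rows across cores or using vector intrinsics is where the longer, faster programs start to diverge from each other.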

The problem is that programming languages (and programming language implementations) differ more than apples and oranges do, but we still ask - Will my program be faster if I write it in language X? - and still wish for a simpler answer than - It depends how you write it!

The different tasks and different programs on the benchmarks game website show that some of the performance comparison answers are confusing and complicated - the details matter, a lot.
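If you would rather measure your own single-file programs, a tiny harness along these lines is one place to start. It is a sketch using only Python's standard library (`time.perf_counter` and `tracemalloc`); the `measure` helper and the two toy functions are hypothetical names for this example. It shows two ways of writing the same task so you can see for yourself how much "how you write it" matters on your machine.

```python
import time
import tracemalloc


def measure(func, *args, repeats=5):
    """Return (result, best_seconds, peak_bytes) for func(*args).

    Timing runs are done without tracing, because tracemalloc slows code down;
    peak memory comes from one extra traced run.
    """
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(*args)
        best = min(best, time.perf_counter() - start)
    tracemalloc.start()
    func(*args)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, best, peak


def squares_loop(n):
    """Sum of squares with an explicit loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def squares_list(n):
    """Same task, but building a full list first."""
    return sum([i * i for i in range(n)])


if __name__ == "__main__":
    for func in (squares_loop, squares_list):
        result, seconds, peak = measure(func, 100_000)
        print(f"{func.__name__}: result={result} best={seconds:.6f}s peak={peak}B")
```

No numbers are quoted here on purpose: they vary with interpreter, machine and input size, so run it on your own workload.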
