Standard (simple?) benchmark code/tests?

Posted 2024-11-15 06:54:01

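Is there some kind of standard benchmarking system or outline? I'm looking at Go, LLVM, D and other languages, and I'd like to know how they fare in execution time, memory usage and so on.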

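I found https://benchmarksgame-team.pages.debian.net/benchmarksgame/ but the code isn't the same across languages. One example: the C++ source is under 100 lines while the C source is over 650. I can hardly call that fair. Another test has a silly mistake in its source: it takes a lock inside the loop, while the other languages take it outside the loop.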
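To make the lock complaint concrete, here is a rough sketch of the two patterns. It is written in Python only for brevity, it is not the benchmark's actual code, and the function names are made up for illustration.

```python
import threading


def sum_lock_inside(values, lock):
    """Acquire the lock once per element (the pattern complained about above)."""
    total = 0
    acquisitions = 0
    for v in values:
        with lock:
            total += v
            acquisitions += 1
    return total, acquisitions


def sum_lock_outside(values, lock):
    """Acquire the lock once around the whole loop."""
    total = 0
    with lock:
        for v in values:
            total += v
    return total, 1


lock = threading.Lock()
data = list(range(1000))
inside = sum_lock_inside(data, lock)
outside = sum_lock_outside(data, lock)
```

Both versions compute the same sum, but the first pays for 1000 lock acquisitions and the second for one, so timing them against each other says more about lock placement than about the language.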

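So I'd like to know of some tests I could look at or run, preferably ones that don't use non-standard or even complex libraries. Something implemented entirely in a single source file. Something fair.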

唠甜嗑 2024-11-22 06:54:02

Benchmarking isn't exactly about being fair. It's about choosing something for your own workload, within your own constraints.
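As a minimal sketch of measuring your own workload, something like the following could be a starting point. The `measure` helper and the `sum_of_squares` stand-in workload are hypothetical names chosen for this example; only `time.perf_counter` and `tracemalloc` come from the Python standard library.

```python
import time
import tracemalloc


def measure(workload, *args, repeats=5):
    """Return (best wall-clock seconds, peak traced bytes, result) for workload(*args).

    Assumes repeats >= 1.
    """
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = workload(*args)
        best = min(best, time.perf_counter() - start)

    # Trace memory in a separate run, because tracing slows the code down.
    tracemalloc.start()
    workload(*args)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return best, peak, result


def sum_of_squares(n):
    # Stand-in workload; replace it with something that resembles your real job.
    return sum(i * i for i in range(n))


seconds, peak_bytes, value = measure(sum_of_squares, 10_000)
```

Taking the best of several runs reduces noise from other processes. Note that `tracemalloc` only sees allocations made through Python's allocator, so it is a rough indicator rather than a full memory profile.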

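If you want to use the alioth shootout site, you can still get interesting information if you exclude solutions that are too verbose or too slow (the exact balance depends on what you want to do: are you writing code that runs for five seconds, or code that will occupy a dozen computers for five months?). Look at the most concise example for a given problem to understand the general structure of the problem, then see which typical optimizations people applied to make the code run faster.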

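Benchmarking with identical code misses the point, because different languages need different things to do well. Java has a GC, which means it does well on the trees test, whereas in C/C++ you need custom memory allocation to compete with it (and that particular benchmark is structured so that standard malloc does really badly). For spectral-norm you need unboxed double arrays...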
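The boxed versus unboxed distinction can be sketched in Python, assuming CPython's object layout; the sizes are implementation details, and the variable names are made up for this example. A `list` of floats holds pointers to separate float objects, while `array("d")` stores raw C doubles contiguously.

```python
import sys
from array import array

n = 1000

# Boxed: a list of pointers, each pointing to a separate float object.
boxed = [float(i) for i in range(n)]

# Unboxed: one contiguous buffer of C doubles.
unboxed = array("d", boxed)

boxed_bytes = sys.getsizeof(boxed) + sum(sys.getsizeof(x) for x in boxed)
unboxed_bytes = sys.getsizeof(unboxed)

print(f"boxed:   {boxed_bytes} bytes")
print(f"unboxed: {unboxed_bytes} bytes")
```

The two containers hold the same values, but the unboxed one uses far less memory and keeps the numbers next to each other, which is the kind of representation difference that decides a numeric benchmark.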

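If you want to come up with your own solutions, try Project Euler: there are many problems that don't depend on complex libraries but are hard to optimize. Otherwise, try to come up with scoring criteria that you think are adequate for filtering or ranking the existing contributions in (or outside) the shootout. For example, for some problems there are ShedSkin and Cython solutions that are "unofficial" because those languages are not included.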
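One possible shape for such scoring criteria, as a sketch only: drop entries that are too slow or too verbose, then rank the rest by a weighted mix of relative run time and relative source size. The `Entry` class, the `rank` function, the weights and all the numbers below are invented for illustration and are not taken from the shootout.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    language: str
    seconds: float      # measured run time, assumed > 0
    source_bytes: int   # source size, used as a proxy for verbosity, assumed > 0


def rank(entries, max_seconds, max_source_bytes, time_weight=0.7):
    """Filter out entries over either limit, then sort by a weighted relative score."""
    kept = [
        e for e in entries
        if e.seconds <= max_seconds and e.source_bytes <= max_source_bytes
    ]
    if not kept:
        return []
    fastest = min(e.seconds for e in kept)
    smallest = min(e.source_bytes for e in kept)

    def score(e):
        # 1.0 is the best possible score: fastest and smallest at once.
        return (time_weight * e.seconds / fastest
                + (1 - time_weight) * e.source_bytes / smallest)

    return sorted(kept, key=score)


# Made-up numbers, for illustration only.
entries = [
    Entry("A", 2.0, 4000),
    Entry("B", 1.0, 9000),   # fastest, but over the size limit
    Entry("C", 30.0, 500),   # smallest, but over the time limit
    Entry("D", 1.5, 1000),
]
ranking = [e.language for e in rank(entries, max_seconds=10.0, max_source_bytes=8000)]
print(ranking)
```

The limits and `time_weight` are exactly the "balance" that depends on what you want to do: a five-second script and a five-month cluster job would call for very different values.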

蝶舞 2024-11-22 06:54:01

For several years the benchmarks game website featured this on the Help page -

What does "not fair" mean? (A fable)

They raced up, and down, and around and around and around, and forwards and backwards and sideways and upside-down.

Cheetah's friends said "it's not fair" - everyone knows Cheetah is the fastest creature but the races are too long and Cheetah gets tired!

Falcon's friends said "it's not fair" - everyone knows Falcon is the fastest creature but Falcon doesn't walk very well, he soars across the sky!

Horse's friends said "it's not fair" - everyone knows Horse is the fastest creature but this is only a yearling, you must stop the races until a stallion takes part!

Man's friends said "it's not fair" - everyone knows that in the "real world" Man would use a motorbike, you must wait until Man has fueled and warmed up the engine!

Snail's friends said "it's not fair" - everyone knows that a creature should leave a slime trail, all those other creatures are cheating!

Dalmatian's tail was banging on the ground. Dalmatian panted and between breaths said "Look at that beautiful mountain, let's race to the top!"

At that time "it's not fair" comments were mostly special pleading intended to gain an advantage for programming language X to the disadvantage of programming language Y.

But the issues your question raises are a little different.

Firstly, look at the n-body programs on the benchmarks game website. Even though the programs are written in different languages there's very little difference in the way the programs are coded.

So far no one has found an effective way to make use of quad-core for this small n-body problem - so there are no special multi-core programs.

The programs do not use non-standard or complex libraries. The programs are completely implemented inside a single source file.

I said there's very little difference in the way the n-body programs are coded but does that really mean the programs are the same? Soon after the project had been revived, 6 or 7 years ago I remember an Ada programmer half-joked about comparing apples to oranges because the assembly language from the Ada programs wasn't the same as the assembly language from the C programs - so obviously like wasn't being compared to like :-)

- otoh the Ada source code would have to be written in a different way than the C source code was written, to make the Ada compiler produce the same assembly language as the C compiler produced.
- otoh if the assembly language produced by both compilers really was line-by-line the same, why would there be a performance difference?

When there's very little difference in the way the programs are coded then at first glance the comparison appears to be fair, but forcing different languages to be coded like language X may favour language X.

As Yannick Versley noted, the point of using a different language is for the different approaches that language provides. In other words, there's more than one way to do the same thing.

Look at the mandelbrot programs on the benchmarks game website - the simplest C program is half the size of the fastest C program; the simplest C program is sequential and uses doubles, the fastest C program uses all 4 cores through OMP and GCC intrinsics.

- Other languages take different approaches to use all 4 cores - does that mean we should only compare sequential programs and ignore the reality of multi-core computing?
- Other language implementations may not provide an equivalent to GCC intrinsics - does that mean we should only compare programs that use doubles? But other language implementations take different approaches in the way they represent doubles - does that mean we should ignore all floating point programs?

The problem is that programming languages (and programming language implementations) are more different than apples to oranges, but we still ask - Will my program be faster if I write it in language X? - and still wish for a simpler answer than - It depends how you write it!

The different tasks and different programs on the benchmarks game website show that some of the performance comparison answers are confusing and complicated - the details matter, a lot.
