How to benchmark a crypto library?
What are good tests for benchmarking a crypto library?
Which unit (time, CPU cycles, ...) should we use to compare different crypto libraries?
Are there any tools or procedures for this?
Any ideas or comments are welcome!
Thank you for your input!
My comments above aside, the US government has the FIPS program that you might want to look at. It's not perfect (by a long shot), but it's a start: you can get an idea of the things they were looking at when evaluating cryptography.
I also suggest looking at NIST's Computer Security Division.
Also, on a side note, reviewing what the master (Bruce Schneier) has to say on the subject of Security Pitfalls in Cryptography is always good. Also: Security is harder than it looks.
I assume you mean performance benchmarks. I would say that both time and cycles are valid benchmarks, as some code may execute differently on different architectures (perhaps wildly differently if they're different enough).
If it is extremely important to you, I would do the testing myself. You can use some timer (almost all languages have one) or you can use some profiler (almost all languages have one of these too) to figure out the exact performance for the algorithms you are looking for on your target platform.
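As a rough illustration of the timer approach, here is a minimal Python sketch. It uses the standard library's hashlib only as a stand-in for whatever crypto library you are evaluating; the benchmark helper, its min_seconds parameter, and the 1 MiB input size are assumptions made for this example, not part of any library.

```python
import hashlib
import time


def benchmark(func, data, min_seconds=0.2):
    """Call func(data) repeatedly for at least min_seconds; return MB/s."""
    func(data)  # warm-up call so one-time setup cost is not measured
    calls = 0
    elapsed = 0.0
    start = time.perf_counter()
    while elapsed < min_seconds:
        func(data)
        calls += 1
        elapsed = time.perf_counter() - start
    # Total bytes processed divided by wall-clock time, in 10^6 bytes/s.
    return (calls * len(data)) / elapsed / 1e6


if __name__ == "__main__":
    block = bytes(1024 * 1024)  # 1 MiB of zero bytes as input (assumption)
    for name in ("sha256", "sha512"):
        rate = benchmark(lambda d: hashlib.new(name, d).digest(), block)
        print(f"{name}: {rate:.1f} MB/s")
```

Running for a minimum duration rather than a fixed number of calls keeps fast and slow algorithms comparable. The numbers are only meaningful on the target platform, and they will vary from run to run.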
If you are looking at one algorithm vs. another one, you can look for data that others have already gathered and that will give you a rough idea. For instance, here are some benchmarks from Crypto++:
http://www.cryptopp.com/benchmarks.html
Note that they use MB/Second and Cycles/Byte as metrics. I think those are very good choices.
Some very good answers before me, but keep in mind that optimizations are a very good way to leak key material through timing attacks (for example, see how devastating this can be for AES). If there is any chance an attacker can time your operations, you want not the fastest library but the most constant-time one available (and possibly the one with the most constant power usage, if there is any chance someone can monitor yours). OpenSSL does a great job of keeping on top of current attacks; I can't necessarily say the same of other libraries.
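To make the idea of a timing leak concrete, here is a small Python sketch. It is not the AES attack mentioned above; it shows a simpler case of the same problem, a byte comparison that returns early, next to the standard library's hmac.compare_digest. The function names leaky_equal and constant_time_equal are made up for this example.

```python
import hmac


def leaky_equal(a: bytes, b: bytes) -> bool:
    """Early-exit comparison: run time depends on where the first mismatch is."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False  # returns sooner the earlier the mismatch occurs
    return True


def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Comparison designed not to reveal the mismatch position through timing."""
    return hmac.compare_digest(a, b)


if __name__ == "__main__":
    secret = b"0123456789abcdef"
    guess = b"0123456789abcdeX"
    print(leaky_equal(secret, guess), constant_time_equal(secret, guess))
```

Both functions return the same answers; the difference is only in how long they take on mismatching inputs. That difference is exactly what a throughput benchmark will not show you.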
The answers below are in the context of Crypto++. I don't know about other libraries, like OpenSSL, Botan, BouncyCastle, etc.
The Crypto++ library has a built-in benchmarking suite.
You typically measure performance in cycles-per-byte. Cycles-per-byte depends upon the CPU frequency. Another related metric is throughput measured in MB/s. It also depends upon CPU frequency.
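The relationship between the two metrics can be sketched in a few lines of Python. The function names are made up for this example, and it assumes the CPU runs at a fixed, known clock frequency during the measurement, which frequency scaling can easily violate.

```python
def cycles_per_byte(elapsed_seconds, num_bytes, cpu_ghz):
    """Estimate cycles spent per byte from a timed run.

    Assumes the CPU ran at a constant cpu_ghz for the whole measurement.
    """
    total_cycles = elapsed_seconds * cpu_ghz * 1e9
    return total_cycles / num_bytes


def throughput_mb_per_s(cpb, cpu_ghz):
    """Convert cycles per byte to throughput in MB/s (10^6 bytes per second)."""
    bytes_per_second = cpu_ghz * 1e9 / cpb
    return bytes_per_second / 1e6


if __name__ == "__main__":
    # Hypothetical run: 500 million bytes processed in 1 second at 2.0 GHz.
    cpb = cycles_per_byte(1.0, 500_000_000, 2.0)
    print(cpb, throughput_mb_per_s(cpb, 2.0))
```

Cycles-per-byte is the handier number when comparing machines with different clock speeds, while MB/s tells you what to expect on one particular machine.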
Running make bench will create a file called benchmark.html. If you run the tests manually instead, the output is an HTML-like table without <HEAD> and <BODY> tags. You will still be able to view it in a web browser.
You can also check the Crypto++ benchmark page at Crypto++ Benchmarks. The information is dated, and updating it is on our TODO list.
You also need acumen for what looks right. For example, SSE4.2 and ARMv8 have a CRC32 instruction. Cycles-per-byte should go from about 3 or 5 cpb (software only) to about 1 or 1.5 cpb (hardware acceleration). That should equate to a change from roughly 300 or 500 MB/s (software only) to roughly 1.5 GB/s (hardware acceleration) on modern hardware running at around 2 GHz.
Other technologies, like SSE2 and NEON, are trickier to work with. There's a theoretical cycles-per-byte and throughput you should see, but you may not know what it is. You may need to contact the authors of the algorithm to find out. For example, we contacted the authors of BLAKE2 to learn whether our ARMv7/ARMv8 NEON implementation was performing as expected, because benchmark results for it were missing from the authors' homepage.
I've also found that GCC 4.6 (and above) with -O3 can make a big difference in software-only implementations. That's because GCC vectorizes heavily at -O3, and you might witness a 2x to 2.5x speedup. For example, the compiler may generate code that runs at 40 cpb at -O2. At -O3 it may run at 15 or 19 cpb. A good SSE2 or NEON implementation should outperform the software-only implementation by at least a few cycles per byte. In the same example, the SSE2 or NEON implementation may run at 8 to 13 cpb.
There are also sites like OpenBenchmarking.org that may be able to provide some metrics for you.