Extremely space-optimised code

Posted 2024-09-03 20:11:23

There are key self-contained algorithms, particularly cryptography-related ones such as AES, RSA and SHA1, for which you can find many free implementations on the internet.

Some are written to be nice, clean, portable C.

Some are written to be fast, often with macros and explicit unrolling.

As far as I can tell, none are trying to be especially small, so I'm resigned to writing my own: specifically AES128 decryption and SHA1 for ARM THUMB2. (I've verified this by compiling everything I could find for my target machine with GCC using -Os, -mthumb and so on.)

What patterns and tricks can I use to do so?

Are there compilers or tools that can roll up code?

Comments (2)

む无字情书 2024-09-10 20:11:23

Before optimizing for space (or speed): compilers are pretty clever these days. Have you checked whether a normal, readable implementation of AES128 gets small enough for your needs when you tell the compiler to optimize for space?

Writing your own version of AES128 is perhaps a good educational exercise, but you will certainly have to fight bugs, and cryptography is not the kind of trivial stuff that falls out of thin air. A faulty or weak implementation (due to bugs in your code) is pretty much the worst case you can have.

Since you are targeting ARM, and GCC is a pretty common compiler for that platform:

-Os   Optimize for size.  

     -Os enables all -O2 optimizations that do not typically 
      increase code size. It also performs further optimizations
      designed to reduce code size.

疯到世界奔溃 2024-09-10 20:11:23

It depends on what kind of space you are trying to optimise: code or data. There are essentially three variants of AES128 in common use, differing in how much precomputed lookup-table space they need.

  • The fastest version uses 4 KB arranged as four 32-bit x 256-entry lookup tables (commonly called T-tables). If you can afford that much data space, then the only instructions in this version, apart from the table lookups themselves, are the EORs that combine the table results, and these roll up into a very small piece of code.
  • The intermediate version uses an 8-bit x 256-entry lookup table to encode the SBox. The remaining instructions have to implement the shift-rows and mix-columns steps, so the code size is larger.
  • The smallest (data-size) version doesn't use any lookup tables at all, but has to compute all of the individual AES-field operations, including the inversion. This uses the most instructions, even if you fold both the field multiply and the inversion into subroutines.
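
To give a rough idea of what the third (table-free) variant involves, here is a minimal C sketch of the AES field operations. The function names (xtime, gf_mul, gf_inv, aes_sbox, aes_inv_sbox) are my own and not taken from any particular library. The inversion is done as a^254 by square-and-multiply, which is just one possible choice, and the code is written for compactness only: it is slow and not constant-time (it branches on data), so treat it as an illustration rather than something to ship.

```c
#include <stdint.h>

/* Multiply by x (i.e. by 2) in GF(2^8), reducing by the AES polynomial 0x11B. */
static uint8_t xtime(uint8_t a)
{
    return (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1B : 0x00));
}

/* Shift-and-add multiplication in GF(2^8): one small loop, no tables. */
static uint8_t gf_mul(uint8_t a, uint8_t b)
{
    uint8_t p = 0;
    while (b) {
        if (b & 1)
            p ^= a;
        a = xtime(a);
        b >>= 1;
    }
    return p;
}

/* Multiplicative inverse computed as a^254 (since a^255 == 1 for a != 0).
 * Each loop pass computes r = r^2 * a, giving a^127 after 7 passes;
 * the final squaring yields a^254. Zero maps to zero, as AES requires. */
static uint8_t gf_inv(uint8_t a)
{
    uint8_t r = 1;
    int i;
    for (i = 0; i < 7; i++) {
        r = gf_mul(r, r);
        r = gf_mul(r, a);
    }
    return gf_mul(r, r);
}

/* Rotate an 8-bit value left by n bits (0 < n < 8). */
static uint8_t rotl8(uint8_t x, unsigned n)
{
    return (uint8_t)((x << n) | (x >> (8 - n)));
}

/* Forward S-box: field inversion followed by the AES affine transform. */
static uint8_t aes_sbox(uint8_t a)
{
    uint8_t s = gf_inv(a);
    return (uint8_t)(s ^ rotl8(s, 1) ^ rotl8(s, 2) ^ rotl8(s, 3) ^ rotl8(s, 4) ^ 0x63);
}

/* Inverse S-box (needed for decryption): inverse affine transform,
 * then field inversion. */
static uint8_t aes_inv_sbox(uint8_t a)
{
    uint8_t t = (uint8_t)(rotl8(a, 1) ^ rotl8(a, 3) ^ rotl8(a, 6) ^ 0x05);
    return gf_inv(t);
}
```

For decryption you would call aes_inv_sbox on each state byte in InvSubBytes and use gf_mul with the constants 9, 11, 13 and 14 in InvMixColumns. Whether this ends up smaller overall than a 256-byte table plus a lookup depends on the compiler and target, so it is worth measuring both with -Os.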