What is the fastest int-to-float conversion on the iPhone?
I am converting some Int16s and Int32s to float and then back again.
I'm just using a straight cast, but doing this 44100 times per second (any guesses what it's for? :) )
Is a cast efficient? Can it be done any faster?
P.S. Compile for Thumb is turned off.
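For reference, a minimal sketch of the kind of cast-based round trip the question describes (the function names and the 1/32768 normalization factor are illustrative assumptions, not something stated in the question):

```c
#include <stdint.h>
#include <stddef.h>

/* Convert 16-bit PCM samples to normalized floats and back.
   Scaling by 1/32768 is a common audio convention; a plain cast
   without scaling works the same way performance-wise. */
static void int16_to_float(const int16_t *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = (float)in[i] * (1.0f / 32768.0f);
}

static void float_to_int16(const float *in, int16_t *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = (int16_t)(in[i] * 32768.0f);
}
```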
2 Answers
There are only two ways to know.
1) Read the code the compiler generates for promoting ints to floats in your case.
2) Measure the performance of the code the compiler generates vs. other options.
To do the former, set the SDK to Device and the Active Architecture to arm, and choose Build > Show Assembly Code. Then read the compiler-generated code.
If you are smarter than a compiler then you can write your own assembly code and use it instead. Odds are you aren't.
If you are doing an operation many, many times, Instruments will do a good job at showing you how many processor samples it's taking. But Jim's point is valid, and you shouldn't dismiss it as unhelpful: in an operation involving math on floating-point numbers, compiler type promotion is the least of your worries. Chips are built to do that in two or three cycles, and compilers usually manage to make that happen. But the effects processing you're doing will probably take thousands of cycles. The promotion will be lost in the noise.
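As a rough illustration of option 2, a microbenchmark along these lines (the function name, buffer size, and repetition count are invented for the example) can show whether the casts matter at all:

```c
#include <stdint.h>
#include <time.h>

#define N 44100  /* one second of samples at 44.1 kHz */

/* Time `reps` round trips of int16 -> float -> int16 over an
   N-sample buffer; returns elapsed CPU seconds, or -1.0 if the
   round trip was not lossless. Purely a measurement sketch. */
static double bench_casts(int reps)
{
    static int16_t in[N], out[N];
    static float tmp[N];

    for (int i = 0; i < N; ++i)
        in[i] = (int16_t)(i & 0x7FFF);

    clock_t start = clock();
    for (int r = 0; r < reps; ++r) {
        for (int i = 0; i < N; ++i)
            tmp[i] = (float)in[i];     /* int -> float cast */
        for (int i = 0; i < N; ++i)
            out[i] = (int16_t)tmp[i];  /* float -> int cast */
    }
    double secs = (double)(clock() - start) / CLOCKS_PER_SEC;

    /* sanity check: these values survive the round trip exactly */
    for (int i = 0; i < N; ++i)
        if (out[i] != in[i])
            return -1.0;
    return secs;
}
```

On a device build you would run this under Instruments rather than `clock()`, but even this crude version makes the per-sample cost visible.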
Is a cast efficient? In your case, I'd guess it's efficient enough.
Can it be done faster? Maybe...but would it be worth the effort? Have you benchmarked it and discovered a performance problem due to the cast operations?
If you're doing anything mathematically nontrivial with the floating-point sample data, I'd be really surprised if the casts turned out to be a significant bottleneck!
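To make that point concrete, here is a hedged sketch of a typical per-sample effect (a one-pole low-pass filter, chosen purely as an example; the names are invented): the casts at the buffer edges are a small fraction of the per-sample arithmetic.

```c
#include <stdint.h>
#include <stddef.h>

/* One-pole low-pass filter over int16 samples, processed in float.
   The two casts per sample are dwarfed by the filter math itself,
   and real effects chains do far more work than this. */
static void lowpass_int16(const int16_t *in, int16_t *out, size_t n,
                          float alpha)
{
    float state = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        float x = (float)in[i];        /* int -> float cast */
        state += alpha * (x - state);  /* the actual DSP work */
        out[i] = (int16_t)state;       /* float -> int cast */
    }
}
```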