Overhead of computing 13 bits instead of the native 16 bits in C#
I'm compiling a lookup table that needs to have 133,784,560 entries, with values ranging from 0 - 7,462. The maximum value of 7,462 can be contained within 13 bits. This gives us a lookup table of around 207 MB. A 16-bit value increases our lookup table size by around 50 MB more.

The extra increase in the size of the lookup table is not significant these days, but it would be nice to keep it as thin as possible.

When the LUT is loaded into memory, how much overhead is there to evaluate a value in a 13-bit range, compared to evaluating the 16 bits? I'm assuming there would be some intermediary bitwise operations to convert it to a computer-workable format, or am I wrong?

Every clock cycle counts, as this will be involved in a brute-force analysis program that will run billions of comparisons. Should I just stick with the slightly larger LUT?
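For concreteness, the sizes quoted above can be checked with a quick back-of-the-envelope sketch (the entry count and the 13-bit width are the ones from the question):

```csharp
// Back-of-the-envelope size check for the two layouts
// (entry count and value range taken from the question).
using System;

const long Entries = 133_784_560;

long packedBytes = (Entries * 13 + 7) / 8;   // 13 bits per entry, rounded up to whole bytes
long shortBytes  = Entries * 2;              // one 16-bit ushort per entry

Console.WriteLine($"13-bit packed: {packedBytes / (1024.0 * 1024.0):F0} MB");  // ~207 MB
Console.WriteLine($"16-bit array:  {shortBytes  / (1024.0 * 1024.0):F0} MB");  // ~255 MB
```

The difference works out to roughly 48 MB, matching the "around 50 MB" figure in the question.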
4 Answers
I would stick with 16-bit values rather than 13-bit. Since you're doing brute force analysis and billions of comparisons, the extra 50MB seems a small price to pay. Also keep in mind that the code managing the 13-bit case will be significantly more complex, as you'll usually have to read across multiple 16-bit (or 32-bit, or whatever) values and shift and combine in order to get the actual value you need. In other words, extracting value #n is going to be much more complex than simply "retrieve it from the table".
The only real way to know for sure, however, would be to try both and see... but unless you've got the time to implement the 13-bit value retrieval code that you might not end up using, I probably wouldn't bother.
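To illustrate the shift-and-combine work this answer describes, here is a minimal sketch of a packed 13-bit reader/writer. The layout (entries packed LSB-first into a `byte[]`) and the helper names `Get13`/`Set13` are assumptions for the sketch, not anything from the original answers:

```csharp
using System;

// Sketch of 13-bit packed storage: entries are packed LSB-first into a
// byte[]. A single entry can straddle 2 or 3 bytes, hence the extra
// reads, shifts and masks compared to a plain array index.
static ushort Get13(byte[] packed, long n)
{
    long bitPos  = n * 13;
    long bytePos = bitPos >> 3;         // first byte containing the entry
    int  offset  = (int)(bitPos & 7);   // bit offset inside that byte

    // Assemble a 24-bit window covering the entry, then extract 13 bits.
    int window = packed[bytePos]
               | packed[bytePos + 1] << 8
               | (bytePos + 2 < packed.Length ? packed[bytePos + 2] << 16 : 0);
    return (ushort)((window >> offset) & 0x1FFF);
}

static void Set13(byte[] packed, long n, ushort value)
{
    long bitPos  = n * 13;
    long bytePos = bitPos >> 3;
    int  offset  = (int)(bitPos & 7);

    // Read-modify-write the same 24-bit window.
    int window = packed[bytePos]
               | packed[bytePos + 1] << 8
               | (bytePos + 2 < packed.Length ? packed[bytePos + 2] << 16 : 0);
    window = (window & ~(0x1FFF << offset)) | ((value & 0x1FFF) << offset);

    packed[bytePos]     = (byte)window;
    packed[bytePos + 1] = (byte)(window >> 8);
    if (bytePos + 2 < packed.Length) packed[bytePos + 2] = (byte)(window >> 16);
}

// 10 entries of 13 bits each need ceil(130 / 8) = 17 bytes.
byte[] demo = new byte[17];
Set13(demo, 0, 7462);
Set13(demo, 9, 4095);
Console.WriteLine(Get13(demo, 0));  // 7462
Console.WriteLine(Get13(demo, 9));  // 4095
```

Compare this with the 16-bit case, where the whole lookup is a single `table[n]`.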
My guess would be that this is a case of premature optimisation. Bit-fiddling is quite expensive, and will probably dwarf the extra memory-access cost, unless by sheer coincidence your cache performance hits an elbow somewhere between those two sizes.
Ultimately, there's no substitute for just trying it out.
I would say try it both ways and see which one is faster. Also, I think this is a good candidate to drop into C++. You can encapsulate this in a managed C++ project which you can reference directly from C#. This will allow you to do all the low level optimizations that you want while still being directly accessible to the rest of your app.
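In the spirit of "try it both ways", a tiny Stopwatch harness like the sketch below can compare the two variants. The `Time` helper and the toy table are made up for illustration; in practice you would pass in the real 16-bit array lookup and the 13-bit unpacking lookup:

```csharp
using System;
using System.Diagnostics;

// Time a candidate lookup over many iterations. The checksum both
// prevents the loop from being optimized away and lets you verify
// that the two variants return the same values.
static long Time(string label, Func<long, int> lookup, long iterations)
{
    var sw = Stopwatch.StartNew();
    long checksum = 0;
    for (long i = 0; i < iterations; i++)
        checksum += lookup(i % 1000);
    sw.Stop();
    Console.WriteLine($"{label}: {sw.ElapsedMilliseconds} ms (checksum {checksum})");
    return checksum;
}

// Toy stand-in for the real LUT; swap in the 16-bit array and the
// packed 13-bit reader here and compare the timings.
var table = new ushort[1000];
for (int i = 0; i < 1000; i++) table[i] = (ushort)(i % 7463);

Time("16-bit array", i => table[i], 10_000_000);
```

Note that a micro-benchmark like this understates the cache effects of the full 200+ MB table, so the final comparison should still be run at full size.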
Assuming you mean storing the data in an array like this: short[], then looking up entry #n is a single indexed read, where you (or rather the compiler) simply multiply the index by 2. If the values were instead packed at 13 bits apiece into a byte[], a single entry could straddle up to 3 bytes, so each lookup would need extra reads plus shifting and masking.