Strings vs. byte arrays: performance
(This post is about high-frequency-trading-style programming.)
I recently saw on a forum (I think they were discussing Java) that if you have to parse a lot of string data, it's better to use a byte array than a String with split(). The exact post was:
One performance trick when working with any language (C++, Java, C#) is to avoid object creation. It's not the cost of allocation or GC, it's the cost of accessing large memory arrays that don't fit in the CPU cache.

Modern CPUs are much faster than their memory. They stall for many, many cycles on each cache miss. Most of the CPU transistor budget is allocated to reduce this with large caches and lots of ticks.

GPUs solve the problem differently: they keep lots of threads ready to execute to hide memory access latency, have little or no cache, and spend the transistors on more cores.

So, for example, rather than using Strings and split to parse a message, use byte arrays that can be updated in place. You really want to avoid random memory access over large data structures, at least in the inner loops.
Is he just saying "don't use Strings because they're objects and creating objects is costly"? Or is he saying something else?
Does using a byte array ensure the data remains in the cache for as long as possible?
When you use a String, is it too large to be held in the CPU cache?
Generally, is using primitive data types the best way to write faster code?
Comments (2)
He's saying that if you break a chunk of text up into separate String objects, those String objects have worse locality than the one large array of text. Each String, and the array of characters it contains, is going to be somewhere else in memory; they can be spread all over the place. The memory cache will likely have to thrash in and out to reach the various Strings as you process the data. In contrast, the one large array has the best possible locality, as all the data is in one area of memory, and cache thrashing is kept to a minimum.
There are limits to this, of course: if the text is very, very large, and you only need to parse out part of it, then those few small strings might fit better in the cache than the large chunk of text.
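To make the contrast concrete, here is a minimal sketch of the two approaches, assuming a comma-delimited ASCII message. The class name FieldScanner, the sample message, and the helper methods are hypothetical and only for illustration; the point is that the scanning version reads one contiguous array and allocates nothing per field, while split() creates a new String for every field.

```java
import java.nio.charset.StandardCharsets;

class FieldScanner {
    /**
     * Records the start offset of each field in msg[0..length) into starts
     * and returns the number of fields recorded. Allocates no objects.
     */
    public static int scan(byte[] msg, int length, byte delimiter, int[] starts) {
        int count = 0;
        starts[count++] = 0;
        for (int i = 0; i < length; i++) {
            if (msg[i] == delimiter && count < starts.length) {
                starts[count++] = i + 1;
            }
        }
        return count;
    }

    /** Parses non-negative ASCII decimal digits in msg[from..to) without creating a String. */
    public static long parseLong(byte[] msg, int from, int to) {
        long value = 0;
        for (int i = from; i < to; i++) {
            value = value * 10 + (msg[i] - '0');
        }
        return value;
    }

    public static void main(String[] args) {
        byte[] msg = "AAPL,100,2550".getBytes(StandardCharsets.US_ASCII);

        // In-place version: the offsets array would be reused across messages.
        int[] starts = new int[8];
        int fields = scan(msg, msg.length, (byte) ',', starts);
        // Field 1 runs from starts[1] up to the delimiter before starts[2].
        long qty = parseLong(msg, starts[1], starts[2] - 1);
        System.out.println(fields + " fields, qty=" + qty);

        // Allocating version: one String for the message plus one per field.
        String[] parts = new String(msg, StandardCharsets.US_ASCII).split(",");
        System.out.println(parts.length + " fields, qty=" + Long.parseLong(parts[1]));
    }
}
```

Both versions report 3 fields and a quantity of 100. Whether the in-place version is actually faster for a given workload is something to measure rather than assume.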
There are lots of other reasons to use byte[] or char* instead of Strings for HFT. In Java, Strings consist of 16-bit chars and are immutable. A byte[] or ByteBuffer is easily recycled, has good cache locality, and can live off the heap (a direct buffer), which saves a copy and avoids character encoders. This all assumes you are using ASCII data.

char* or ByteBuffers can also be mapped to network adapters to save another copy (with some fiddling for ByteBuffers).

In HFT you are rarely dealing with large amounts of data at once. Ideally you want to process data as soon as it comes down the socket, i.e. one packet at a time (about 1.5 KB).
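As a rough illustration of the recycling idea, the sketch below reuses a single direct ByteBuffer for every packet and reads ASCII digits straight out of it, so no String or character decoder is involved. The class name PacketBufferDemo, the 2048-byte capacity, and the receive method (which stands in for a real socket read) are assumptions made for this example, not part of the answer above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class PacketBufferDemo {
    // One direct (off-heap) buffer, large enough for a ~1.5 KB packet, reused for every read.
    private final ByteBuffer buffer = ByteBuffer.allocateDirect(2048);

    /** Stand-in for a socket read: copies one packet's bytes into the reused buffer. */
    public void receive(byte[] packet) {
        buffer.clear();      // reset position and limit; no new allocation
        buffer.put(packet);
        buffer.flip();       // switch from writing to reading
    }

    /** Sums non-negative ASCII decimal fields separated by delimiter, reading bytes in place. */
    public long sumFields(byte delimiter) {
        long sum = 0;
        long current = 0;
        for (int i = buffer.position(); i < buffer.limit(); i++) {
            byte b = buffer.get(i);  // absolute get: does not move the position
            if (b == delimiter) {
                sum += current;
                current = 0;
            } else {
                current = current * 10 + (b - '0');
            }
        }
        return sum + current;
    }

    public static void main(String[] args) {
        PacketBufferDemo demo = new PacketBufferDemo();

        demo.receive("10,20,30".getBytes(StandardCharsets.US_ASCII));
        System.out.println(demo.sumFields((byte) ','));

        // The same buffer is recycled for the next packet.
        demo.receive("7,8".getBytes(StandardCharsets.US_ASCII));
        System.out.println(demo.sumFields((byte) ','));
    }
}
```

In a real receiver the buffer would be filled by a channel read instead of put, which is where a direct buffer can save a copy; the parsing loop stays the same.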