Is there a faster way to do conversions with BitConverter in C#?

Posted 2024-10-16 01:02:24

In our application, we have a very large byte array and we have to convert these bytes into different types. Currently, we use BitConverter.ToXXXX() for this purpose. Our heavy hitters are ToInt16 and ToUInt64.

For UInt64, our problem is that the data stream actually has 6 bytes of data to represent a large integer. Since there is no native function to convert 6 bytes of data to UInt64, we do:

// data and offset are placeholder names for the byte array and the read position
UInt64 value = BitConverter.ToUInt64(data, offset) & 0x0000ffffffffffff;

Our use of ToInt16 is simpler; we don't have to do any bit manipulation.

We do so many of these two operations that I wanted to ask the SO community whether there's a faster way to do these conversions. Right now, approximately 20% of our total CPU time is consumed by these two functions.

Comments (5)

想挽留 2024-10-23 01:02:24

Have you thought about using memory pointers directly? I can't vouch for its performance, but it is a common trick in C/C++:

        byte[] arr = { 1, 2, 3, 4, 5, 6, 7, 8 ,9,10,11,12,13,14,15,16};

        fixed (byte* a2rr = &arr[0])
        {

            UInt64* uint64ptr = (UInt64*) a2rr;
            Console.WriteLine("The value is {0:X2}", (*uint64ptr & 0x0000FFFFFFFFFFFF));
            uint64ptr = (UInt64*) ((byte*) uint64ptr+6);
            Console.WriteLine("The value is {0:X2}", (*uint64ptr & 0x0000FFFFFFFFFFFF));
        }

You'll need to allow unsafe code for your assembly in the build settings, and mark the method in which you do this as unsafe as well. You are also tied to little endian with this approach.
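For comparison, here is a minimal sketch of the same "read 8 bytes, mask off the top 2" idea without unsafe code. It assumes a runtime that provides Span and System.Buffers.Binary.BinaryPrimitives (.NET Core 2.1 or later); the class and method names (SpanReader, ReadUInt48) are made up for illustration. It also guards a case the pointer version does not handle: when the last 6-byte record sits at the very end of the array, an 8-byte read would run past the buffer. Whether this is actually faster than BitConverter is something to measure, not assume.

```csharp
using System;
using System.Buffers.Binary;

public static class SpanReader
{
    // Keeps the low 48 bits of a 64-bit value.
    private const ulong Mask48 = 0x0000FFFFFFFFFFFFUL;

    // Reads a 6-byte little-endian unsigned integer starting at offset.
    public static ulong ReadUInt48(ReadOnlySpan<byte> data, int offset)
    {
        if (data.Length - offset >= 8)
        {
            // Fast path: one 8-byte read, then discard the two extra bytes.
            return BinaryPrimitives.ReadUInt64LittleEndian(data.Slice(offset)) & Mask48;
        }

        // Tail path: fewer than 8 bytes remain, so read exactly 6 (4 + 2).
        ReadOnlySpan<byte> record = data.Slice(offset, 6);
        uint low = BinaryPrimitives.ReadUInt32LittleEndian(record);
        ushort high = BinaryPrimitives.ReadUInt16LittleEndian(record.Slice(4));
        return low | ((ulong)high << 32);
    }
}
```

Because BinaryPrimitives names the byte order explicitly, the result does not depend on the endianness of the machine running the code.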

帅哥哥的热头脑 2024-10-23 01:02:24

You can use the System.Buffer class to copy a whole array over to another array of a different type as a fast, 'block copy' operation:

The BlockCopy method accesses the bytes in the src parameter array using offsets into memory, not programming constructs such as indexes or upper and lower array bounds.

The arrays must be of 'primitive' types, the element sizes must line up, and the copy operation is endian-sensitive. In your case of 6-byte integers, the data can't line up with any of .NET's 'primitive' types unless you can obtain the source array with two bytes of padding after each six, in which case it lines up with Int64. But this method will work for arrays of Int16, which may speed up some of your operations.
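As a rough sketch of the Int16 case, assuming the whole byte array holds consecutive 2-byte values in the machine's native byte order (the class and method names here are illustrative):

```csharp
using System;

public static class BlockCopyExample
{
    // Copies the raw bytes of source into a new short[] in one bulk operation.
    // A trailing odd byte, if any, is ignored.
    public static short[] ToInt16Array(byte[] source)
    {
        short[] result = new short[source.Length / sizeof(short)];

        // Offsets and count are in bytes, not in elements.
        Buffer.BlockCopy(source, 0, result, 0, result.Length * sizeof(short));
        return result;
    }
}
```

This trades one bulk copy and an extra array for cheap element access afterwards, so it probably only pays off when most of the values are actually read.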

赤濁 2024-10-23 01:02:24

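Why not:

// buffer and offset are placeholder names for the source array and the record start
UInt16 valLow = BitConverter.ToUInt16(buffer, offset);               // low 2 bytes
UInt64 valHigh = (UInt64)BitConverter.ToUInt32(buffer, offset + 2);  // high 4 bytes
UInt64 Value = (valHigh << 16) | valLow;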

You can make that a single statement, although the JIT compiler will probably do that for you automatically.

That will prevent you from reading those extra two bytes that you end up throwing away.

If that doesn't reduce CPU, then you'll probably want to write your own converter that reads the bytes directly from the buffer. You can either use array indexing or, if you think it's necessary, unsafe code with pointers.

Note that, as a commenter pointed out, if you use any of these suggestions, then either you're limited to a particular "endian-ness", or you'll have to write your code to detect little/big endian and react accordingly. The code sample I showed above works for little endian (x86).
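A hand-written converter using plain array indexing might look like the sketch below. The names (SixByteReader and its methods) are illustrative, and little-endian data is assumed, as in the sample above. Because the byte order is spelled out in the shifts, the result is the same on any CPU.

```csharp
using System;

public static class SixByteReader
{
    // Reads a 48-bit unsigned integer stored little-endian in
    // data[offset] .. data[offset + 5]. No extra bytes are touched.
    public static ulong ReadUInt48LittleEndian(byte[] data, int offset)
    {
        return (ulong)data[offset]
             | ((ulong)data[offset + 1] << 8)
             | ((ulong)data[offset + 2] << 16)
             | ((ulong)data[offset + 3] << 24)
             | ((ulong)data[offset + 4] << 32)
             | ((ulong)data[offset + 5] << 40);
    }

    // Reads a 16-bit signed integer stored little-endian at data[offset].
    public static short ReadInt16LittleEndian(byte[] data, int offset)
    {
        return unchecked((short)(data[offset] | (data[offset + 1] << 8)));
    }
}
```

Whether this beats BitConverter depends on the runtime and JIT, so it is worth profiling both on the real data.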

要走就滚别墨迹 2024-10-23 01:02:24

See my answer for a similar question here.
It's the same unsafe memory manipulation as in Jimmy's answer, but in a more "friendly" way for consumers. It'll allow you to view your byte array as UInt64 array.
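The linked answer is not reproduced here. As a different technique that gives a similar "view" without unsafe code, newer runtimes offer System.Runtime.InteropServices.MemoryMarshal.Cast, which reinterprets a span of bytes as a span of another primitive type without copying. The sketch below uses Int16, since 6-byte records do not map onto any primitive type; the class and method names are illustrative, and the values are read in the machine's native byte order.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ViewExample
{
    // Treats the byte array as a sequence of Int16 values without copying
    // and sums them. A trailing odd byte, if any, is ignored by the cast.
    public static long SumAsInt16(byte[] data)
    {
        Span<short> view = MemoryMarshal.Cast<byte, short>(data.AsSpan());

        long sum = 0;
        foreach (short value in view)
        {
            sum += value;
        }
        return sum;
    }
}
```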

ら栖息 2024-10-23 01:02:24

For anyone else who stumbles across this: if you only need little endian, and do not need to detect big endian and convert from it automatically, I've written an extended version of BitConverter with a number of additions to handle Span, as well as to convert arrays of type T, for example int[] or TimeSpan[].

It also extends the supported types to include TimeSpan, decimal, and DateTime.

https://github.com/tcwicks/ChillX/blob/master/src/ChillX.Serialization/BitConverterExtended.cs

Example usage:

Random rnd = new Random();
RentedBuffer<byte> buffer = RentedBuffer<byte>.Shared.Rent(BitConverterExtended.SizeOfUInt64
    + (20 * BitConverterExtended.SizeOfUInt16)
    + (20 * BitConverterExtended.SizeOfTimeSpan)
    + (10 * BitConverterExtended.SizeOfSingle));
UInt64 exampleLong = long.MaxValue;
int startIndex = 0;
startIndex += BitConverterExtended.GetBytes(exampleLong, buffer.BufferSpan, startIndex);

UInt16[] shortArray = new UInt16[20];
for (int I = 0; I < shortArray.Length; I++) { shortArray[I] = (ushort)rnd.Next(0, UInt16.MaxValue); }
//When using reflection / expression trees CLR cannot distinguish between UInt16 and Int16 or Uint64 and Int64 etc...
//Therefore Uint methods are renamed.
startIndex += BitConverterExtended.GetBytesUShortArray(shortArray, buffer.BufferSpan, startIndex);

TimeSpan[] timespanArray = new TimeSpan[20];
for (int I = 0; I < timespanArray.Length; I++) { timespanArray[I] = TimeSpan.FromSeconds(rnd.Next(0, int.MaxValue)); }
startIndex += BitConverterExtended.GetBytes(timespanArray, buffer.BufferSpan, startIndex);

float[] floatArray = new float[10];
for (int I = 0; I < floatArray.Length; I++) { floatArray[I] = MathF.PI * rnd.Next(short.MinValue, short.MaxValue); }
startIndex += BitConverterExtended.GetBytes(floatArray, buffer.BufferSpan, startIndex);

//Do stuff with buffer and then
buffer.Return(); //always better to return it as soon as possible
//Or in case you forget
buffer = null;
//and let RentedBufferContract do this automatically

It supports reading from and writing to both byte[] and RentedBuffer; however, using the RentedBuffer class greatly reduces GC overhead.
The RentedBufferContract class internally handles returning buffers to the pool to prevent memory leaks.

It also includes a serializer similar to MessagePack.
Note: MessagePack is a faster serializer with more features; however, this serializer reduces GC overhead by reading from and writing to rented byte buffers.

https://github.com/tcwicks/ChillX/blob/master/src/ChillX.Serialization/ChillXSerializer.cs
