How to do bit-set/byte-array conversion in C
Given an array, unsigned char q[32]="1100111...", how can I generate a 4-byte bit-set, unsigned char p[4], such that the bits of this bit-set equal the values inside the array, e.g., the first byte p[0] = "q[0] ... q[7]", the second byte p[1] = "q[8] ... q[15]", etc.?
Also, how do I do the opposite, i.e., given the bit-set, generate the array?
My own attempt at the first part:
    unsigned char p[4]={0};
    for (int j=0; j<N; j++)
    {
        if (q[j] == '1')
        {
            p[j / 8] |= 1 << (7-(j % 8));
        }
    }
Is the above right? Are there any conditions to check? Is there a better way?
EDIT - 1
I wonder whether the above is an efficient way, as the array size could be up to 4096 or even more.
Comments (4)
First, use strtoul to get a 32-bit value. Then convert the byte order to big-endian with htonl. Finally, store the result in your array.
There are other ways as well.
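The strtoul/htonl steps described above might be sketched as follows. This is a minimal illustration, not the answer's original code: the helper name pack32 is hypothetical, it copies with memcpy instead of a pointer cast, and it assumes a null-terminated string of exactly 32 digits. Note that q[32] holding 32 digits leaves no room for the terminator strtoul needs, so the buffer would have to be at least 33 bytes.

```c
#include <arpa/inet.h>  /* htonl (POSIX) */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>     /* strtoul */
#include <string.h>     /* memcpy, strlen */

/* Hypothetical helper: packs a null-terminated string of exactly 32
 * '0'/'1' characters into 4 bytes, most significant bit first. */
static void pack32(const char *bits, unsigned char out[4])
{
    /* Assumption: exactly 32 digits. A shorter string would be
     * right-aligned by strtoul instead of left-aligned. */
    assert(strlen(bits) == 32);

    uint32_t value = (uint32_t)strtoul(bits, NULL, 2); /* parse as base 2 */
    uint32_t be = htonl(value);      /* host byte order -> big-endian */
    memcpy(out, &be, sizeof be);     /* copy the bytes without pointer casts */
}
```

With this layout, out[0] holds the bits for characters 0 to 7, out[1] those for characters 8 to 15, and so on, which matches the layout asked for in the question.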
But I lack <arpa/inet.h>!
Then you need to know what byte order your platform is. If it's big-endian, then htonl does nothing and can be omitted. If it's little-endian, then htonl is just a byte swap. If you're lucky, your optimizer might see what you're doing and make it into efficient code. If not, well, at least it's all implementable in registers and O(log N).
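A byte swap of the kind described above could look like the following sketch. The name swap32 is hypothetical and this is one common formulation, not necessarily the code the answer originally showed. It swaps adjacent bytes first, then the two 16-bit halves, which is where the O(log N) step count comes from.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for htonl on a little-endian host:
 * reverses the four bytes of x in two mask-and-shift steps. */
static uint32_t swap32(uint32_t x)
{
    x = ((x & 0xFF00FF00u) >> 8)  | ((x & 0x00FF00FFu) << 8);  /* swap adjacent bytes */
    x = ((x & 0xFFFF0000u) >> 16) | ((x & 0x0000FFFFu) << 16); /* swap 16-bit halves */
    return x;
}

/* Small sanity check: bytes 11 22 33 44 become 44 33 22 11. */
static void swap32_selfcheck(void)
{
    assert(swap32(0x11223344u) == 0x44332211u);
}
```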
If you don't know what byte order your platform is, then you need to detect it:
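One common way to detect the byte order at run time is to look at the lowest-addressed byte of a known integer. The sketch below assumes only big- and little-endian hosts exist (no mixed-endian layouts); the names is_little_endian, to_big_endian and store_be32 are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 on a little-endian host, 0 otherwise. */
static int is_little_endian(void)
{
    const uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);   /* read the lowest-addressed byte */
    return first == 1;
}

/* Hypothetical replacement for htonl built on the check above. */
static uint32_t to_big_endian(uint32_t x)
{
    if (!is_little_endian())
        return x;                /* already big-endian: nothing to do */
    return ((x & 0x000000FFu) << 24) | ((x & 0x0000FF00u) << 8) |
           ((x & 0x00FF0000u) >> 8)  | ((x & 0xFF000000u) >> 24);
}

/* Stores x into out[0..3], most significant byte first. */
static void store_be32(uint32_t x, unsigned char *out)
{
    assert(out != NULL);
    uint32_t be = to_big_endian(x);
    memcpy(out, &be, sizeof be);
}
```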
Maybe long is 8 bytes!
Well, the OP implied 4-byte inputs with their array size, but an 8-byte long is doable.
For a char that isn't 8 bits (DSPs like to do this), you're on your own. (This is why it was a Big Deal when the SHARC series of DSPs had 8-bit bytes; it made it a LOT easier to port existing code because, face it, C does a horrible job of portability support.)
What about arbitrary length buffers? No funny pointer typecasts, please.
The main thing that can be improved with the OP's version is to rethink the loop's internals. Instead of thinking of the output bytes as a fixed data register, think of it as a shift register, where each successive bit is shifted into the right (LSB) end. This will save you from all those divisions and mods (which, hopefully, are optimized away to bit shifts).
For sanity, I'm ditching unsigned char for uint8_t.
It's your responsibility to make sure inChars is null-terminated. The function will return on the first character it sees that is not '0' or '1', or if it runs out of output buffer. Some example usage: one call just reads 4 bytes and traps the error if it can't. Another just converts what it can and sets the rest to 0 bits.
This function could be done better if C had the ability to break out of more than one level of loop or switch; as it stands, I'd have to add a flag value to get the same effect, which is clutter, or I'd have to add a goto, which I simply refuse.
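The shift-register idea from this answer could be sketched roughly as below. This is not the answer's original function: the name bits_from_string, its signature and its return value (the number of bits converted) are assumptions made for illustration. It stops at the first character that is not '0' or '1' or when the output buffer is full, and leaves unused output bits at zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper. Converts the leading '0'/'1' characters of the
 * null-terminated string `in` into `out` (capacity `cap` bytes), most
 * significant bit first. Returns the number of bits converted. */
static size_t bits_from_string(const char *in, uint8_t *out, size_t cap)
{
    assert(in != NULL && out != NULL);
    memset(out, 0, cap);                 /* unused bits stay zero */

    size_t nbits = 0;
    for (size_t i = 0; i < cap; i++) {
        uint8_t acc = 0;                 /* shift register for this byte */
        int k = 0;                       /* bits shifted in so far */
        for (; k < 8; k++) {
            char c = in[nbits];
            if (c != '0' && c != '1')
                break;                   /* terminator or invalid character */
            acc = (uint8_t)((acc << 1) | (c - '0')); /* new bit enters at the LSB */
            nbits++;
        }
        if (k == 0)
            break;                       /* nothing read for this byte */
        out[i] = (uint8_t)(acc << (8 - k)); /* left-align a partial byte */
        if (k < 8)
            break;                       /* input ended mid-byte */
    }
    return nbits;
}
```

A caller that needs exactly 32 bits can compare the return value against 32 and treat anything else as an error; a caller that only wants a best-effort conversion can ignore it.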
I don't think that will quite work. You are comparing each "bit" to 1 when it should really be '1'. You can also make it a bit more efficient by getting rid of the if.
Going in reverse is pretty simple too. Just mask for each "bit" that you set earlier.
The creative part is using (boolean) + '0' to convert between 1/0 and '1'/'0'.
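Both directions described in this answer might look like the sketch below. The function names and the nbits parameter are hypothetical; the code relies on the fact that a comparison in C evaluates to exactly 0 or 1, so it can be shifted into place without an if, and added to '0' to get the matching character.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: packs nbits '0'/'1' characters from q into p. */
static void pack_bits(const unsigned char *q, unsigned char *p, size_t nbits)
{
    assert(q != NULL && p != NULL);
    memset(p, 0, (nbits + 7) / 8);       /* clear every byte that will be written */
    for (size_t j = 0; j < nbits; j++)
        /* (q[j] == '1') is 0 or 1, so no branch is needed */
        p[j / 8] |= (unsigned char)((q[j] == '1') << (7 - (j % 8)));
}

/* Hypothetical helper: expands nbits bits from p back into characters. */
static void unpack_bits(const unsigned char *p, unsigned char *q, size_t nbits)
{
    assert(p != NULL && q != NULL);
    for (size_t j = 0; j < nbits; j++)
        /* mask the bit, turn it into 0/1, then into '0'/'1' */
        q[j] = (unsigned char)(((p[j / 8] & (1 << (7 - (j % 8)))) != 0) + '0');
}
```

Note that unpack_bits does not write a null terminator; the caller has to add one if the result is to be used as a string.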
According to your example, it does not look like you are going for readability, and after a (late) refresh my solution looks very similar to Chriszuma's, except for fewer parentheses (relying on operator precedence) and the addition of !! to enforce a 0 or 1.
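The !! idiom mentioned here could be used as in the following sketch. The function name is hypothetical and this is not the answer's original code. The inner 7 - j % 8 needs no parentheses because % binds tighter than -, and - binds tighter than <<; the parentheses around the shift are kept only to avoid compiler warnings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: expands nbits bits from p into a null-terminated
 * string of '0'/'1' characters; q must have room for nbits + 1 bytes. */
static void unpack_bits_str(const unsigned char *p, char *q, size_t nbits)
{
    assert(p != NULL && q != NULL);
    for (size_t j = 0; j < nbits; j++)
        /* !! collapses any nonzero masked value to exactly 1 */
        q[j] = (char)('0' + !!(p[j / 8] & (1 << (7 - j % 8))));
    q[nbits] = '\0';
}
```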
If you are looking for extreme efficiency, try to use the following techniques:
Replace if by subtraction of '0' (it seems you can assume your input symbols can only be 0 or 1). Also process the input from lower indices to higher ones.
Replace array indices by auto-incrementing pointers.
Unroll the inner loop.
Process several input characters simultaneously (using bit twiddling hacks or MMX instructions). This has great speedup potential!
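The first three techniques above can be combined as in this sketch. The function name is hypothetical, and it assumes that the input contains only '0' and '1' characters and that the bit count is a multiple of 8. Whether it is actually faster than a simple loop depends on the compiler and should be measured.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper. Assumes q holds only '0'/'1' characters and that
 * nbits is a multiple of 8; individual characters are not validated. */
static void pack_bits_fast(const unsigned char *q, unsigned char *p, size_t nbits)
{
    assert(nbits % 8 == 0);
    const unsigned char *end = q + nbits;

    while (q != end) {
        /* Unrolled inner loop: subtracting '0' gives 0 or 1 without an if,
         * and the pointers advance instead of indexing with j / 8, j % 8. */
        unsigned b = (unsigned)(*q++ - '0');
        b = (b << 1) | (unsigned)(*q++ - '0');
        b = (b << 1) | (unsigned)(*q++ - '0');
        b = (b << 1) | (unsigned)(*q++ - '0');
        b = (b << 1) | (unsigned)(*q++ - '0');
        b = (b << 1) | (unsigned)(*q++ - '0');
        b = (b << 1) | (unsigned)(*q++ - '0');
        b = (b << 1) | (unsigned)(*q++ - '0');
        *p++ = (unsigned char)b;         /* one finished output byte */
    }
}
```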