如何在 C 中定义和使用位数组?
我想创建一个非常大的数组,在上面写入“0”和“1”。我正在尝试模拟一种称为随机顺序吸附的物理过程,其中长度为 2 的二聚体单位沉积在 n 维晶格的随机位置上,且彼此不重叠。当晶格上没有足够的空间来沉积更多二聚体时(晶格被堵塞),该过程就会停止。
最初,我从零格开始,二聚体由一对“1”表示。当每个二聚体沉积时,由于二聚体不能重叠,二聚体左侧的位点被封闭。所以我通过在晶格上沉积三个“1”来模拟这个过程。我需要多次重复整个模拟,然后计算出平均覆盖率%。
我已经使用一维和二维晶格的字符数组完成了此操作。目前,在处理 3D 问题和更复杂的概括之前,我正在尝试使代码尽可能高效。
这基本上就是代码在 1D 中的样子,经过简化:
int main()
{
/* Define lattice */
array = (char*)malloc(N * sizeof(char));
total_c = 0;
/* Carry out RSA multiple times */
for (i = 0; i < 1000; i++)
rand_seq_ads();
/* Calculate average coverage efficiency at jamming */
printf("coverage efficiency = %lf", total_c/1000);
return 0;
}
void rand_seq_ads()
{
/* Initialise array, initial conditions */
memset(a, 0, N * sizeof(char));
available_sites = N;
count = 0;
/* While the lattice still has enough room... */
while(available_sites != 0)
{
/* Generate random site location */
x = rand();
/* Deposit dimer (if site is available) */
if(array[x] == 0)
{
array[x] = 1;
array[x+1] = 1;
count += 1;
available_sites += -2;
}
/* Mark site left of dimer as unavailable (if its empty) */
if(array[x-1] == 0)
{
array[x-1] = 1;
available_sites += -1;
}
}
/* Calculate coverage %, and add to total */
c = count/N
total_c += c;
}
对于我正在做的实际项目,它不仅涉及二聚体,还涉及三聚体、四聚体以及各种形状和大小(对于 2D 和 3D)。
我希望我能够使用单个位而不是字节,但我一直在阅读,据我所知,一次只能更改 1 个字节,所以我需要做一些复杂的索引或者有更简单的方法吗?
感谢您的回答
I want to create a very large array on which I write '0's and '1's. I'm trying to simulate a physical process called random sequential adsorption, where units of length 2, dimers, are deposited onto an n-dimensional lattice at a random location, without overlapping each other. The process stops when there is no more room left on the lattice for depositing more dimers (lattice is jammed).
Initially I start with a lattice of zeroes, and the dimers are represented by a pair of '1's. As each dimer is deposited, the site on the left of the dimer is blocked, due to the fact that the dimers cannot overlap. So I simulate this process by depositing a triple of '1's on the lattice. I need to repeat the entire simulation a large number of times and then work out the average coverage %.
I've already done this using an array of chars for 1D and 2D lattices. At the moment I'm trying to make the code as efficient as possible, before working on the 3D problem and more complicated generalisations.
This is basically what the code looks like in 1D, simplified:
int main()
{
/* Define lattice */
array = (char*)malloc(N * sizeof(char));
total_c = 0;
/* Carry out RSA multiple times */
for (i = 0; i < 1000; i++)
rand_seq_ads();
/* Calculate average coverage efficiency at jamming */
printf("coverage efficiency = %lf", total_c/1000);
return 0;
}
void rand_seq_ads()
{
/* Initialise array, initial conditions */
memset(a, 0, N * sizeof(char));
available_sites = N;
count = 0;
/* While the lattice still has enough room... */
while(available_sites != 0)
{
/* Generate random site location */
x = rand();
/* Deposit dimer (if site is available) */
if(array[x] == 0)
{
array[x] = 1;
array[x+1] = 1;
count += 1;
available_sites += -2;
}
/* Mark site left of dimer as unavailable (if its empty) */
if(array[x-1] == 0)
{
array[x-1] = 1;
available_sites += -1;
}
}
/* Calculate coverage %, and add to total */
c = count/N
total_c += c;
}
For the actual project I'm doing, it involves not just dimers but trimers, quadrimers, and all sorts of shapes and sizes (for 2D and 3D).
I was hoping that I would be able to work with individual bits instead of bytes, but I've been reading around and as far as I can tell you can only change 1 byte at a time, so either I need to do some complicated indexing or there is a simpler way to do it?
Thanks for your answers
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
如果我还不算太晚,此页面提供了精彩的解释和示例。
int
数组可用于处理bits
数组。假设int
的大小为4 字节
,当我们谈论int
时,我们正在处理32 位
。假设我们有int A[10]
,意味着我们正在处理10*4*8 = 320 位
,下图显示了它:(数组的每个元素有 4 个大的块,每个块代表一个字节
,每个较小的块代表一个位
)因此,要设置数组
A
中的第k
位:或在缩短版本中
类似于清除
kth 位:
并测试第 k 位:
如上所述,这些操作也可以编写为宏:
If I am not too late, this page gives awesome explanation with examples.
An array of
int
can be used to deal with array ofbits
. Assuming size ofint
to be4 bytes
, when we talk about anint
, we are dealing with32 bits
. Say we haveint A[10]
, means we are working on10*4*8 = 320 bits
and following figure shows it: (each element of array has 4 big blocks, each of which represent abyte
and each of the smaller blocks represent abit
)So, to set the
k
th bit in arrayA
:or in the shortened version
similarly to clear
k
th bit:and to test if the
k
th bit:As said above, these manipulations can be written as macros too:
现在,bfield_t 中的每个 long 可以保存 sizeof(long)*8 位。
您可以通过以下方式计算所需的大值的索引:
以及您的位数 然后
您可以查找所需的长值,然后从中屏蔽掉您需要的位。
或者
第一个在某些CPU上可能会更快,或者可以节省您向后移动所需的时间
在多个位数组中的相同位之间执行操作。它还镜像
现场的设置和清除比第二个实施更紧密。
set:
clear:
您应该记住,您可以对保存字段的长数使用按位运算
这与对各个位的操作相同。
您可能还想研究 ffs、fls、ffc 和 flc 函数(如果有)。 ffs 应始终在
strings.h
中可用。它的存在就是为了这个目的——一串位。无论如何,它是找到第一组,本质上是:
这是处理器的常见操作,需要有一条指令,并且您的编译器可能会生成该指令,而不是调用像我编写的那样的函数。顺便说一下,x86 有一个关于此的指令。哦,除了 take long 和 long long 之外,ffsl 和 ffsll 是相同的函数。
Now, each long in a bfield_t can hold sizeof(long)*8 bits.
You can calculate the index of a needed big by:
and your bit number by
You can then look up the long you need and then mask out the bit you need from it.
or
The first one may be faster on some cpus or may save you shifting back up of you need
to perform operations between the same bit in multiple bit arrays. It also mirrors
the setting and clearing of a bit in the field more closely than the second implemention.
set:
clear:
You should remember that you can use bitwise operations on the longs that hold the fields
and that's the same as the operations on the individual bits.
You'll probably also want to look into the ffs, fls, ffc, and flc functions if available. ffs should always be avaiable in
strings.h
. It's there just for this purpose -- a string of bits.Anyway, it is find first set and essentially:
This is a common operation for processors to have an instruction for and your compiler will probably generate that instruction rather than calling a function like the one I wrote. x86 has an instruction for this, by the way. Oh, and ffsl and ffsll are the same function except take long and long long, respectively.
您可以使用 & (按位与)和 << (左移)。
例如,(1<<3)产生二进制的“00001000”。因此,您的代码可能如下所示:
然后只需使用字符数组对其进行扩展,并找出要首先修改的适当字节。
为了提高效率,您可以提前定义一个位域列表并将它们放入一个数组中:
然后您可以避免位移位的开销,并且可以索引您的位,将前面的代码转换为:
或者,如果您可以使用 C++,您可以只使用
std::vector
,它在内部定义为位向量,并具有直接索引。You can use & (bitwise and) and << (left shift).
For example, (1 << 3) results in "00001000" in binary. So your code could look like:
Then just scale it up with an array of chars and figure out the appropriate byte to modify first.
For more efficiency, you could define a list of bitfields in advance and put them in an array:
Then you avoid the overhead of the bit shifting and you can index your bits, turning the previous code into:
Alternatively, if you can use C++, you could just use an
std::vector<bool>
which is internally defined as a vector of bits, complete with direct indexing.bitarray.h:
使用:
编辑文档:
“dword”=“双字”= 32位值(无符号,但这并不重要)
getbit (array,i)
获取包含位 i 的双字,并将双字右移,使位 i 成为最低有效位。然后,按位与 1 清除所有其他位。putbit(array, i, v)
首先检查 v 的最低有效位;如果它是 0,我们必须清除该位,如果它是 1,我们必须设置它。为了设置该位,我们对包含该位的双字进行按位或操作,并将 1 的值左移 by bit_index_in_dword:该位被设置,其他位也被设置不改变。
为了清除该位,我们对包含该位的双字进行按位与操作,并按 bit_index_in_dword 对 1 左移进行按位补码: value 将所有位设置为 1,除了我们要清除的位置中唯一的零位。
该宏以
, 0
结尾,否则它将返回存储位 i 的双字值,并且该值没有意义。还可以使用((void)0)
。bitarray.h:
Use:
EDIT the docs:
"dword" = "double word" = 32-bit value (unsigned, but that's not really important)
getbit(array,i)
fetches the dword containing the bit i and shifts the dword right, so that the bit i becomes the least significant bit. Then, a bitwise and with 1 clears all other bits.putbit(array, i, v)
first of all checks the least significant bit of v; if it is 0, we have to clear the bit, and if it is 1, we have to set it.To set the bit, we do a bitwise or of the dword that contains the bit and the value of 1 shifted left by bit_index_in_dword: that bit is set, and other bits do not change.
To clear the bit, we do a bitwise and of the dword that contains the bit and the bitwise complement of 1 shifted left by bit_index_in_dword: that value has all bits set to one except the only zero bit in the position that we want to clear.
The macro ends with
, 0
because otherwise it would return the value of dword where the bit i is stored, and that value is not meaningful. One could also use((void)0)
.这是一个权衡:
(1) 每个 2 位值使用 1 个字节 - 简单、快速,但使用 4 倍内存
(2) 将位打包为字节 - 更复杂,一些性能开销,使用最小内存
如果您有足够的可用内存然后选择(1),否则考虑(2)。
It's a trade-off:
(1) use 1 byte for each 2 bit value - simple, fast, but uses 4x memory
(2) pack bits into bytes - more complex, some performance overhead, uses minimum memory
If you have enough memory available then go for (1), otherwise consider (2).