如何在 C 中定义和使用位数组?

发布于 2024-08-26 08:00:12 字数 1636 浏览 11 评论 0原文

我想创建一个非常大的数组,在上面写入“0”和“1”。我正在尝试模拟一种称为随机顺序吸附的物理过程,其中长度为 2 的二聚体单位沉积在 n 维晶格的随机位置上,且彼此不重叠。当晶格上没有足够的空间来沉积更多二聚体时(晶格被堵塞),该过程就会停止。

最初,我从零格开始,二聚体由一对“1”表示。当每个二聚体沉积时,由于二聚体不能重叠,二聚体左侧的位点被封闭。所以我通过在晶格上沉积三个“1”来模拟这个过程。我需要多次重复整个模拟,然后计算出平均覆盖率%。

我已经使用一维和二维晶格的字符数组完成了此操作。目前,在处理 3D 问题和更复杂的概括之前,我正在尝试使代码尽可能高效。

这基本上就是代码在 1D 中的样子,经过简化:

int main()
{
    /* Define lattice */
    array = (char*)malloc(N * sizeof(char));

    total_c = 0;

    /* Carry out RSA multiple times */
    for (i = 0; i < 1000; i++)
        rand_seq_ads();

    /* Calculate average coverage efficiency at jamming */
    printf("coverage efficiency = %lf", total_c/1000);

    return 0;
}

void rand_seq_ads()
{
    /* Initialise array, initial conditions */
    memset(a, 0, N * sizeof(char));
    available_sites = N;
    count = 0;

    /* While the lattice still has enough room... */
    while(available_sites != 0)
    {
        /* Generate random site location */
        x = rand();

        /* Deposit dimer (if site is available) */
        if(array[x] == 0)
        {
            array[x] = 1;
            array[x+1] = 1;
            count += 1;
            available_sites += -2;
        }

        /* Mark site left of dimer as unavailable (if its empty) */
        if(array[x-1] == 0)
        {
            array[x-1] = 1;
            available_sites += -1;
        }
    }

    /* Calculate coverage %, and add to total */
    c = count/N
    total_c += c;
}

对于我正在做的实际项目,它不仅涉及二聚体,还涉及三聚体、四聚体以及各种形状和大小(对于 2D 和 3D)。

我希望我能够使用单个位而不是字节,但我一直在阅读,据我所知,一次只能更改 1 个字节,所以我需要做一些复杂的索引或者有更简单的方法吗?

感谢您的回答

I want to create a very large array on which I write '0's and '1's. I'm trying to simulate a physical process called random sequential adsorption, where units of length 2, dimers, are deposited onto an n-dimensional lattice at a random location, without overlapping each other. The process stops when there is no more room left on the lattice for depositing more dimers (lattice is jammed).

Initially I start with a lattice of zeroes, and the dimers are represented by a pair of '1's. As each dimer is deposited, the site on the left of the dimer is blocked, due to the fact that the dimers cannot overlap. So I simulate this process by depositing a triple of '1's on the lattice. I need to repeat the entire simulation a large number of times and then work out the average coverage %.

I've already done this using an array of chars for 1D and 2D lattices. At the moment I'm trying to make the code as efficient as possible, before working on the 3D problem and more complicated generalisations.

This is basically what the code looks like in 1D, simplified:

int main()
{
    /* Define lattice */
    array = (char*)malloc(N * sizeof(char));

    total_c = 0;

    /* Carry out RSA multiple times */
    for (i = 0; i < 1000; i++)
        rand_seq_ads();

    /* Calculate average coverage efficiency at jamming */
    printf("coverage efficiency = %lf", total_c/1000);

    return 0;
}

void rand_seq_ads()
{
    /* Initialise array, initial conditions */
    memset(a, 0, N * sizeof(char));
    available_sites = N;
    count = 0;

    /* While the lattice still has enough room... */
    while(available_sites != 0)
    {
        /* Generate random site location */
        x = rand();

        /* Deposit dimer (if site is available) */
        if(array[x] == 0)
        {
            array[x] = 1;
            array[x+1] = 1;
            count += 1;
            available_sites += -2;
        }

        /* Mark site left of dimer as unavailable (if its empty) */
        if(array[x-1] == 0)
        {
            array[x-1] = 1;
            available_sites += -1;
        }
    }

    /* Calculate coverage %, and add to total */
    c = count/N
    total_c += c;
}

For the actual project I'm doing, it involves not just dimers but trimers, quadrimers, and all sorts of shapes and sizes (for 2D and 3D).

I was hoping that I would be able to work with individual bits instead of bytes, but I've been reading around and as far as I can tell you can only change 1 byte at a time, so either I need to do some complicated indexing or there is a simpler way to do it?

Thanks for your answers

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

箹锭⒈辈孓 2024-09-02 08:00:12

如果我还不算太晚,页面提供了精彩的解释和示例。

int 数组可用于处理bits 数组。假设 int 的大小为 4 字节,当我们谈论 int 时,我们正在处理 32 位 。假设我们有 int A[10],意味着我们正在处理 10*4*8 = 320 位,下图显示了它:(数组的每个元素有 4 个大的块,每个块代表一个字节,每个较小的块代表一个

在此处输入图像描述

因此,要设置数组 A 中的第 k 位:

// NOTE: if using "uint8_t A[]" instead of "int A[]" then divide by 8, not 32
void  SetBit( int A[],  int k )
{
    int i = k/32;        //gives the corresponding index in the array A
    int pos = k%32;      //gives the corresponding bit position in A[i]

    unsigned int flag = 1;   // flag = 0000.....00001

    flag = flag << pos;      // flag = 0000...010...000   (shifted k positions)

    A[i] = A[i] | flag;      // Set the bit at the k-th position in A[i]
}

或在缩短版本中

void  SetBit( int A[],  int k )
{
    A[k/32] |= 1 << (k%32);  // Set the bit at the k-th position in A[i]
}

类似于清除 kth 位:

void  ClearBit( int A[],  int k )                
{
    A[k/32] &= ~(1 << (k%32));
}

并测试第 k 位:

int TestBit( int A[],  int k )
{
    return ( (A[k/32] & (1 << (k%32) )) != 0 ) ;     
}

如上所述,这些操作也可以编写为宏:

// Due order of operation wrap 'k' in parentheses in case it
// is passed as an equation, e.g. i + 1, otherwise the first
// part evaluates to "A[i + (1/32)]" not "A[(i + 1)/32]"
#define SetBit(A,k)     ( A[(k)/32] |= (1 << ((k)%32)) )
#define ClearBit(A,k)   ( A[(k)/32] &= ~(1 << ((k)%32)) )
#define TestBit(A,k)    ( A[(k)/32] & (1 << ((k)%32)) )

If I am not too late, this page gives awesome explanation with examples.

An array of int can be used to deal with array of bits. Assuming size of int to be 4 bytes, when we talk about an int, we are dealing with 32 bits. Say we have int A[10], means we are working on 10*4*8 = 320 bits and following figure shows it: (each element of array has 4 big blocks, each of which represent a byte and each of the smaller blocks represent a bit)

enter image description here

So, to set the kth bit in array A:

// NOTE: if using "uint8_t A[]" instead of "int A[]" then divide by 8, not 32
void  SetBit( int A[],  int k )
{
    int i = k/32;        //gives the corresponding index in the array A
    int pos = k%32;      //gives the corresponding bit position in A[i]

    unsigned int flag = 1;   // flag = 0000.....00001

    flag = flag << pos;      // flag = 0000...010...000   (shifted k positions)

    A[i] = A[i] | flag;      // Set the bit at the k-th position in A[i]
}

or in the shortened version

void  SetBit( int A[],  int k )
{
    A[k/32] |= 1 << (k%32);  // Set the bit at the k-th position in A[i]
}

similarly to clear kth bit:

void  ClearBit( int A[],  int k )                
{
    A[k/32] &= ~(1 << (k%32));
}

and to test if the kth bit:

int TestBit( int A[],  int k )
{
    return ( (A[k/32] & (1 << (k%32) )) != 0 ) ;     
}

As said above, these manipulations can be written as macros too:

// Due order of operation wrap 'k' in parentheses in case it
// is passed as an equation, e.g. i + 1, otherwise the first
// part evaluates to "A[i + (1/32)]" not "A[(i + 1)/32]"
#define SetBit(A,k)     ( A[(k)/32] |= (1 << ((k)%32)) )
#define ClearBit(A,k)   ( A[(k)/32] &= ~(1 << ((k)%32)) )
#define TestBit(A,k)    ( A[(k)/32] & (1 << ((k)%32)) )
飘逸的'云 2024-09-02 08:00:12
typedef unsigned long bfield_t[ size_needed/sizeof(long) ];
// long because that's probably what your cpu is best at
// The size_needed should be evenly divisable by sizeof(long) or
// you could (sizeof(long)-1+size_needed)/sizeof(long) to force it to round up

现在,bfield_t 中的每个 long 可以保存 sizeof(long)*8 位。

您可以通过以下方式计算所需的大值的索引:

bindex = index / (8 * sizeof(long) );

以及您的位数 然后

b = index % (8 * sizeof(long) );

您可以查找所需的长值,然后从中屏蔽掉您需要的位。

result = my_field[bindex] & (1<<b);

或者

result = 1 & (my_field[bindex]>>b); // if you prefer them to be in bit0

第一个在某些CPU上可能会更快,或者可以节省您向后移动所需的时间
在多个位数组中的相同位之间执行操作。它还镜像
现场的设置和清除比第二个实施更紧密。
set:

my_field[bindex] |= 1<<b;

clear:

my_field[bindex] &= ~(1<<b);

您应该记住,您可以对保存字段的长数使用按位运算
这与对各个位的操作相同。

您可能还想研究 ffs、fls、ffc 和 flc 函数(如果有)。 ffs 应始终在 strings.h 中可用。它的存在就是为了这个目的——一串位。
无论如何,它是找到第一组,本质上是:

int ffs(int x) {
    int c = 0;
    while (!(x&1) ) {
        c++;
        x>>=1;
    }
    return c; // except that it handles x = 0 differently
}

这是处理器的常见操作,需要有一条指令,并且您的编译器可能会生成该指令,而不是调用像我编写的那样的函数。顺便说一下,x86 有一个关于此的指令。哦,除了 take long 和 long long 之外,ffsl 和 ffsll 是相同的函数。

typedef unsigned long bfield_t[ size_needed/sizeof(long) ];
// long because that's probably what your cpu is best at
// The size_needed should be evenly divisable by sizeof(long) or
// you could (sizeof(long)-1+size_needed)/sizeof(long) to force it to round up

Now, each long in a bfield_t can hold sizeof(long)*8 bits.

You can calculate the index of a needed big by:

bindex = index / (8 * sizeof(long) );

and your bit number by

b = index % (8 * sizeof(long) );

You can then look up the long you need and then mask out the bit you need from it.

result = my_field[bindex] & (1<<b);

or

result = 1 & (my_field[bindex]>>b); // if you prefer them to be in bit0

The first one may be faster on some cpus or may save you shifting back up of you need
to perform operations between the same bit in multiple bit arrays. It also mirrors
the setting and clearing of a bit in the field more closely than the second implemention.
set:

my_field[bindex] |= 1<<b;

clear:

my_field[bindex] &= ~(1<<b);

You should remember that you can use bitwise operations on the longs that hold the fields
and that's the same as the operations on the individual bits.

You'll probably also want to look into the ffs, fls, ffc, and flc functions if available. ffs should always be avaiable in strings.h. It's there just for this purpose -- a string of bits.
Anyway, it is find first set and essentially:

int ffs(int x) {
    int c = 0;
    while (!(x&1) ) {
        c++;
        x>>=1;
    }
    return c; // except that it handles x = 0 differently
}

This is a common operation for processors to have an instruction for and your compiler will probably generate that instruction rather than calling a function like the one I wrote. x86 has an instruction for this, by the way. Oh, and ffsl and ffsll are the same function except take long and long long, respectively.

空‖城人不在 2024-09-02 08:00:12

您可以使用 & (按位与)和 << (左移)。

例如,(1<<3)产生二进制的“00001000”。因此,您的代码可能如下所示:

char eightBits = 0;

//Set the 5th and 6th bits from the right to 1
eightBits &= (1 << 4);
eightBits &= (1 << 5);
//eightBits now looks like "00110000". 

然后只需使用字符数组对其进行扩展,并找出要首先修改的适当字节。

为了提高效率,您可以提前定义一个位域列表并将它们放入一个数组中:

#define BIT8 0x01
#define BIT7 0x02
#define BIT6 0x04
#define BIT5 0x08
#define BIT4 0x10
#define BIT3 0x20
#define BIT2 0x40
#define BIT1 0x80

char bits[8] = {BIT1, BIT2, BIT3, BIT4, BIT5, BIT6, BIT7, BIT8};

然后您可以避免位移位的开销,并且可以索引您的位,将前面的代码转换为:

eightBits &= (bits[3] & bits[4]);

或者,如果您可以使用 C++,您可以只使用 std::vector ,它在内部定义为位向量,并具有直接索引。

You can use & (bitwise and) and << (left shift).

For example, (1 << 3) results in "00001000" in binary. So your code could look like:

char eightBits = 0;

//Set the 5th and 6th bits from the right to 1
eightBits &= (1 << 4);
eightBits &= (1 << 5);
//eightBits now looks like "00110000". 

Then just scale it up with an array of chars and figure out the appropriate byte to modify first.

For more efficiency, you could define a list of bitfields in advance and put them in an array:

#define BIT8 0x01
#define BIT7 0x02
#define BIT6 0x04
#define BIT5 0x08
#define BIT4 0x10
#define BIT3 0x20
#define BIT2 0x40
#define BIT1 0x80

char bits[8] = {BIT1, BIT2, BIT3, BIT4, BIT5, BIT6, BIT7, BIT8};

Then you avoid the overhead of the bit shifting and you can index your bits, turning the previous code into:

eightBits &= (bits[3] & bits[4]);

Alternatively, if you can use C++, you could just use an std::vector<bool> which is internally defined as a vector of bits, complete with direct indexing.

死开点丶别碍眼 2024-09-02 08:00:12

bitarray.h

#include <inttypes.h> // defines uint32_t

//typedef unsigned int bitarray_t; // if you know that int is 32 bits
typedef uint32_t bitarray_t;

#define RESERVE_BITS(n) (((n)+0x1f)>>5)
#define DW_INDEX(x) ((x)>>5)
#define BIT_INDEX(x) ((x)&0x1f)
#define getbit(array,index) (((array)[DW_INDEX(index)]>>BIT_INDEX(index))&1)
#define putbit(array, index, bit) \
    ((bit)&1 ?  ((array)[DW_INDEX(index)] |= 1<<BIT_INDEX(index)) \
             :  ((array)[DW_INDEX(index)] &= ~(1<<BIT_INDEX(index))) \
             , 0 \
    )

使用:

bitarray_t arr[RESERVE_BITS(130)] = {0, 0x12345678,0xabcdef0,0xffff0000,0};
int i = getbit(arr,5);
putbit(arr,6,1);
int x=2;            // the least significant bit is 0
putbit(arr,6,x);    // sets bit 6 to 0 because 2&1 is 0
putbit(arr,6,!!x);  // sets bit 6 to 1 because !!2 is 1

编辑文档:

“dword”=“双字”= 32位值(无符号,但这并不重要)

RESERVE_BITS: number_of_bits --> number_of_dwords
    RESERVE_BITS(n) is the number of 32-bit integers enough to store n bits
DW_INDEX: bit_index_in_array --> dword_index_in_array
    DW_INDEX(i) is the index of dword where the i-th bit is stored.
    Both bit and dword indexes start from 0.
BIT_INDEX: bit_index_in_array --> bit_index_in_dword
    If i is the number of some bit in the array, BIT_INDEX(i) is the number
    of that bit in the dword where the bit is stored.
    And the dword is known via DW_INDEX().
getbit: bit_array, bit_index_in_array --> bit_value
putbit: bit_array, bit_index_in_array, bit_value --> 0

getbit (array,i) 获取包含位 i 的双字,并将双字右移,使位 i 成为最低有效位。然后,按位与 1 清除所有其他位。

putbit(array, i, v) 首先检查 v 的最低有效位;如果它是 0,我们必须清除该位,如果它是 1,我们必须设置它。
为了设置该位,我们对包含该位的双字进行按位或操作,并将 1 的值左移 by bit_index_in_dword:该位被设置,其他位也被设置不改变。
为了清除该位,我们对包含该位的双字进行按位与操作,并按 bit_index_in_dword 对 1 左移进行按位补码: value 将所有位设置为 1,除了我们要清除的位置中唯一的零位。
该宏以 , 0 结尾,否则它将返回存储位 i 的双字值,并且该值没有意义。还可以使用((void)0)

bitarray.h:

#include <inttypes.h> // defines uint32_t

//typedef unsigned int bitarray_t; // if you know that int is 32 bits
typedef uint32_t bitarray_t;

#define RESERVE_BITS(n) (((n)+0x1f)>>5)
#define DW_INDEX(x) ((x)>>5)
#define BIT_INDEX(x) ((x)&0x1f)
#define getbit(array,index) (((array)[DW_INDEX(index)]>>BIT_INDEX(index))&1)
#define putbit(array, index, bit) \
    ((bit)&1 ?  ((array)[DW_INDEX(index)] |= 1<<BIT_INDEX(index)) \
             :  ((array)[DW_INDEX(index)] &= ~(1<<BIT_INDEX(index))) \
             , 0 \
    )

Use:

bitarray_t arr[RESERVE_BITS(130)] = {0, 0x12345678,0xabcdef0,0xffff0000,0};
int i = getbit(arr,5);
putbit(arr,6,1);
int x=2;            // the least significant bit is 0
putbit(arr,6,x);    // sets bit 6 to 0 because 2&1 is 0
putbit(arr,6,!!x);  // sets bit 6 to 1 because !!2 is 1

EDIT the docs:

"dword" = "double word" = 32-bit value (unsigned, but that's not really important)

RESERVE_BITS: number_of_bits --> number_of_dwords
    RESERVE_BITS(n) is the number of 32-bit integers enough to store n bits
DW_INDEX: bit_index_in_array --> dword_index_in_array
    DW_INDEX(i) is the index of dword where the i-th bit is stored.
    Both bit and dword indexes start from 0.
BIT_INDEX: bit_index_in_array --> bit_index_in_dword
    If i is the number of some bit in the array, BIT_INDEX(i) is the number
    of that bit in the dword where the bit is stored.
    And the dword is known via DW_INDEX().
getbit: bit_array, bit_index_in_array --> bit_value
putbit: bit_array, bit_index_in_array, bit_value --> 0

getbit(array,i) fetches the dword containing the bit i and shifts the dword right, so that the bit i becomes the least significant bit. Then, a bitwise and with 1 clears all other bits.

putbit(array, i, v) first of all checks the least significant bit of v; if it is 0, we have to clear the bit, and if it is 1, we have to set it.
To set the bit, we do a bitwise or of the dword that contains the bit and the value of 1 shifted left by bit_index_in_dword: that bit is set, and other bits do not change.
To clear the bit, we do a bitwise and of the dword that contains the bit and the bitwise complement of 1 shifted left by bit_index_in_dword: that value has all bits set to one except the only zero bit in the position that we want to clear.
The macro ends with , 0 because otherwise it would return the value of dword where the bit i is stored, and that value is not meaningful. One could also use ((void)0).

最美的太阳 2024-09-02 08:00:12

这是一个权衡:

(1) 每个 2 位值使用 1 个字节 - 简单、快速,但使用 4 倍内存

(2) 将位打包为字节 - 更复杂,一些性能开销,使用最小内存

如果您有足够的可用内存然后选择(1),否则考虑(2)。

It's a trade-off:

(1) use 1 byte for each 2 bit value - simple, fast, but uses 4x memory

(2) pack bits into bytes - more complex, some performance overhead, uses minimum memory

If you have enough memory available then go for (1), otherwise consider (2).

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文