寻找特定的一对“10”或“01”在字符数组中

发布于 2024-12-11 04:33:50 字数 1579 浏览 0 评论 0原文

这可能是一个有点理论性的问题。我有一个包含网络数据包的字节字符数组。我想检查每 66 位是否出现一对特定的位(“01”或“10”)。也就是说,一旦找到第一对位,我就必须跳过 66 位并再次检查同一对位是否存在。我正在尝试实现一个带有掩码和轮班的程序,但它有点变得复杂。我想知道是否有人可以提出更好的方法来完成同样的事情。

到目前为止我写的代码看起来像这样。但它并不完整。

test_sync_bits(char *rec, int len)
{
        uint8_t target_byte = 0;
    int offset = 0;
    int save_offset = 0;

    uint8_t *pload = (uint8_t*)(rec + 24);
    uint8_t seed_mask = 0xc0;
    uint8_t seed_shift = 6;
    uint8_t value = 0;
    uint8_t found_sync = 0;
    const uint8_t sync_bit_spacing = 66;

    /*hunt for the first '10' or '01' combination.*/
    target_byte = *(uint8_t*)(pload + offset);
    /*Get all combinations of two bits from target byte.*/
    while(seed_shift)
    {
        value = ((target_byte & seed_mask) >> seed_shift);
        if((value == 0x01) || (value == 0x10))
        {
          save_offset = offset;
          found_sync = 1;
          break;
        }
        else
        {
         seed_mask = (seed_mask >> 2) ;
         seed_shift-=2;
        }  
    }
    offset = offset + 8;
    seed_shift = (seed_shift - 4) > 0 ? (seed_shift - 4) : (seed_shift + 8 - 4);
    seed_mask = (seed_mask >> (6 - seed_shift));
}

我提出的另一个想法是使用下面定义的结构

typedef struct
{
    int remainder_bits;
    int extra_bits;
    int extra_byte;
}remainder_bits_extra_bits_map_t;

static remainder_bits_extra_bits_map_t sync_bit_check [] =
{
    {6, 4, 0},
    {5, 5, 0},
    {4, 6, 0},
    {3, 7, 0},
    {2, 8, 0},
    {1, 1, 1},
    {0, 2, 1},
};

我的方法正确吗?有人可以提出任何改进吗?

This may be a slightly theoretical question. I have a char array of bytes containing network packets. I want to check for the occurrence of a particular pair of bits ('01' or '10')every 66 bits. That is to say once I locate the first pair of bits I have to skip 66 bits and check the presence of same pair of bits again. I am trying to implement a program with masks and shifts and it is kind of getting complicated. I want to know if someone can suggest a better way to do the same thing.

The code I have written so far looks something like this. It is not complete though.

test_sync_bits(char *rec, int len)
{
        uint8_t target_byte = 0;
    int offset = 0;
    int save_offset = 0;

    uint8_t *pload = (uint8_t*)(rec + 24);
    uint8_t seed_mask = 0xc0;
    uint8_t seed_shift = 6;
    uint8_t value = 0;
    uint8_t found_sync = 0;
    const uint8_t sync_bit_spacing = 66;

    /*hunt for the first '10' or '01' combination.*/
    target_byte = *(uint8_t*)(pload + offset);
    /*Get all combinations of two bits from target byte.*/
    while(seed_shift)
    {
        value = ((target_byte & seed_mask) >> seed_shift);
        if((value == 0x01) || (value == 0x10))
        {
          save_offset = offset;
          found_sync = 1;
          break;
        }
        else
        {
         seed_mask = (seed_mask >> 2) ;
         seed_shift-=2;
        }  
    }
    offset = offset + 8;
    seed_shift = (seed_shift - 4) > 0 ? (seed_shift - 4) : (seed_shift + 8 - 4);
    seed_mask = (seed_mask >> (6 - seed_shift));
}

Another idea I came up with was to use a structure defined below

typedef struct
{
    int remainder_bits;
    int extra_bits;
    int extra_byte;
}remainder_bits_extra_bits_map_t;

static remainder_bits_extra_bits_map_t sync_bit_check [] =
{
    {6, 4, 0},
    {5, 5, 0},
    {4, 6, 0},
    {3, 7, 0},
    {2, 8, 0},
    {1, 1, 1},
    {0, 2, 1},
};

Is my approach correct? Can anyone suggest any improvements for the same?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

我不是你的备胎 2024-12-18 04:33:50

查找表想法

只有 256 个可能的字节。这足够少,您可以构建一个一个字节中可能发生的所有可能位组合的查找表。

查找表值可以记录模式的位位置,并且它还可以具有标记可能的连续开始或连续结束值的特殊值。

编辑:

我认为连续值是愚蠢的。相反,要检查与字节重叠的模式,请从另一个字节移动该字节并在位中进行“或”操作,或者手动检查每个字节的结束位。也许 ((bytes[i] & 0x01) & (bytes[i+1] & 0x80)) == 0x80((bytes[i] & 0x01) & ; (bytes[i+1] & 0x80)) == 0x01 适合你。

您没有这么说,我还假设您正在寻找任何字节中的第一个匹配项。如果您正在查找每个匹配项,然后检查 +66 位的结束模式,那就是另一个问题了。

为了创建查找表,我会编写一个程序来帮我完成它。它可以是您最喜欢的脚本语言,也可以是 C 语言。该程序将编写一个类似于以下内容的文件:

/* each value is the bit position of a possible pattern OR'd with a pattern ID bit. */
/* 0 is no match */
#define P_01 0x00
#define P_10 0x10
const char byte_lookup[256] = {
    /*  0: 0000_0000, 0000_0001, 0000_0010, 0000_0011 */
                   0,    2|P_01,    3|P_01,    3|P_01,
    /*  4: 0000_0100, 0000_0101, 0000_0110, 0000_0111, */
              4|P_01,    4|P_01,    4|P_01,    4|P_01,
    /*  8: 0000_1000, 0000_1001, 0000_1010, 0000_1011, */
              5|P_01,    5|P_01,    5|P_01,    5|P_01,
};

Tedious。这就是为什么我会编写一个程序来为我编写它。

Lookup Table Idea

There are only 256 possible bytes. That is few enough that you can construct a lookup table of all the possible bit combinations that can happen in one byte.

The lookup table value could record the bit position of the pattern and it could also have special values that mark possible continuation start or continuation finish values.

Edit:

I decided that continuation values would be silly. Instead, to check for a pattern that overlaps a byte, shift the byte and OR in the bit from the other byte, or manually check the end bits at each byte. Maybe ((bytes[i] & 0x01) & (bytes[i+1] & 0x80)) == 0x80 and ((bytes[i] & 0x01) & (bytes[i+1] & 0x80)) == 0x01 would work for you.

You didn't say so I am also assuming that you are looking for the first match in any byte. If you are looking for every match, then checking for the end pattern at +66 bits, that's a different problem.

To create the lookup table, I would write a program to do it for me. It could be in your favorite script language or it could be in C. The program would write a file that looked something like:

/* each value is the bit position of a possible pattern OR'd with a pattern ID bit. */
/* 0 is no match */
#define P_01 0x00
#define P_10 0x10
const char byte_lookup[256] = {
    /*  0: 0000_0000, 0000_0001, 0000_0010, 0000_0011 */
                   0,    2|P_01,    3|P_01,    3|P_01,
    /*  4: 0000_0100, 0000_0101, 0000_0110, 0000_0111, */
              4|P_01,    4|P_01,    4|P_01,    4|P_01,
    /*  8: 0000_1000, 0000_1001, 0000_1010, 0000_1011, */
              5|P_01,    5|P_01,    5|P_01,    5|P_01,
};

Tedious. That's why I would write a program to write it for me.

一江春梦 2024-12-18 04:33:50

这是从流读取时经常出现的经典去块问题的变体。也就是说,数据以离散单位形式出现,与您希望扫描的单位大小不匹配。其中的挑战是 1) 缓冲(这不会影响您,因为您可以访问整个数组)和 2) 管理所有状态(正如您发现的)。一个好的方法是编写一个消费者函数,其行为类似于 fread() 和 fseek() 来维护自己的状态。它返回您感兴趣的请求数据,并与您提供的缓冲区正确对齐。

This is a variation of the classic de-blocking problem that often comes up when reading from a stream. That is, data comes in discrete units that don't match up to the unit size that you wish to scan. The challenges in this are 1) buffering (which doesn't affect you because you have access to the whole array) and 2) managing all of the state (as you found out). A good approach is to write a consumer function that acts something like fread() and fseek() which maintains its own state. It returns the requested data you're interested in, aligned properly to the buffers you give it.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文