Using LINQ to search a byte array for all subarrays that begin/end with a certain byte

Posted on 2024-10-10 12:40:52

I'm working on a COM port application in which we use a defined variable-length packet structure to talk to a microcontroller. The packet has delimiters for the start and stop bytes. The trouble is that the read buffer can sometimes contain extraneous characters. It seems I always get the whole packet, just with some extra chatter before or after the actual data. So I keep a buffer and append to it whenever new data is received from the COM port. What is the best way to search this buffer for any occurrences of my packet? For example:

Say my packet delimiter is 0xFF and I have an array like this:

{ 0x00, 0xFF, 0x02, 0xDA, 0xFF, 0x55, 0xFF, 0x04 }

How can I create a function or LINQ statement that returns all subarrays that start and end with the delimiter (almost like a sliding correlator with wildcards)?

The sample would return the following 3 arrays:

{0xFF, 0x02, 0xDA, 0xFF}, {0xFF, 0x55, 0xFF}, and
{0xFF, 0x02, 0xDA, 0xFF, 0x55, 0xFF}

Comments (6)

百合的盛世恋 2024-10-17 12:40:52

While Trystan's answer is technically correct, it makes many copies of the original array at once. If the starting array is large and contains many delimiters, memory use grows quickly. This approach avoids that by using only the original array plus one array for the segment currently being processed.

// Extension methods must be declared inside a static class.
public static List<ArraySegment<byte>> GetSubArrays(this byte[] array, byte delimeter)
{
    if (array == null) throw new ArgumentNullException("array");

    List<ArraySegment<byte>> retval = new List<ArraySegment<byte>>();

    for (int i = 0; i < array.Length; i++)
    {
        // i marks a candidate opening delimiter
        if (array[i] == delimeter)
        {
            for (int j = i + 1; j < array.Length; j++)
            {
                // j marks a closing delimiter
                if (array[j] == delimeter)
                {
                    // A view over the bytes strictly between the two delimiters; nothing is copied.
                    retval.Add(new ArraySegment<byte>(array, i + 1, j - i - 1));
                }
            }
        }
    }

    return retval;
}

It can be used like this:

static void Main(string[] args)
{
    byte[] arr = new byte[] { 0x00, 0xFF, 0x02, 0xDA, 0xFF, 0x55, 0xFF, 0x04 };
    List<ArraySegment<byte>> retval = GetSubArrays(arr, 0xFF);

    // this also works (looks like LINQ):
    //List<ArraySegment<byte>> retval = arr.GetSubArrays(0xFF);

    // One scratch buffer, sized for the longest segment, is reused for every segment.
    // DefaultIfEmpty keeps Max() from throwing when no segment was found.
    byte[] buffer = new byte[retval.Select(x => x.Count).DefaultIfEmpty(0).Max()];
    foreach (var x in retval)
    {
        Buffer.BlockCopy(x.Array, x.Offset, buffer, 0, x.Count);
        Console.WriteLine(String.Join(", ", buffer.Take(x.Count).Select(b => b.ToString("X2")).ToArray()));
    }

    Console.ReadLine();
}
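
The segments produced by this answer exclude the delimiters, whereas the question asks for subarrays that start and end with them. A minimal sketch of a variant that keeps the delimiters and yields segments lazily might look like this. The names PacketSearch, EnumerateDelimited and Iterate are hypothetical and not part of the answer:

```csharp
using System;
using System.Collections.Generic;

public static class PacketSearch
{
    // Hypothetical helper: validates eagerly, then yields segments lazily.
    public static IEnumerable<ArraySegment<byte>> EnumerateDelimited(byte[] array, byte delimiter)
    {
        if (array == null) throw new ArgumentNullException(nameof(array));
        return Iterate(array, delimiter);
    }

    private static IEnumerable<ArraySegment<byte>> Iterate(byte[] array, byte delimiter)
    {
        for (int i = 0; i < array.Length; i++)
        {
            if (array[i] != delimiter) continue;          // i must be an opening delimiter
            for (int j = i + 1; j < array.Length; j++)
            {
                if (array[j] == delimiter)
                {
                    // Offset i and length j - i + 1 include both delimiters.
                    yield return new ArraySegment<byte>(array, i, j - i + 1);
                }
            }
        }
    }
}
```

Because the method is an iterator, a caller can stop after the first plausible packet without the remaining pairs ever being computed.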
寂寞笑我太脆弱 2024-10-17 12:40:52

Here's how you can do this using LINQ:

int[] list = new int[] { 0x00, 0xFF, 0x02, 0xDA, 0xFF, 0x55, 0xFF, 0x04 };
int MAXLENGTH = 10;

var windows = list.Select((element, i) => list.Skip(i).Take(MAXLENGTH));
var matched = windows.Where(w => w.First() == 0xFF);
var allcombinations = matched.SelectMany(m => Enumerable.Range(1, m.Count())
          .Select(i => m.Take(i)).Where(x => x.Count() > 2 && x.Last() == 0xFF));

Or using indexes:

int length = list.Length;
var indexes = Enumerable.Range(0, length)
              // Range takes (start, count): candidate lengths run from 3 up to the bytes remaining, capped at MAXLENGTH
              .SelectMany(i => Enumerable.Range(3, Math.Max(0, Math.Min(length - i, MAXLENGTH) - 2))
              .Select(count => new {i, count}));
var results = indexes.Select(index => list.Skip(index.i).Take(index.count))
              .Where(x => x.First() == 0xFF && x.Last() == 0xFF);
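
One detail worth keeping in mind with the index-based query: Enumerable.Range takes a start value and a count, not a start and an end. A small sketch of generating window lengths from 3 up to a given maximum, using the hypothetical names RangeDemo and WindowLengths:

```csharp
using System;
using System.Linq;

public static class RangeDemo
{
    // Hypothetical helper: lengths 3..maxLength inclusive, empty when maxLength < 3.
    public static int[] WindowLengths(int maxLength)
    {
        // The second argument is how many values to produce, so subtract 2;
        // Math.Max guards against a negative count, which Range rejects.
        return Enumerable.Range(3, Math.Max(0, maxLength - 2)).ToArray();
    }
}
```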
红尘作伴 2024-10-17 12:40:52

If you really want to use LINQ, this should work quite fast (even if not as fast as a good-old for loop):

public static IEnumerable<T[]> GetPackets<T>(this IList<T> buffer, T delimiter)
{
    // gets the delimiters' indexes (the default comparer also copes with null elements)
    var comparer = EqualityComparer<T>.Default;
    var delimiterIdxs = Enumerable.Range(0, buffer.Count)
                                  .Where(i => comparer.Equals(buffer[i], delimiter))
                                  .ToArray();

    // creates every (startIdx, endIdx) pair of delimiter indexes with startIdx < endIdx
    var dlmtrIndexesPairs = delimiterIdxs.Take(delimiterIdxs.Length - 1)
                                         .SelectMany(
                                                     (startIdx, idx) =>
                                                     delimiterIdxs.Skip(idx + 1)
                                                                  .Select(endIdx => new { startIdx, endIdx })
                                                    );
    // copies each packet, delimiters included, into its own array
    var packets = dlmtrIndexesPairs.Select(p => buffer.Skip(p.startIdx)
                                                      .Take(p.endIdx - p.startIdx + 1)
                                                      .ToArray())
                                   .ToArray();

    return packets;
}
夜访吸血鬼 2024-10-17 12:40:52

I wouldn't try to do this with LINQ, so here's a regular method that returns the output you asked for.

public List<byte[]> GetSubArrays(byte[] array, byte delimeter)
{
  if (array == null) throw new ArgumentNullException("array");

  List<byte[]> subArrays = new List<byte[]>();

  for (int i = 0; i < array.Length; i++)
  {
    if (array[i] == delimeter && i != array.Length - 1)
    {
      List<byte> subList = new List<byte>() { delimeter };

      for (int j = i+1; j < array.Length; j++)
      {
        subList.Add(array[j]);
        if (array[j] == delimeter)
        {
          subArrays.Add(subList.ToArray());
        }
      }
    }
  }

  return subArrays;
}

If it must be an in-place lambda expression, then just change the first line to (byte[] array, byte delimeter) => (without the method modifiers and name) and call it that way.
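
As a rough sketch of that lambda form, the same loop can be stored in a Func delegate. The containing class LambdaExample and the field name are assumptions made for illustration; the body follows the method above:

```csharp
using System;
using System.Collections.Generic;

public static class LambdaExample
{
    // Same logic as the regular method, held in a delegate instead.
    public static readonly Func<byte[], byte, List<byte[]>> GetSubArrays = (array, delimeter) =>
    {
        if (array == null) throw new ArgumentNullException("array");

        var subArrays = new List<byte[]>();
        for (int i = 0; i < array.Length; i++)
        {
            if (array[i] != delimeter || i == array.Length - 1) continue;

            // Start a candidate packet at this delimiter and grow it byte by byte.
            var subList = new List<byte> { delimeter };
            for (int j = i + 1; j < array.Length; j++)
            {
                subList.Add(array[j]);
                if (array[j] == delimeter)
                {
                    subArrays.Add(subList.ToArray());   // snapshot, delimiters included
                }
            }
        }
        return subArrays;
    };
}
```

It is then called like any method, for example LambdaExample.GetSubArrays(arr, 0xFF).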

压抑⊿情绪 2024-10-17 12:40:52

Although the delimiter structure seems a bit vague, I would not use LINQ; I would do something like the code below (not extensively tested). It returns all subarrays of bytes enclosed by the delimiter, without the delimiter itself (it's a given anyway, so why include it?). It also does not return the unions of adjacent results, but those can always be assembled manually.

public IEnumerable<byte[]> GetArrays(byte[] data, byte delimiter)
{
    List<byte[]> arrays = new List<byte[]>();
    int start = 0;
    while (start >= 0 && (start = Array.IndexOf<byte>(data, delimiter, start)) >= 0)
    {
        start++;
        if (start >= data.Length - 1)
        {
            break;
        }

        int end = Array.IndexOf<byte>(data, delimiter, start);
        if (end < 0)
        {
            break;
        }

        byte[] sub = new byte[end - start];
        Array.Copy(data, start, sub, 0, end - start);
        arrays.Add(sub);
        start = end;
    }

    return arrays;
}
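
Assembling the unions by hand could look roughly like the sketch below. It assumes the pieces are the consecutive delimiter-free payloads returned by GetArrays, in order, so that joining neighbours with the delimiter reproduces the original bytes. PacketUnion and JoinRuns are hypothetical names:

```csharp
using System;
using System.Collections.Generic;

public static class PacketUnion
{
    // Hypothetical helper: for every run pieces[i..j], returns the payloads
    // joined by the delimiter that originally separated them.
    public static List<byte[]> JoinRuns(IList<byte[]> pieces, byte delimiter)
    {
        if (pieces == null) throw new ArgumentNullException(nameof(pieces));

        var result = new List<byte[]>();
        for (int i = 0; i < pieces.Count; i++)
        {
            var run = new List<byte>();
            for (int j = i; j < pieces.Count; j++)
            {
                if (j > i) run.Add(delimiter);   // restore the delimiter between neighbours
                run.AddRange(pieces[j]);
                result.Add(run.ToArray());       // snapshot of pieces[i..j]
            }
        }
        return result;
    }
}
```

For the sample data, the pieces {0x02, 0xDA} and {0x55} give three results, one of which is the combined run {0x02, 0xDA, 0xFF, 0x55}.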
つ低調成傷 2024-10-17 12:40:52

You could do this with LINQ's Aggregate, but it is much less straightforward than the other solutions suggested here. I also had to add a special case to extend the already completed arrays, as you suggested above.

byte[] myArray = new byte[] { 0x00, 0xFF, 0x02, 0xDA, 0xFF, 0x55, 0xFF, 0x04 };
var arrayList = myArray.Aggregate(
                new { completedLists = new List<List<byte>>(),
                      // lists that end at the most recent delimiter; only these may be extended,
                      // otherwise runs with four or more delimiters would be joined non-contiguously
                      endingAtLast = new List<List<byte>>(),
                      activeList = new List<byte>() },
                (seed, s) =>
                {
                    if (s == 0xFF)
                    {
                        seed.activeList.Add(s);
                        if (seed.activeList.Count > 1)
                        {
                            // extend every list ending at the previous delimiter with the new segment
                            var extended = seed.endingAtLast
                                .Select(l => l.Concat(seed.activeList.Skip(1)).ToList())
                                .ToList();
                            // the new segment on its own is also a result
                            extended.Add(new List<byte>(seed.activeList));
                            seed.completedLists.AddRange(extended);
                            seed.endingAtLast.Clear();
                            seed.endingAtLast.AddRange(extended);
                            // the closing delimiter opens the next segment
                            seed.activeList.Clear();
                            seed.activeList.Add(s);
                        }
                    }
                    else if (seed.activeList.Count > 0)
                    {
                        // bytes before the first delimiter are ignored
                        seed.activeList.Add(s);
                    }
                    return seed;
                }).completedLists;