How can I traverse a char array that stores {int, short, ushort, ...} more efficiently?

Posted 2024-12-16 14:34:18


I have a char data[len] populated from unzipped data that was read from a binary file.
I know that data can only hold one of these types: char, uchar, short, ushort, int, uint, float, double, and I know the exact number of bits needed to represent each (elesize = {8, 16, 32, 64}).

I just want to traverse the data and find, say, the max(), the min(), or the number of occurrences of a given value. I want to do this without creating another array, because memory is a concern.

I have come up with the following, but it is slow, for example for len == 34560000.

So I was wondering if anyone has a 'one-liner' or a more efficient way of doing this (either C or C++).

char data[len];
double mymax = -std::numeric_limits<double>::max();
for (size_t i=0; i<len; i += elesize)
{
    double x;
    char *r = data+i;
    if (elementtype == "char")
        x = static_cast<double>(*r);
    else if (elementtype == "uchar")
        x = static_cast<double>(*((unsigned char *)r));
    else if (elementtype == "short")
        x = static_cast<double>(*((int16_t *)r));
    else if (elementtype == "ushort")
        x = static_cast<double>(*((uint16_t *)r));
    else if (elementtype == "int")
        x = static_cast<double>(*((int32_t *)r));
    else if (elementtype == "uint")
        x = static_cast<double>(*((uint32_t *)r));
    else if (elementtype == "float")
        x = static_cast<double>(*((float *)r));
    else if (elementtype == "double")
        x = *((double *)r);
    if (x > mymax)
        mymax = x;
}


戴着白色围巾的女孩 2024-12-23 14:34:18

A template should do nicely:

#include <algorithm>

template <typename T>
T read_and_advance(const unsigned char * & p)
{
  T x;
  unsigned char * const px = reinterpret_cast<unsigned char *>(&x);

  std::copy(p, p + sizeof(T), px);
  p += sizeof(T);  // advance the caller's pointer past the element just read

  return x;
}

Usage:

const unsigned char * p = the_data;
unsigned int max = 0;

while (p != the_data + data_length)
{
  max = std::max(max, read_and_advance<unsigned int>(p));
}

Scrap the part below; I originally thought the question was for C.

Here's a macro:

#include <string.h>  /* memcpy */

#define READ_TYPE(T, buf, res) \
  do { memcpy(&(res), (buf), sizeof(T)); (buf) += sizeof(T); } while (0)

Usage:

unsigned int max = 0;
const unsigned char * p = (const unsigned char *)data;
const unsigned char * const end = p + len;  /* stop at the end of the buffer */

while (p != end)
{
  unsigned int res;
  READ_TYPE(unsigned int, p, res);
  if (res > max) max = res;
}

You don't really get around specifying the type, though. In C++ this could be done a bit more elegantly.

Alternatively, you can wrap it all in one:

#define READ_TYPE_AND_MAX(T, buf, max) \
  do { \
    T x; \
    memcpy(&x, (buf), sizeof(T)); \
    (buf) += sizeof(T); \
    if ((max) < x) (max) = x; \
  } while (0)

/* Usage: */
unsigned int max = 0;
const unsigned char * p = (const unsigned char *)data;

while (p != (const unsigned char *)data + len)
{
  READ_TYPE_AND_MAX(unsigned int, p, max);
}


月依秋水 2024-12-23 14:34:18

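Given that elementtype is loop-invariant, you would do better to make the comparison only once, outside the for loop. By the way, I hope elementtype is a std::string or some other type that compares meaningfully against string literals.

Ultimately, I would write a template function that does the whole processing loop, and then call it with the appropriate template argument according to elementtype.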
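A minimal sketch of that idea, assuming elementtype is a std::string and len is a whole multiple of the element size. The names max_of and max_by_type are hypothetical, not from the question. The string comparison runs once, and each element is read with memcpy so the alignment of data does not matter.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>
#include <stdexcept>
#include <string>

// Hypothetical helper: scan the buffer as elements of type T and return the
// largest one. Assumes len is a multiple of sizeof(T); for an empty buffer it
// returns the lowest value T can hold.
template <typename T>
T max_of(const char* data, std::size_t len)
{
    T best = std::numeric_limits<T>::lowest();
    for (std::size_t i = 0; i + sizeof(T) <= len; i += sizeof(T)) {
        T x;
        std::memcpy(&x, data + i, sizeof(T));  // well-defined even if unaligned
        if (x > best) best = x;
    }
    return best;
}

// Hypothetical dispatcher: the type test happens once, outside the loop.
// The result is widened to double only at the very end.
double max_by_type(const std::string& elementtype, const char* data, std::size_t len)
{
    if (elementtype == "char")   return max_of<std::int8_t>(data, len);
    if (elementtype == "uchar")  return max_of<std::uint8_t>(data, len);
    if (elementtype == "short")  return max_of<std::int16_t>(data, len);
    if (elementtype == "ushort") return max_of<std::uint16_t>(data, len);
    if (elementtype == "int")    return max_of<std::int32_t>(data, len);
    if (elementtype == "uint")   return max_of<std::uint32_t>(data, len);
    if (elementtype == "float")  return max_of<float>(data, len);
    if (elementtype == "double") return max_of<double>(data, len);
    throw std::invalid_argument("unknown elementtype: " + elementtype);
}
```

With the question's variables, the call would be max_by_type(elementtype, data, len). Each instantiation of max_of is a tight loop over a single type, which the compiler has a much better chance of optimizing than the original loop with its per-element string comparisons.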


财迷小姐 2024-12-23 14:34:18

Put the conditional code outside the loop, so the loop runs fast. Try something like this:

char data[len];
double mymax = -std::numeric_limits<double>::max();
double x;
if (elementtype == "char") {
  for (size_t i=0; i<len; i += elesize) {
    char *r = data+i;
    x = static_cast<double>(*r);
    if (x > mymax)  mymax = x;
  }
}else if (elementtype == "uchar") {
  for (size_t i=0; i<len; i += elesize) {
    char *r = data+i;
    x = static_cast<double>(*((unsigned char *)r));
    if (x > mymax)  mymax = x;
  }
}else if (elementtype == "short")

..etc..etc


假面具 2024-12-23 14:34:18

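As others have pointed out, you should check the type only once and then call the appropriate sub-function that handles just that one type. You also should not convert to double to compare against my_max when the element type is not double; otherwise you are needlessly converting to double and doing double comparisons. If elementtype is uint, you should never convert anything to double: just compare against a my_max variable that is also a uint.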
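As a rough sketch of keeping everything in the element's own type, the function below gathers the minimum, the maximum, and the number of occurrences of a value in one pass, with no conversion to double. Stats and scan are hypothetical names, and the code assumes len is a non-zero multiple of sizeof(T).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical result type: min and max stay in the element type T.
template <typename T>
struct Stats {
    T min;
    T max;
    std::size_t count;  // how many elements equal the target value
};

// Single pass over the buffer, comparing in T throughout.
// Assumes len is a non-zero multiple of sizeof(T).
template <typename T>
Stats<T> scan(const char* data, std::size_t len, T target)
{
    T first;
    std::memcpy(&first, data, sizeof(T));  // seed min and max with element 0
    Stats<T> s{first, first, 0};
    for (std::size_t i = 0; i + sizeof(T) <= len; i += sizeof(T)) {
        T x;
        std::memcpy(&x, data + i, sizeof(T));
        if (x < s.min) s.min = x;
        if (x > s.max) s.max = x;
        if (x == target) ++s.count;
    }
    return s;
}
```

For the uint case this would be called as scan<std::uint32_t>(data, len, value), and the comparisons are plain 32-bit unsigned comparisons.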
