How can I more efficiently traverse a character array that stores {int, short, ushort, ...}?

Posted on 2024-12-16 14:34:18

I have a char data[len] populated from unzipped data that is read from a binary file.
I know that the data can only be one of these types: char, uchar, short, ushort, int, uint, float, double, and I know the exact number of bits needed to represent each element (elesize = {8, 16, 32, 64}).

I just want to traverse the data and, say, find the max(), the min(), or the number of occurrences of a given number. I want to do this without creating another array, because of memory concerns.

I have come up with the following, but it is slow, for example for len == 34560000.

So I was wondering if anyone has a 'one-liner' or a more efficient way of doing this (either C or C++).

char data[len];
double mymax = -std::numeric_limits<double>::max();
for (size_t i=0; i<len; i += elesize)
{
    double x;
    char *r = data+i;
    if (elementtype == "char")
        x = static_cast<double>(*r);
    else if (elementtype == "uchar")
        x = static_cast<double>(*((unsigned char *)r));
    else if (elementtype == "short")
        x = static_cast<double>(*((int16_t *)r));
    else if (elementtype == "ushort")
        x = static_cast<double>(*((uint16_t *)r));
    else if (elementtype == "int")
        x = static_cast<double>(*((int32_t *)r));
    else if (elementtype == "uint")
        x = static_cast<double>(*((uint32_t *)r));
    else if (elementtype == "float")
        x = static_cast<double>(*((float *)r));
    else if (elementtype == "double")
        x = *((double *)r);
    if (x > mymax)
        mymax = x;
}

Comments (4)

戴着白色围巾的女孩 2024-12-23 14:34:18

A template should do nicely:

#include <algorithm>

template <typename T>
T read_and_advance(const unsigned char * & p)
{
  T x;
  unsigned char * const px = reinterpret_cast<unsigned char *>(&x);

  std::copy(p, p + sizeof(T), px);
  p += sizeof(T);

  return x;
}

Usage:

const unsigned char * p = the_data;
unsigned int max = 0;

while (p != the_data + data_length)
{
  max = std::max(max, read_and_advance<unsigned int>(p));
}

Scrap this; I originally thought the question was for C.

Here's a macro:

#include <string.h>  /* memcpy */
#define READ_TYPE(T, buf, res) do { memcpy(&(res), (buf), sizeof(T)); (buf) += sizeof(T); } while (0)

Usage:

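unsigned int max = 0;  /* same type as the elements, so the comparison is not mixed-sign */
const unsigned char * p = (const unsigned char *)data;
const unsigned char * const end = p + len;  /* len is assumed to be a multiple of sizeof(unsigned int) */

while (p != end)  /* stop at the end of the buffer */
{
  unsigned int res;
  READ_TYPE(unsigned int, p, res);
  if (res > max) max = res;
}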

You don't really get around specifying the type, though. In C++ this could be done a bit more elegantly.

Alternatively you can wrap it all in one:

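#define READ_TYPE_AND_MAX(T, buf, max)      \
  do { T x; memcpy(&x, (buf), sizeof(T));   \
       (buf) += sizeof(T);                  \
       if ((max) < x) (max) = x;            \
  } while (0)

// Usage:
unsigned int max = 0;
const unsigned char * p = (const unsigned char *)data;
const unsigned char * const end = p + len;  // len is assumed to be a multiple of sizeof(unsigned int)
while (p != end) { READ_TYPE_AND_MAX(unsigned int, p, max); }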

月依秋水 2024-12-23 14:34:18

Given that elementtype is loop-invariant, you had better do the comparison only once, outside the for loop. By the way, I hope elementtype is of type std::string or something else that compares meaningfully to string literals.

Ultimately, I would write a template function that does the whole processing loop and then call it with the appropriate template argument according to elementtype.
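A minimal sketch of that idea follows. The names max_in_buffer and max_as_double are hypothetical, and the mapping from type names to fixed-width types simply mirrors the casts in the question. The sketch reads each element with memcpy, which avoids alignment problems, and it assumes that len is a multiple of the element size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>
#include <stdexcept>
#include <string>

// Scan a raw byte buffer as a sequence of T and return the largest element.
// Returns numeric_limits<T>::lowest() for an empty buffer.
template <typename T>
T max_in_buffer(const char* data, std::size_t len)
{
    assert(len % sizeof(T) == 0);  // assumed precondition: whole elements only
    T best = std::numeric_limits<T>::lowest();
    for (std::size_t i = 0; i < len; i += sizeof(T)) {
        T x;
        std::memcpy(&x, data + i, sizeof(T));  // safe for unaligned data
        if (x > best) best = x;
    }
    return best;
}

// The type name is inspected once here, not once per element.
double max_as_double(const std::string& elementtype, const char* data, std::size_t len)
{
    if (elementtype == "char")   return max_in_buffer<char>(data, len);
    if (elementtype == "uchar")  return max_in_buffer<unsigned char>(data, len);
    if (elementtype == "short")  return max_in_buffer<std::int16_t>(data, len);
    if (elementtype == "ushort") return max_in_buffer<std::uint16_t>(data, len);
    if (elementtype == "int")    return max_in_buffer<std::int32_t>(data, len);
    if (elementtype == "uint")   return max_in_buffer<std::uint32_t>(data, len);
    if (elementtype == "float")  return max_in_buffer<float>(data, len);
    if (elementtype == "double") return max_in_buffer<double>(data, len);
    throw std::invalid_argument("unknown element type: " + elementtype);
}
```

With the question's variables, the call would be max_as_double(elementtype, data, len). The conversion to double happens once, on the final result, rather than inside the loop.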

财迷小姐 2024-12-23 14:34:18

Put the conditional code outside the loop, so the loop runs fast. Try something like this:

char data[len];
double mymax = -std::numeric_limits<double>::max();
double x;
if (elementtype == "char") {
  for (size_t i=0; i<len; i += elesize) {
    char *r = data+i;
    x = static_cast<double>(*r);
    if (x > mymax)  mymax = x;
  }
}else if (elementtype == "uchar") {
  for (size_t i=0; i<len; i += elesize) {
    char *r = data+i;
    x = static_cast<double>(*((unsigned char *)r));
    if (x > mymax)  mymax = x;
  }
}else if (elementtype == "short")

..etc..etc

假面具 2024-12-23 14:34:18

As others indicated, you should check the type only once. Then you should call an appropriate sub-function that deals with only one type. You should also not cast to double for the comparison with mymax when the element type is not double; otherwise you are needlessly converting every element to double and comparing doubles. If elementtype is uint, you should never convert anything to double: just compare with a mymax variable that is also uint.
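As a rough illustration of staying in the element's own type, the sketch below makes one pass that collects the minimum, the maximum, and the number of occurrences of a given value, which are the three operations named in the question. ScanResult and scan_buffer are hypothetical names, and the code assumes that len is a non-zero multiple of sizeof(T).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Result of one pass over the buffer, kept in the element's own type T.
template <typename T>
struct ScanResult {
    T min;
    T max;
    std::size_t count;  // number of elements equal to the searched value
};

// One pass: minimum, maximum and occurrences of `needle`, with no conversion to double.
template <typename T>
ScanResult<T> scan_buffer(const char* data, std::size_t len, T needle)
{
    assert(len >= sizeof(T) && len % sizeof(T) == 0);  // assumed precondition
    T first;
    std::memcpy(&first, data, sizeof(T));
    ScanResult<T> r{first, first, 0};
    for (std::size_t i = 0; i < len; i += sizeof(T)) {
        T x;
        std::memcpy(&x, data + i, sizeof(T));  // safe for unaligned data
        if (x < r.min) r.min = x;
        if (x > r.max) r.max = x;
        if (x == needle) ++r.count;
    }
    return r;
}
```

For a uint buffer, the call would look like scan_buffer<std::uint32_t>(data, len, 7u), and all three results stay unsigned 32-bit values throughout.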
