Sorting with memcmp

Posted on 2024-07-13 23:04:20

I have a single buffer, and several pointers into it. I want to sort the pointers based upon the bytes in the buffer they point at.

qsort() and std::sort() can be given custom comparison functions. For example, if the buffer were zero-terminated I could use strcmp:

int my_strcmp(const void* a,const void* b) {
  /* qsort hands the comparator pointers to the array elements, which are themselves pointers */
  const char* const one = *(const char* const*)a;
  const char* const two = *(const char* const*)b;
  return ::strcmp(one,two);
}

However, if the buffer is not zero-terminated, I have to use memcmp(), which requires a length parameter.

Is there a tidy, efficient way to get the length of the buffer into my comparison function without a global variable?

Comments (8)

帅气称霸 2024-07-20 23:04:20

With std::sort, you can use a Functor like this:

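struct CompString {
    CompString(int len) : m_Len(len) {}
    // std::sort calls the comparator as comp(a, b), so overload operator(), not operator<
    bool operator()(const char *a, const char *b) const {
        // memcmp returns <0, 0 or >0; "less than" means a negative result
        return std::memcmp(a, b, m_Len) < 0;
    }
private:
    int m_Len;
};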

Then you can do this:

std::sort(begin(), end(), CompString(4)); // all strings are 4 chars long
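
As a minimal sketch of the fixed-length functor in use: it assumes every record is exactly `len` bytes long and that at least `len` bytes are readable at each pointer. The names `FixedLenLess` and `sort_fixed` are invented here for illustration and are not part of the answer above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical comparator: orders pointers by the first `len` bytes they point at.
struct FixedLenLess {
    explicit FixedLenLess(std::size_t len) : len_(len) {}
    bool operator()(const char* a, const char* b) const {
        // std::sort needs a strict "less than", hence the `< 0`.
        return std::memcmp(a, b, len_) < 0;
    }
private:
    std::size_t len_;
};

// Hypothetical helper: takes the pointers by value, sorts the copy and returns it.
std::vector<const char*> sort_fixed(std::vector<const char*> ptrs, std::size_t len) {
    assert(len > 0);  // a zero length would make every pair compare equal
    std::sort(ptrs.begin(), ptrs.end(), FixedLenLess(len));
    return ptrs;
}
```

The length travels inside the comparator object, so no global variable is involved and each call to `sort_fixed` can use a different length.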

EDIT: following the suggestions in the comments (I guess both strings are in a common buffer?):

struct CompString {
    CompString (const unsigned char* e) : end(e) {}
    bool operator()(const unsigned char *a, const unsigned char *b) const {
        return std::memcmp(a, b, std::min(end - a, end - b)) < 0;
    }
private:
    const unsigned char* const end;
};

澉约 2024-07-20 23:04:20

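With the C function qsort(), no, there is no way to pass the length to your comparison function without using a global variable, which means it can't be done in a thread-safe manner. Some systems have a qsort_r() function (r stands for reentrant) which allows you to pass an extra context parameter, which then gets passed on to your comparison function. The argument order is not the same everywhere: the example below uses the BSD/macOS form, whereas glibc takes the comparison function before the context argument and passes the context as the comparator's last parameter.

/* BSD/macOS form: the context comes first */
int my_comparison_func(void *context, const void *a, const void *b)
{
    /* the byte count is smuggled through the context pointer itself */
    return memcmp(*(const void **)a, *(const void **)b, (size_t)context);
}

qsort_r(data, n, sizeof(void*), (void*)number_of_bytes_to_compare, &my_comparison_func);

中二柚 2024-07-20 23:04:20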

Is there a reason you can't null-terminate your buffers?

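If there is, then since you're using C++ you can write your own function object:

 struct MyStrCmp {
    MyStrCmp (int n): length(n) { }
    // std::sort calls the comparator as comp(a, b), so overload operator(), not operator<
    inline bool operator() (const char *lhs, const char *rhs) const {
       // strncmp takes the length; use memcmp instead if the data may contain zero bytes
       return ::strncmp (lhs, rhs, length) < 0;
    }
    int length;
 };
 // ...
 std::sort (myList.begin (), myList.end (), MyStrCmp (STR_LENGTH));

熊抱啵儿 2024-07-20 23:04:20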

Can you pack your buffer pointer + length into a structure and pass a pointer of that structure as void *?
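
One way to read this suggestion, sketched under assumptions: the array being sorted holds small structs instead of bare pointers, so plain qsort() hands each comparison everything it needs. The `Slice` type and the `slice_cmp` and `sort_slices` functions are hypothetical names, and the tie-break rule (shorter slice first) is a choice made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical element type: each array entry carries its own length.
struct Slice {
    const unsigned char* ptr;
    std::size_t len;
};

// qsort-compatible comparator; the lengths come from the elements, not from a global.
extern "C" int slice_cmp(const void* a, const void* b) {
    const Slice* x = static_cast<const Slice*>(a);
    const Slice* y = static_cast<const Slice*>(b);
    const std::size_t n = x->len < y->len ? x->len : y->len;
    const int c = std::memcmp(x->ptr, y->ptr, n);
    if (c != 0) return c;
    // Equal common prefix: the shorter slice sorts first.
    return (x->len > y->len) - (x->len < y->len);
}

// Hypothetical helper wrapping the qsort call.
void sort_slices(Slice* slices, std::size_t count) {
    assert(slices != nullptr || count == 0);
    std::qsort(slices, count, sizeof(Slice), slice_cmp);
}
```

The cost is one extra `size_t` per element, in exchange for a comparator that is reentrant and works with the standard qsort().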

看透却不说透 2024-07-20 23:04:20

You could use a hack like:

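int buffcmp(const void *b1, const void *b2)
{
    static int bsize=-1;   /* length remembered between calls; not thread-safe */
    if(b2==NULL) {bsize=*(const int*)(b1); return 0;}   /* setup call: store the length */
    return memcmp(b1, b2, (size_t)bsize);
}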

which you would first call as buffcmp(&bsize, NULL) and then pass it as the comparison function to qsort.

You could of course make the comparison behave more naturally in the case of buffcmp(NULL, NULL) etc by adding more if statements.

半﹌身腐败 2024-07-20 23:04:20

You could use functors (give the length to the functor's constructor) or Boost.Lambda (use the length in-place).
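
For the in-place variant, a C++11 lambda can stand in for Boost.Lambda. This is a substitution made for this sketch, not what the answer names. It assumes at least `len` bytes are readable at every pointer, and `sort_by_bytes` is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: sorts pointers by the first `len` bytes they point at.
void sort_by_bytes(std::vector<const unsigned char*>& ptrs, std::size_t len) {
    assert(len > 0);  // a zero length would make every pair compare equal
    std::sort(ptrs.begin(), ptrs.end(),
              // The lambda captures `len` by value, so no global state is needed.
              [len](const unsigned char* a, const unsigned char* b) {
                  return std::memcmp(a, b, len) < 0;
              });
}
```

Because memcmp compares raw bytes, zero bytes inside the records do not cut the comparison short.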

天暗了我发光 2024-07-20 23:04:20

I'm not clear on what you're asking. But I'll try, assuming that

  • You have a single buffer
  • You have an array of pointers of some kind which has been processed in some way so that some or all of its contents point into the buffer

That is code equivalent to:

char *buf = (char*)malloc(sizeof(char)*bufsize);
for (int i=0; i<bufsize; ++i){
    buf[i] = some_cleverly_chosen_value(i);
}

char *ary[arraysize] = {0};
for(int i=0; i<arraysize; ++i){
   ary[i] = buf + some_clever_function(i);
}

/* ...do the sort here */

Now if you control the allocation of the buffer, you could substitute

char *buf = (char*)malloc(sizeof(char)*(bufsize+1));
buf[bufsize]='\0';

and go ahead using strcmp. This may be possible even if you don't control the filling of the buffer.
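
A rough sketch of the terminate-and-use-strcmp route, assuming you are free to copy the data into a buffer one byte larger than the payload. It uses `std::vector` rather than `malloc` purely for brevity, and `make_terminated` and `sort_as_strings` are names invented for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: copies `size` bytes and appends the terminating zero byte.
std::vector<char> make_terminated(const char* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    std::vector<char> buf(data, data + size);
    buf.push_back('\0');
    return buf;
}

// Orders pointers into such a buffer as C strings.
// Note: a zero byte inside the data would end a comparison early.
void sort_as_strings(std::vector<const char*>& ptrs) {
    std::sort(ptrs.begin(), ptrs.end(),
              [](const char* a, const char* b) { return std::strcmp(a, b) < 0; });
}
```

This only behaves like a byte-wise comparison if the payload itself contains no zero bytes. Otherwise a length-aware comparison is still needed.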

If you have to live with a buffer handed to you by someone else, you can

  1. Use some global storage (which you asked to avoid; good thinking).
  2. Hand the sort function something more complicated than a raw pointer (the address of a struct or class that supports the extra data). For this you need to control the definition of ary in the above code.
  3. Use a sort function which supports an extra input. Either qsort_r as suggested by Adam, or a home-rolled solution (which I do recommend as an exercise for the student, and don't recommend in real life). In either case the extra data is probably a pointer to the end of the buffer.

才能让你更想念 2024-07-20 23:04:20

memcmp should stop on the first byte that is unequal, so the length should be large, i.e. to-the-end-of-the-buffer. Then the only way it can return zero is if it does go to the end of the buffer.

(BTW, I lean toward merge sort myself. It's stable and well-behaved.)
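
A sketch of the compare-to-the-end idea, with two assumptions added here. First, the length passed to memcmp must not run past the end of the buffer for either pointer, so the smaller of the two remaining lengths is used. Second, when one suffix is a prefix of the other, the shorter one sorts first, which keeps the ordering consistent. `ToEndLess` and `sort_to_end` are hypothetical names; `std::stable_sort` is used as a nod to the stability remark above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical comparator: compares each pointer's bytes up to the end of the shared buffer.
struct ToEndLess {
    explicit ToEndLess(const unsigned char* end) : end_(end) {}
    bool operator()(const unsigned char* a, const unsigned char* b) const {
        assert(a <= end_ && b <= end_);
        // Never read past the end: use the shorter of the two remaining lengths.
        const std::size_t n = static_cast<std::size_t>(std::min(end_ - a, end_ - b));
        const int c = std::memcmp(a, b, n);
        if (c != 0) return c < 0;
        // One suffix is a prefix of the other: the shorter one (higher address) sorts first.
        return a > b;
    }
private:
    const unsigned char* end_;
};

// Hypothetical helper: sorts pointers into one buffer whose end is `end`.
void sort_to_end(std::vector<const unsigned char*>& ptrs, const unsigned char* end) {
    std::stable_sort(ptrs.begin(), ptrs.end(), ToEndLess(end));
}
```

With this rule the comparator returns "equivalent" only for identical pointers, so the result does not depend on which sorting algorithm is used.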
