Determining whether a received pointer refers to a string, a ushort, or an array

Posted on 2024-09-26 06:05:04

I am interposing the memcpy() function in C because the target application uses it to concatenate strings and I want to find out which strings are being created. The code is:

void * my_memcpy ( void * destination, const void * source, size_t num )
{
    void *ret = memcpy(destination, source, num);
    // printf ("[MEMCPY] = %s \n", ret);
    return ret;
}

The function gets called successfully, but the first parameter can point to anything, and I only want to trace the call if the result is a string or an array, so I would need to check which it is. I know this can't be done in a straightforward way: is there any way to find out what ret points to?

I am working on Mac OS X and interposing with dyld.

Thank you very much.

Comments (4)

所谓喜欢 2024-10-03 06:05:04

As void* represents a raw block of memory, there is no way to determine what actual data lies there.

However, you can produce a "string-like" memory dump on every operation, as long as you cap the output at some upper limit.

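This could be implemented in the following way:

#include <algorithm>
#include <cstddef>
#include <iostream>

const std::size_t kUpperLimit = 32;

// Dump at most kUpperLimit bytes, and never more than the num bytes that were copied.
void output_memory_dump(const void* memory, std::size_t num) {
   std::cout.write(reinterpret_cast<const char*>(memory), std::min(num, kUpperLimit));
}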

For non-string-like data the output would be hard to interpret, but otherwise you'd get what you are looking for.

You could also try a guess-based approach, such as iterating over reinterpret_cast<char*>(memory) and applying an is_alphanumeric || is_space check to every byte, but this does not seem very reliable (who knows what could actually lie behind that void*...).

Anyway, for some situations that might be fine.
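
One way to keep a capped dump readable even when the data is not text is to escape the non-printable bytes. The following is only a minimal sketch in C: the helper name escape_dump, its parameters, and the \xNN escape format are assumptions made for illustration, not part of the answer above.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: renders at most `limit` of the `num` bytes at
   `memory` into `out`. Printable characters are copied as-is; every other
   byte (and the backslash itself) is written as \xNN.
   Returns the number of characters written, excluding the terminator. */
static size_t escape_dump(const void *memory, size_t num, size_t limit,
                          char *out, size_t out_size) {
    const unsigned char *p = memory;
    size_t n = num < limit ? num : limit;
    size_t used = 0;

    if (out_size == 0)
        return 0;
    for (size_t i = 0; i < n; i++) {
        if (isprint(p[i]) && p[i] != '\\') {
            if (used + 1 >= out_size)
                break;                      /* keep room for the terminator */
            out[used++] = (char)p[i];
        } else {
            if (used + 4 >= out_size)
                break;                      /* an escape needs four characters */
            snprintf(out + used, out_size - used, "\\x%02X", (unsigned)p[i]);
            used += 4;
        }
    }
    out[used] = '\0';
    return used;
}
```

Because the result is always a terminated C string, it can be passed to any logging call, and binary copies show up as runs of escapes rather than as garbage on the terminal.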

何其悲哀 2024-10-03 06:05:04

You can first apply some heuristics to the copied memory and based on that you can decide whether you want to print it.

#include <stddef.h> /* size_t */

/* Returns 1 if the first n bytes at data look like a C string, 0 otherwise. */
static int maybe_string(const void *data, size_t n) {
  const unsigned char *p;
  size_t i;

  p = data;
  for (i = 0; i < n; i++) {
    int c = p[i];
    if (c == '\n' || c == '\r' || c == '\t')
      continue;
    if (1 <= c && c < 32)
      return 0; /* unusual ASCII control character */
    if (c == '\0' && i > 5)
      return 1; /* null-terminated and more than a few characters long */
  }

  return 0; /* not null-terminated, so it isn't a string */
}

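This heuristic is not perfect. For example, it fails for the following pattern, where the terminator is written only after memcpy returns, so the copied bytes contain no null byte: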

const char *str = "hello, world";
size_t len = strlen(str);
char *buf = malloc(1024);
memcpy(buf, str, len);
buf[len] = '\0';

If you want to catch that too, you will have to change the above function.
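
As a rough illustration of such a change, the sketch below accepts a copy that carries no terminator as long as every copied byte looks like text. The name maybe_string_fragment and the MIN_TEXT_LEN threshold are assumptions introduced here; unlike maybe_string, this variant also stops scanning at the first null byte.

```c
#include <stddef.h>

/* Assumed threshold, mirroring the "i > 5" check in maybe_string. */
#define MIN_TEXT_LEN 6

/* Hypothetical variant of maybe_string: also accepts a copy without a
   terminator, provided all copied bytes look like text. */
static int maybe_string_fragment(const void *data, size_t n) {
    const unsigned char *p = data;
    size_t i;

    for (i = 0; i < n; i++) {
        int c = p[i];
        if (c == '\n' || c == '\r' || c == '\t')
            continue;
        if (c == '\0')
            break;              /* terminator inside the copy: text ends here */
        if (c < 32 || c == 127)
            return 0;           /* control character: treat as binary */
    }
    return i >= MIN_TEXT_LEN;   /* enough text-like bytes were seen */
}
```

With this version, the memcpy(buf, str, len) call from the pattern above is reported as a probable string, at the cost of more false positives for binary data that happens to contain no control characters.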

不羁少年 2024-10-03 06:05:04

ret is equal to the destination pointer. But it's not possible to determine whether it's an array or a string, unless you know more information about the array or string (for instance, that the string is of a certain length and is null-terminated).
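
Since the number of copied bytes is known inside the wrapper, the trace does not have to rely on a terminator at all. The sketch below bounds the output with the %.*s precision of the standard printf family; format_memcpy_trace and TRACE_MAX are hypothetical names chosen for this example.

```c
#include <stdio.h>
#include <string.h>

#define TRACE_MAX 64  /* assumed cap on the number of bytes shown */

/* Hypothetical helper: formats a trace line for a finished copy. The
   precision passed to %.*s limits how many bytes are read, so `ret`
   does not need to be null-terminated. */
static int format_memcpy_trace(char *out, size_t out_size,
                               const void *ret, size_t num) {
    int shown = num > TRACE_MAX ? TRACE_MAX : (int)num;
    return snprintf(out, out_size, "[MEMCPY] %zu bytes: %.*s",
                    num, shown, (const char *)ret);
}

void *my_memcpy(void *destination, const void *source, size_t num) {
    void *ret = memcpy(destination, source, num);
    char line[128];

    format_memcpy_trace(line, sizeof line, ret, num);
    fputs(line, stderr);
    fputc('\n', stderr);
    return ret;
}
```

Note that %s still stops at the first null byte, so binary copies are shown only up to that point. When the wrapper is really interposed, the stdio calls may themselves end up calling memcpy, so a real tracer would probably need a re-entrancy guard, which is not shown here.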

毁我热情 2024-10-03 06:05:04

No, you cannot figure this out from a pointer of void type. Plus, you don't know the size of the source or the destination, so the heuristic approach will not work. It will also fail for other reasons: for example, binary data stored in the memory region pointed to by a void* can legitimately have a zero byte at the end, but that doesn't mean it is a string.
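
The zero-byte problem is easy to reproduce. In the sketch below, which uses the illustrative names naive_is_string and store_le64, a 64-bit integer serialized in little-endian order passes a simple "printable bytes followed by a zero" check even though it was never a string.

```c
#include <stddef.h>
#include <stdint.h>

/* Naive check (illustrative only): printable ASCII bytes followed by a
   final zero byte. */
static int naive_is_string(const void *data, size_t n) {
    const unsigned char *p = data;

    if (n == 0 || p[n - 1] != '\0')
        return 0;
    for (size_t i = 0; i + 1 < n; i++)
        if (p[i] < 32 || p[i] > 126)
            return 0;
    return 1;
}

/* Serializes a 64-bit integer in little-endian byte order. */
static void store_le64(unsigned char out[8], uint64_t value) {
    for (int i = 0; i < 8; i++)
        out[i] = (unsigned char)(value >> (8 * i));
}
```

Storing the value 0x0047464544434241 this way yields the bytes 'A' through 'G' followed by a zero, which naive_is_string accepts. Any heuristic of this kind can therefore only report "probably text", never a definite answer.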
