Determining the length of a string literal

Posted on 2025-01-26 03:47:17

Given an array of pointers to string literals:

char *textMessages[] = {
    "Small text message",
    "Slightly larger text message",
    "A really large text message that "
    "is spread over multiple lines"
};

How does one determine the length of a particular string literal, say the third one? I have tried using the sizeof operator as follows:

int size = sizeof(textMessages[2]);

But the result seems to be the number of pointers in the array, rather than the length of the string literal.

Comments (7)

殊姿 2025-02-02 03:47:18

You could exploit the fact that many compilers place the string literals of an initializer consecutively in memory:

const char *messages[] = {
    "footer",
    "barter",
    "banger"
};

size_t sizeOfMessage1 = (messages[1] - messages[0]) / sizeof(char); // 7   (6 chars + '\0')

The size is determined from the boundaries of the elements: the distance between the start of the first string and the start of the second is the size of the first.

This includes the terminating \0. The approach only works with string literals that the compiler happens to store back to back. If the strings were pointers, you would get the size of a pointer instead of the length of the string.

This is not guaranteed to work. Subtracting pointers that do not point into the same array object is undefined behavior in C. If the literals are aligned or padded, the result may be wrong, and the compiler may introduce other caveats, such as merging identical strings.
You also need at least two elements in the array.

千仐 2025-02-02 03:47:18

strlen is comparatively slow and may be executed at run time, whereas sizeof("string_literal") - 1 is fast and evaluated at compile time. The problem is how to use sizeof on the string literals pointed at by your pointer array: we can't, because through the array sizeof only sees a pointer.

Now, assuming you want this as fast as possible and done at compile time for performance reasons... Everything in C is possible if you throw enough ugly macros at the problem. Here is such a solution, one that favours performance and maintainability at the cost of readability.

We can move the string initializer list out of the array and into a macro. For example by declaring so-called "X-macros", like this:

#define STRING_LIST(X)                     \
  X("Small text message")                  \
  X("Slightly larger text message")        \
  X("A really large text message that "    \
    "is spread over multiple lines")

This macro can now be reused for various purposes, by defining another macro and passing it as parameter "X" to the above list. For example the array declaration could be done as:

#define STRING_INIT_LIST(str) str,
char *textMessages[] = 
{
  STRING_LIST(STRING_INIT_LIST)    
};

And if we want a 1-to-1 corresponding look-up table containing the sizes of each string:

#define STRING_SIZES(str) (sizeof(str)-1),
const size_t sizes[] = 
{
  STRING_LIST(STRING_SIZES)
};

Complete example containing both a look-up table version and a version processed directly at compile time:

#include <stdio.h>

#define STRING_LIST(X)                     \
  X("Small text message")                  \
  X("Slightly larger text message")        \
  X("A really large text message that "    \
    "is spread over multiple lines")

int main (void)
{
  #define STRING_INIT_LIST(str) str,
  char *textMessages[] = 
  {
    STRING_LIST(STRING_INIT_LIST)    
  };
  
  #define STRING_SIZES(str) (sizeof(str)-1),
  const size_t sizes[] = 
  {
    STRING_LIST(STRING_SIZES)
  };

  puts("The strings are:");
  #define STRING_PRINT(str) printf(str ", size:%zu\n", sizeof(str)-1);
  STRING_LIST(STRING_PRINT)

  printf("\nOr if you will:\n");
  for(size_t i=0; i<sizeof(textMessages)/sizeof(*textMessages); i++)
  {
    printf("%s, size:%zu\n", textMessages[i], sizes[i]);
  }
}

Output:

The strings are:
Small text message, size:18
Slightly larger text message, size:28
A really large text message that is spread over multiple lines, size:62

Or if you will:
Small text message, size:18
Slightly larger text message, size:28
A really large text message that is spread over multiple lines, size:62

The machine code for this boils down to printing a bunch of strings and constants from memory, with no strlen call overhead at all.
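A related variation, given here only as a rough sketch and not as part of the answer above: if two parallel arrays feel fragile, the text and its compile-time length can be kept together in one table. The names struct message, MSG, message_count and stored_length are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical record pairing a string with its length. */
struct message {
    const char *text;
    size_t length;              /* characters, excluding the '\0' */
};

/* sizeof sees the literal itself (an array of char), not a pointer,
   so the length is a compile-time constant. */
#define MSG(str) { str, sizeof(str) - 1 }

static const struct message messages[] = {
    MSG("Small text message"),
    MSG("Slightly larger text message"),
    MSG("A really large text message that "
        "is spread over multiple lines")
};

/* Number of entries in the table. */
size_t message_count(void)
{
    return sizeof(messages) / sizeof(messages[0]);
}

/* Length of entry i; the caller must pass i < message_count(). */
size_t stored_length(size_t i)
{
    return messages[i].length;
}
```

With this layout, stored_length(2) reads a stored constant instead of scanning the string. MSG only works when its argument is a string literal; passing a pointer would silently store the pointer size minus one.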

指尖微凉心微凉 2025-02-02 03:47:18

strlen maybe?

size_t size = strlen(textMessages[2]);

回首观望 2025-02-02 03:47:18

You should use the strlen() library function to get the length of a string. sizeof gives you the size of textMessages[2], which is a pointer, so the result is machine dependent (typically 4 or 8 bytes).
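To make the difference concrete, here is a minimal sketch using the array from the question. The helper names element_size and message_length are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <string.h>

static const char *textMessages[] = {
    "Small text message",
    "Slightly larger text message",
    "A really large text message that "
    "is spread over multiple lines"
};

/* Size of one array element. The element is a pointer, so this is
   sizeof(const char *), whatever that is on the target machine. */
size_t element_size(void)
{
    return sizeof(textMessages[2]);
}

/* Number of characters before the terminating '\0', counted at run time
   (unless the optimizer folds the call into a constant). */
size_t message_length(size_t index)
{
    return strlen(textMessages[index]);
}
```

Here element_size() always equals sizeof(char *), no matter how long the third message is, while message_length(2) gives 62.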

烛影斜 2025-02-02 03:47:17

If you want the number computed at compile time (as opposed to at runtime with strlen) it is perfectly okay to use an expression like

sizeof "A really large text message that "
       "is spread over multiple lines";

You might want to use a macro to avoid repeating the long literal, though:

#define LONGLITERAL "A really large text message that " \
                    "is spread over multiple lines"

Note that the value returned by sizeof includes the terminating NUL, so it is one more than what strlen returns.
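As a small sketch of how the macro might be used (the names LONG_SIZE, LONG_LENGTH and long_length_runtime are invented for this example), the compile-time values can be captured as constants and compared with strlen:

```c
#include <stddef.h>
#include <string.h>

#define LONGLITERAL "A really large text message that " \
                    "is spread over multiple lines"

/* sizeof is applied to the literal itself, so both values are
   integer constant expressions fixed at compile time. */
enum {
    LONG_SIZE   = sizeof(LONGLITERAL),      /* includes the '\0' */
    LONG_LENGTH = sizeof(LONGLITERAL) - 1   /* excludes the '\0' */
};

/* Same length, but obtained by scanning the string with strlen. */
size_t long_length_runtime(void)
{
    return strlen(LONGLITERAL);
}
```

For this literal LONG_SIZE is 63 and LONG_LENGTH is 62, which matches what long_length_runtime() returns.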

紫轩蝶泪 2025-02-02 03:47:17

My suggestion would be to use strlen and turn on compiler optimizations.

For example, with gcc 4.7 on x86:

#include <string.h>
static const char *textMessages[3] = {
    "Small text message",
    "Slightly larger text message",
    "A really large text message that "
    "is spread over multiple lines"
};

size_t longmessagelen(void)
{
  return strlen(textMessages[2]);
}

After running make CFLAGS="-ggdb -O3" example.o:

$ gdb example.o
(gdb) disassemble longmessagelen
   0x00000000 <+0>: mov    $0x3e,%eax
   0x00000005 <+5>: ret

I.e. the compiler has replaced the call to strlen with the constant value 0x3e = 62.

Don't waste time performing optimizations that the compiler can do for you!

眼角的笑意。 2025-02-02 03:47:17

strlen gives you the length of the string, whereas sizeof returns the size in bytes of the data type of its operand.
