Basic C-style string memory allocation

Posted 2024-12-15 22:06:33 · 584 words · 2 views · 0 comments

I am working on a project with existing code which uses mainly C++ but with C-style strings. Take the following:

#include <iostream>
int main(int argc, char *argv[])
{
    char* myString = "this is a test";
    myString = "this is a very very very very very very very very very very very long string";
    std::cout << myString << std::endl;
    return 0;
}

This compiles and runs fine with the output being the long string.

However I don't understand WHY it works. My understanding is that

char* myString 

is a pointer to an area of memory big enough to hold the string literal "this is a test". If that's the case, then how am I able to then store a much longer string in the same location? I expected it to crash when doing this due to trying to cram a long string into a space set aside for the shorter one.

Obviously there's a basic misunderstanding of what's going on here so I appreciate any help understanding this.

Comments (6)

巷雨优美回忆 2024-12-22 22:06:33

You're not changing the content of the memory, you're changing the value of the pointer to point to a different area of memory which holds "this is a very very very very very very very very very very very long string".

Note that char* myString only allocates enough bytes for the pointer itself (usually 4 or 8 bytes). Before your program even starts, the compiler has already set aside space in the executable image and put "this is a test" in that memory. When you do char* myString = "this is a test"; all it actually does is allocate enough bytes for the pointer and make the pointer point to that memory, which was already allocated at compile time, in the executable.

So if you like diagrams:

char* myString = "this is a test";

(allocate memory for myString)

              ---> "this is a test"
            / 
myString---

                   "this is a very very very very very very very very very very very long string"

Then

myString = "this is a very very very very very very very very very very very long string";

                   "this is a test"

myString---
            \
              ---> "this is a very very very very very very very very very very very long string"
生寂 2024-12-22 22:06:33

There are two strings in memory. The first is "this is a test", and let's say it begins at the address 0x1000. The second is "this is a very very ... test" and it begins at the address 0x1200.

By

char* myString = "this is a test";

you create a variable called myString and assign the address 0x1000 to it. Then, by

myString = "this is a very very ... test";

you assign 0x1200. By

cout << myString << endl;

you just print the string beginning at 0x1200.

甜是你 2024-12-22 22:06:33

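You have two string literals of type const char[n]. These can be assigned to a variable of type char*, which is nothing more than a pointer to a char (the conversion from a literal to a non-const char* was deprecated in C++03 and removed in C++11, although many compilers still accept it with a warning). Whenever you declare a variable of type pointer-to-T you are only declaring the pointer, and not the memory to which it points.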

The compiler reserves memory for both literals and you just take your pointer variable and point it at those literals one after the other. String literals are read-only and their allocation is taken care of by the compiler. Typically they are stored in the executable image in protected read-only memory. A string literal typically has a lifetime equal to that of the program itself.

Now, it would be undefined behavior (UB) if you attempted to modify the contents of a literal, but you don't. To help prevent yourself from attempting modifications in error you would be wise to declare your variable as const char*.

牵你的手,一向走下去 2024-12-22 22:06:33

Both strings already exist in memory when the program runs: string literals have static storage duration, so the block containing "this is a test" is set aside before main starts. The first line assigns the address of the first character in that block to the myString variable. The next line assigns the address of the first character of a separate block, the one containing "this is a very very...", replacing the address myString used to store with the address of the "very very long" string.

Just for illustration, let's say the first block of memory looks like this:

[t][h][i][s][ ][i][s][ ][a][ ][t][e][s][t]
and let's say the address of the first 't' character in this sequence/array of characters is 0x100.
So after the first assignment, the myString variable contains the address 0x100, which points to the first letter of "this is a test".

Then, a totally different block of memory contains:

[t][h][i][s][ ][i][s][ ][a][ ][v][e][r][y]...
and let's say that the address of this first 't' character is 0x200.
So after the second assignment, the myString variable NOW contains the address 0x200, which points to the first letter of "this is a very very very...".

Since myString is just a pointer to a character (hence "char *" is its type), it only stores the address of a character; it has no concern for how big the array is supposed to be, and it doesn't even know that it is pointing to an "array", only that it is storing the address of a character...

For example, you could legally do this:

    char myChar = 'C';
    /* assign the address of the location in
       memory in which 'C' is stored to
       the myString variable. */
    myString = &myChar;

Hopefully that was clear enough. If so, upvote/accept answer. If not, please comment so that I may clarify.

深空失忆 2024-12-22 22:06:33

String literals do not require any allocation on your part: they are stored as-is and can be used directly. Essentially myString was a pointer to one string literal, and was changed to point to another string literal.

巾帼英雄 2024-12-22 22:06:33

char* means a pointer to a block of memory that holds a character.

C-style string functions take a pointer to the start of a string. They assume there is a sequence of characters that ends with a null character (\0).

So what the << operator actually does is loop from that first character position until it finds a null character.
