Placement of the asterisk in pointer declarations
I've recently decided that I just have to finally learn C/C++, and there is one thing I do not really understand about pointers or more precisely, their definition.
How about these examples:
int* test;
int *test;
int * test;
int* test,test2;
int *test,test2;
int * test,test2;
Now, to my understanding, the first three cases are all doing the same: Test is not an int, but a pointer to one.
The second set of examples is a bit more tricky. In case 4, both test and test2 will be pointers to an int, whereas in case 5, only test is a pointer, whereas test2 is a "real" int. What about case 6? Same as case 5?
4, 5, and 6 are the same thing: only test is a pointer. If you want two pointers, you should use:
int *test, *test2;
Or, even better (to make everything clear):
int *test;
int *test2;
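The claim that only test is a pointer in cases 4, 5 and 6 can be checked at compile time. The following is a minimal sketch, assuming a C++11 compiler; the suffixed variable names are hypothetical and exist only so that all three declarations fit in one file.

```cpp
#include <type_traits>

// Cases 4, 5 and 6 from the question. The _4/_5/_6 suffixes are made up
// for this sketch so the three declarations do not clash.
int* test_4,test2_4;
int *test_5,test2_5;
int * test_6,test2_6;

// In each case the first declarator is a pointer and the second is a plain int.
static_assert(std::is_same<decltype(test_4), int*>::value, "case 4: test is int*");
static_assert(std::is_same<decltype(test2_4), int>::value, "case 4: test2 is int");
static_assert(std::is_same<decltype(test_5), int*>::value, "case 5: test is int*");
static_assert(std::is_same<decltype(test2_5), int>::value, "case 5: test2 is int");
static_assert(std::is_same<decltype(test_6), int*>::value, "case 6: test is int*");
static_assert(std::is_same<decltype(test2_6), int>::value, "case 6: test2 is int");

// Repeating the asterisk on each declarator yields two pointers.
int *first, *second;
static_assert(std::is_same<decltype(first), int*>::value, "first is int*");
static_assert(std::is_same<decltype(second), int*>::value, "second is int*");
```

If any of these assumptions were wrong, the file would simply fail to compile, which makes this a cheap way to settle such questions.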
White space around asterisks has no significance. All three mean the same thing:
int* test;
int *test;
int * test;
The "int *var1, var2" is an evil syntax that is just meant to confuse people and should be avoided. It expands to:
int *var1;
int var2;
Many coding guidelines recommend that you only declare one variable per line. This avoids any confusion of the sort you had before asking this question. Most C++ programmers I've worked with seem to stick to this.
A bit of an aside I know, but something I found useful is to read declarations backwards.
This starts to work very well, especially when you start declaring const pointers and it gets tricky to know whether it's the pointer that's const, or whether it's the thing the pointer is pointing at that is const.
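As a rough illustration of reading declarations backwards once const is involved, here is a small sketch. The variable and function names (value, other, p1 to p4, exercise) are invented for this example and are not from the answer.

```cpp
#include <cassert>
#include <type_traits>

int value = 1;
int other = 2;

const int *p1 = &value;        // p1 is a pointer to an int that is const
int const *p2 = &value;        // p2 is a pointer to a const int (same type as p1)
int *const p3 = &value;        // p3 is a const pointer to an int
const int *const p4 = &value;  // p4 is a const pointer to an int that is const

static_assert(std::is_same<decltype(p1), decltype(p2)>::value,
              "const int* and int const* name the same type");
static_assert(!std::is_same<decltype(p1), decltype(p3)>::value,
              "pointer-to-const and const-pointer are different types");

// Exercises what each pointer allows and returns other + value afterwards.
int exercise() {
    p1 = &other;        // allowed: p1 itself is not const
    // *p1 = 5;         // would not compile: the int seen through p1 is const
    *p3 = 10;           // allowed: the int seen through p3 is not const
    // p3 = &other;     // would not compile: p3 itself is const
    assert(p1 == &other && p3 == &value);
    assert(*p2 == 10 && *p4 == 10);  // p2 and p4 still observe value
    return *p1 + *p3;
}
```

Uncommenting either of the commented-out lines should produce a compile error, which is the quickest way to confirm which part of the declaration the const applies to.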
There are three pieces to this puzzle.
The first piece is that whitespace in C and C++ is normally not significant beyond separating adjacent tokens that are otherwise indistinguishable.
During the preprocessing stage, the source text is broken up into a sequence of tokens - identifiers, punctuators, numeric literals, string literals, etc. That sequence of tokens is later analyzed for syntax and meaning. The tokenizer is "greedy" and will build the longest valid token that's possible. If you write something like
inttest;
the tokenizer only sees two tokens - the identifier inttest followed by the punctuator ;. It doesn't recognize int as a separate keyword at this stage (that happens later in the process). So, for the line to be read as a declaration of an integer named test, we have to use whitespace to separate the identifier tokens:
int test;
The * character is not part of any identifier; it's a separate token (punctuator) on its own. So if you write
int*test;
the compiler sees 4 separate tokens - int, *, test, and ;. Thus, whitespace is not significant in pointer declarations, and all of
int *test;
int* test;
int*test;
int * test;
are interpreted the same way.
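The tokenization argument above can be sketched as a compile-time check. This assumes a C++11 compiler; the names t1 to t4 are hypothetical stand-ins for test so that the four spellings can coexist.

```cpp
#include <type_traits>

// Four spellings of the same declaration. Each one tokenizes as
// 'int', '*', identifier, ';' regardless of the surrounding whitespace.
int *t1;
int* t2;
int*t3;
int     *     t4;

static_assert(std::is_same<decltype(t1), int*>::value, "t1 is int*");
static_assert(std::is_same<decltype(t1), decltype(t2)>::value, "t2 matches t1");
static_assert(std::is_same<decltype(t1), decltype(t3)>::value, "t3 matches t1");
static_assert(std::is_same<decltype(t1), decltype(t4)>::value, "t4 matches t1");

// Whitespace only matters where two tokens would otherwise merge:
// writing 'intt5;' would be the single identifier 'intt5' followed by ';'.
int t5;
static_assert(std::is_same<decltype(t5), int>::value, "t5 is int");
```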
The second piece to the puzzle is how declarations actually work in C and C++¹. Declarations are broken up into two main pieces - a sequence of declaration specifiers (storage class specifiers, type specifiers, type qualifiers, etc.) followed by a comma-separated list of (possibly initialized) declarators. In the declaration
unsigned long int a[10]={0}, *p=NULL, f(void);
the declaration specifiers are unsigned long int and the declarators are a[10]={0}, *p=NULL, and f(void). The declarator introduces the name of the thing being declared (a, p, and f) along with information about that thing's array-ness, pointer-ness, and function-ness. A declarator may also have an associated initializer.
The type of a is "10-element array of unsigned long int". That type is fully specified by the combination of the declaration specifiers and the declarator, and the initial value is specified with the initializer ={0}. Similarly, the type of p is "pointer to unsigned long int", and again that type is specified by the combination of the declaration specifiers and the declarator, and is initialized to NULL. And the type of f is "function returning unsigned long int" by the same reasoning.
This is key - there is no "pointer-to" type specifier, just like there is no "array-of" type specifier, just like there is no "function-returning" type specifier. We can't declare an array as
int[10] a;
because the operand of the [] operator is a, not int. Similarly, in the declaration
int* p;
the operand of * is p, not int. But because the indirection operator is unary and whitespace is not significant, the compiler won't complain if we write it this way. However, it is always interpreted as int (*p);.
Therefore, if you write
int* p, q;
the operand of * is p, so it will be interpreted as
int (*p), q;
Thus, all of
int *test1, test2;
int* test1, test2;
int * test1, test2;
do the same thing - in all three cases, test1 is the operand of * and thus has type "pointer to int", while test2 has type int.
Declarators can get arbitrarily complex. You can have arrays of pointers:
T *a[N];
you can have pointers to arrays:
T (*a)[N];
you can have functions returning pointers:
T *f(void);
you can have pointers to functions:
T (*f)(void);
you can have arrays of pointers to functions:
T (*a[N])(void);
you can have functions returning pointers to arrays:
T (*f(void))[N];
you can have functions returning pointers to arrays of pointers to functions returning pointers to T:
T *(*(*f(void))[N])(void);
and then you have signal:
void (*signal(int, void (*)(int)))(int);
which reads as
signal is a function taking an int and a pointer to a function (taking an int and returning void), and returning a pointer to a function taking an int and returning void
and this just barely scratches the surface of what's possible. But notice that array-ness, pointer-ness, and function-ness are always part of the declarator, not the type specifier.
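To make the declarator forms above a little more concrete, here is a sketch that asks the compiler what each one is. It assumes C++11; T, N and the names a1, a2, a3, f1, f2, f3 are placeholders chosen for this example, and the functions are only declared, never called.

```cpp
#include <cstddef>
#include <type_traits>

typedef int T;             // stand-in element type (assumption)
const std::size_t N = 4;   // stand-in array length (assumption)

T *a1[N];                  // array of N pointers to T
T (*a2)[N];                // pointer to an array of N T
T *f1(void);               // function returning pointer to T
T (*f2)(void);             // pointer to function returning T
T (*a3[N])(void);          // array of N pointers to functions returning T
T (*f3(void))[N];          // function returning pointer to an array of N T

// The outermost category is decided by the declarator, not by 'T'.
static_assert(std::is_array<decltype(a1)>::value,    "a1 is an array");
static_assert(std::is_pointer<decltype(a2)>::value,  "a2 is a pointer");
static_assert(std::is_function<decltype(f1)>::value, "f1 is a function");
static_assert(std::is_pointer<decltype(f2)>::value,  "f2 is a pointer");
static_assert(std::is_array<decltype(a3)>::value,    "a3 is an array");
static_assert(std::is_function<decltype(f3)>::value, "f3 is a function");

// The same types spelled with abstract declarators (no identifier).
static_assert(std::is_same<decltype(a1), T *[N]>::value,      "a1 is T *[N]");
static_assert(std::is_same<decltype(a2), T (*)[N]>::value,    "a2 is T (*)[N]");
static_assert(std::is_same<decltype(f2), T (*)(void)>::value, "f2 is T (*)(void)");
```

The last three checks also show that removing the identifier from a declarator leaves a valid type name, which is how casts and sizeof spell these types.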
One thing to watch out for - const can modify both the pointer type and the pointed-to type:
const int *p;
int const *p;
Both of the above declare p as a pointer to a const int object. You can write a new value to p, setting it to point to a different object (here y, some other object of the pointed-to type):
p = &y;
but you cannot write to the pointed-to object:
*p = 3; // error: the pointed-to object is const
However,
int * const p;
declares p as a const pointer to a non-const int; you can write to the thing p points to
*p = 3;
but you can't set p to point to a different object:
p = &y; // error: p is const
Which brings us to the third piece of the puzzle - why declarations are structured this way.
The intent is that the structure of a declaration should closely mirror the structure of an expression in the code ("declaration mimics use"). For example, let's suppose we have an array of pointers to int named ap, and we want to access the int value pointed to by the i'th element. We would access that value as follows:
*ap[i]
The expression *ap[i] has type int; thus, the declaration of ap is written as
int *ap[N];
The declarator *ap[N] has the same structure as the expression *ap[i]. The operators * and [] behave the same way in a declaration that they do in an expression - [] has higher precedence than unary *, so the operand of * is ap[N] (it's parsed as *(ap[N])).
As another example, suppose we have a pointer to an array of int named pa and we want to access the value of the i'th element. We'd write that as
(*pa)[i]
The type of the expression (*pa)[i] is int, so the declaration is written as
int (*pa)[N];
Again, the same rules of precedence and associativity apply. In this case, we don't want to dereference the i'th element of pa, we want to access the i'th element of what pa points to, so we have to explicitly group the * operator with pa.
The *, [] and () operators are all part of the expression in the code, so they are all part of the declarator in the declaration. The declarator tells you how to use the object in an expression. If you have a declaration like int *p;, that tells you that the expression *p in your code will yield an int value. By extension, it tells you that the expression p yields a value of type "pointer to int", or int *.
So, what about things like cast and sizeof expressions, where we use things like (int *) or sizeof (int [10]) or things like that? There's no declarator, aren't the * and [] operators modifying the type directly?
Well, no - there is still a declarator, just with an empty identifier (known as an abstract declarator). If we represent an empty identifier with the symbol λ, then we can read those things as (int *λ) and sizeof (int λ[10]), and they behave exactly like any other declaration. int *[10] represents an array of 10 pointers, while int (*)[10] represents a pointer to an array.
And now the opinionated portion of this answer. I am not fond of the C++ convention of declaring simple pointers as
T* p;
and consider it bad practice for the following reasons:
[...] T* p, q; [...], all the duplicates to those questions, etc.);
[...] T* a[N] is asymmetrical with use (unless you're in the habit of writing * a[i]);
[...] T* p convention cleanly, which...no);
In the end, it just indicates confused thinking about how the two languages' type systems work.
There are good reasons to declare items separately; working around a bad practice (T* p, q;) isn't one of them. If you write your declarators correctly (T *p, q;) you are less likely to cause confusion.
I consider it akin to deliberately writing all your simple for loops in a form that is syntactically valid, but confusing, and whose intent is likely to be misinterpreted. However, the T* p; convention is entrenched in the C++ community, and I use it in my own C++ code because consistency across the code base is a good thing, but it makes me itch every time I do it.
¹ I will be using C terminology - the C++ terminology is a little different, but the concepts are largely the same.
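The "declaration mimics use" idea from the answer above can be tried out with a short sketch. The function names, the sample values and the length N = 3 are assumptions made for this illustration only.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t N = 3;   // hypothetical array length

// ap is an array of N pointers to int, so the expression *ap[i] is an int.
int via_array_of_pointers(std::size_t i) {
    assert(i < N);
    int x = 10, y = 20, z = 30;
    int *ap[N] = { &x, &y, &z };
    return *ap[i];           // parsed as *(ap[i])
}

// pa is a pointer to an array of N int, so the expression (*pa)[i] is an int.
int via_pointer_to_array(std::size_t i) {
    assert(i < N);
    int arr[N] = { 1, 2, 3 };
    int (*pa)[N] = &arr;
    return (*pa)[i];         // the parentheses group * with pa first
}
```

In both functions the expression in the return statement has the same shape as the declarator a few lines above it, which is the whole point of the convention.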
Use the "Clockwise Spiral Rule" to help parse C/C++ declarations;
Also, declarations should be in separate statements when possible (which is true the vast majority of the time).
As others mentioned, 4, 5, and 6 are the same. Often, people use these examples to make the argument that the * belongs with the variable instead of the type. While it's an issue of style, there is some debate as to whether you should think of and write it this way:
int* x;
or this way:
int *x;
FWIW I'm in the first camp, but the reason others make the argument for the second form is that it (mostly) solves this particular problem:
int* x,y;
which is potentially misleading; instead you would write either
int *x,y;
or if you really want two pointers,
int *x, *y;
Personally, I say keep it to one variable per line, then it doesn't matter which style you prefer.
In 4, 5 and 6, test is always a pointer and test2 is not a pointer. White space is (almost) never significant in C++.
The rationale in C is that you declare the variables the way you use them. For example
char *a[42];
says that *a[42] will be a char, and a[42] a char pointer. Thus a is an array of char pointers.
This is because the original compiler writers wanted to use the same parser for expressions and declarations. (Not a very sensible reason for a language design choice.)
I would say that the initial convention was to put the star on the pointer name side (the right side of the declaration).
In The C Programming Language by Dennis M. Ritchie, the stars are on the right side of the declaration.
Looking at the Linux source code at https://github.com/torvalds/linux/blob/master/init/main.c we can see that the star is also on the right side.
You can follow the same rules, but it's not a big deal if you put stars on the type side.
Remember that consistency is important, so always put the star on the same side, regardless of which side you have chosen.
In my opinion, the answer is BOTH, depending on the situation.
Generally, IMO, it is better to put the asterisk next to the pointer name, rather than the type. Compare e.g.:
int *pointer1, *pointer2; // Fully consistent, two pointers
int* pointer1, pointer2; // Inconsistent: only the first one is a pointer, the second one is an int variable
Why is the second case inconsistent? Because e.g. int x,y; declares two variables of the same type but the type is mentioned only once in the declaration. This creates a precedent and expected behavior. And int* pointer1, pointer2; is inconsistent with that because it declares pointer1 as a pointer, but pointer2 is an integer variable. This is clearly prone to errors and, thus, should be avoided (by putting the asterisk next to the pointer name, rather than the type).
However, there are some exceptions where you might not be able to put the asterisk next to an object name (and where it matters where you put it) without getting an undesired outcome, for example:
MyClass *volatile MyObjName
void test (const char *const p) // const value pointed to by a const pointer
Finally, in some cases, it might be arguably clearer to put the asterisk next to the type name, e.g.:
void* ClassName::getItemPtr () {return &item;} // Clear at first sight
This is more of an addendum to @John Bode's answer, which is a beautiful piece of writing.
As Bode has alluded to, much of the current confusion in C over the placement of the unary operator * in a pointer declaration has a C++ origin.
It is best illustrated by the following paragraph from Jens Gustedt's Modern C (remember G. is a co-editor of the ISO C Standard):
[...]
This is a perversion of K&R, who stated that the use of * in a pointer declaration 'is intended as a mnemonic', but becomes easier to understand when one realises M. Gustedt has a background in C++.
The pointer is a modifier to the type. It's best to read them right to left in order to better understand how the asterisk modifies the type. 'int *' can be read as "pointer to int". In multiple declarations you must specify that each variable is a pointer or it will be created as a standard variable.
1,2 and 3) Test is of type (int *). Whitespace doesn't matter.
4,5 and 6) Test is of type (int *). Test2 is of type int. Again whitespace is inconsequential.
I have always preferred to declare pointers like this:
int* i;
I read this to say "i is of type int-pointer". You can get away with this interpretation if you only declare one variable per declaration.
It is an uncomfortable truth, however, that this reading is wrong. The C Programming Language, 2nd Ed. (p. 94) explains the opposite paradigm, which is the one used in the C standards:
[...]
So, by the reasoning of the C language, when you declare
int* ip, i;
you are not declaring two variables of type int*, you are introducing two expressions that evaluate to an int type, with no attachment to the allocation of an int in memory.
A compiler is perfectly happy to accept the following:
int *ip, i;
i = *ip;
because in the C paradigm, the compiler is only expected to keep track of the type of *ip and i. The programmer is expected to keep track of the meaning of *ip and i. In this case, ip is uninitialized, so it is the programmer's responsibility to point it at something meaningful before dereferencing it.
A good rule of thumb by which a lot of people seem to grasp these concepts: in C++, a lot of semantic meaning is derived from the left-binding of keywords or identifiers.
Take for example:
[...]
The const applies to the "int" word. The same goes for pointers' asterisks: they apply to the keyword to the left of them. And the actual variable name? Yup, that's declared by what's left of it.