What is size_t in C?
I am confused about size_t in C. I know that it is returned by the sizeof operator. But what exactly is it? Is it a data type?
Suppose I have a for loop:
for(i = 0; i < some_size; i++)
Should I use int i; or size_t i;?
From Wikipedia:

As an implication, size_t is a type guaranteed to hold any array index.

size_t
is an unsigned type. So, it cannot represent any negative values (<0). You use it when you are counting something and are sure that it cannot be negative. For example, strlen() returns a size_t because the length of a string has to be at least 0.

In your example, if your loop index is never going to be negative, it might make sense to use size_t, or any other unsigned data type.

When you use a size_t object, you have to make sure that in all the contexts it is used, including arithmetic, you want non-negative values. For example, let's say you have two strings, str1 and str2, with their lengths s1 = strlen(str1) and s2 = strlen(str2) held in size_t variables, and you want to find the difference of the lengths. You cannot simply assign s2 - s1 to a variable diff: the value assigned to diff is never going to be negative, even when s2 < s1, because the calculation is done with unsigned types. In this case, depending upon what your use case is, you might be better off using int (or long long) for s1 and s2.

There are some functions in C/POSIX that could/should use size_t, but don't because of historical reasons. For example, the second parameter to fgets should ideally be size_t, but is int.
size_t is a type that can hold any array index.

Depending on the implementation, it can be any of:
unsigned char
unsigned short
unsigned int
unsigned long
unsigned long long

You can see which one your implementation uses by looking at how size_t is defined in its stddef.h.

If you are the empirical type,
you can check what your own toolchain defines. The result reported here was for Ubuntu 14.04 64-bit with GCC 4.8.

Note that stddef.h is provided by GCC, not glibc; it lives under src/gcc/ginclude/stddef.h in GCC 4.2.

Interesting C99 appearances:

malloc takes size_t as an argument, so it determines the maximum size that may be allocated.

And since it is also returned by sizeof, I think it limits the maximum size of any array.

See also: What is the maximum size of an array in C?
To go into why size_t needed to exist and how we got here:

In pragmatic terms, size_t and ptrdiff_t are 64 bits wide on a 64-bit implementation, 32 bits wide on a 32-bit implementation, and so on. The committee could not force any existing type to mean that, on every compiler, without breaking legacy code.

A size_t or ptrdiff_t is not necessarily the same as an intptr_t or uintptr_t. They were different on certain architectures that were still in use when size_t and ptrdiff_t were added to the Standard in the late 1980s, and that were becoming obsolete, but were not yet gone, when C99 added many new types (16-bit Windows, for example). The x86 in 16-bit protected mode had a segmented memory model in which the largest possible array or structure could be only 65,536 bytes in size, but a far pointer needed to be 32 bits wide, wider than the registers. On those, intptr_t would have been 32 bits wide, but size_t and ptrdiff_t could be 16 bits wide and fit in a register. And who knew what kind of operating system might be written in the future? In theory, the i386 architecture offers a 32-bit segmentation model with 48-bit pointers that no operating system has ever actually used.

The type of a memory offset could not be long, because far too much legacy code assumes that long is exactly 32 bits wide. This assumption was even built into the UNIX and Windows APIs. Unfortunately, a lot of other legacy code also assumed that a long is wide enough to hold a pointer, a file offset, the number of seconds that have elapsed since 1970, and so on. POSIX now provides a standardized way to force the latter assumption to be true instead of the former, but neither is a portable assumption to make.

It couldn't be int, because only a tiny handful of compilers in the '90s made int 64 bits wide. Then they really got weird by keeping long 32 bits wide. The next revision of the Standard declared it illegal for int to be wider than long, but int is still 32 bits wide on most 64-bit systems.

It couldn't be long long int, which was added later anyway, since that was created to be at least 64 bits wide even on 32-bit systems.

So, a new type was needed. Even if it weren't, all those other types meant something other than an offset within an array or object. And if there was one lesson from the fiasco of 32-to-64-bit migration, it was to be specific about what properties a type needed to have, and not to use one that meant different things in different programs.
The manpage for types.h says:
Since nobody has yet mentioned it, the primary linguistic significance of size_t is that the sizeof operator returns a value of that type. Likewise, the primary significance of ptrdiff_t is that subtracting one pointer from another yields a value of that type. Library functions accept size_t because it lets them work with objects whose size exceeds UINT_MAX on systems where such objects could exist, without forcing callers to waste code passing a value larger than unsigned int on systems where unsigned int would suffice for all possible objects.

size_t
and int are not interchangeable. For instance, on 64-bit Linux size_t is 64 bits wide (the same as sizeof(void*)), but int is 32 bits wide.

Also note that size_t is unsigned. If you need a signed version, there is ssize_t on some platforms, and it would be more relevant to your example.

As a general rule, I would suggest using int for generic cases and using size_t/ssize_t only when calculating memory offsets (with mmap(), for example).

size_t
is an unsigned integer data type that can hold only 0 and positive integer values. It measures the size of any object in bytes and is the type returned by the sizeof operator. The examples here declare the size_t variable as const, but the program also runs without const.

size_t is regularly used for array indexing and loop counting. On a 32-bit target it is typically unsigned int; on a 64-bit target it is typically unsigned long or unsigned long long int. The maximum value of size_t therefore depends on the target.

size_t is defined in the <stdio.h> header, and also in the <stddef.h>, <stdlib.h>, <string.h>, <time.h>, and <wchar.h> headers.

Example (with const)
Output:
size = 800

Example (without const)
Output:
size = 800
size_t is a typedef used to represent the size of any object in bytes. (A typedef creates an additional name, or alias, for an existing data type; it does not create a new type.)

It is defined in stddef.h, and also in <stdio.h>.

size_t is the result type of the sizeof operator. Use size_t, together with sizeof, as the data type of an array-size argument.

size_t is guaranteed to be big enough to contain the size of the biggest object the host system can handle.

Note that for an array with automatic storage, the practical size limit is really a matter of the stack size limit of the system where the code is compiled and executed. You should be able to adjust the stack size at link time (see the ld command's --stack-size parameter).

Many C library functions, such as malloc, memcpy and strlen, declare their arguments and return types as size_t.

Counting in size_t lets the programmer work with different types by adding or subtracting a number of elements instead of a byte offset; the compiler scales pointer arithmetic by the element size.

Let's get a deeper appreciation for this by examining pointer arithmetic on a C string and on an integer array. The C string case is not very helpful for understanding the benefit, since a character is one byte regardless of your architecture.

With numerical types the benefit is clearer. size_t is an integer type whose width follows the platform the code is built for. Passing an array of ints together with a size_t count obtained from sizeof shows that an int takes 4 bytes on common platforms (and, at 8 bits per byte, occupies 32 bits).

If we were to create an array of longs, we'd discover that a long takes 64 bits on 64-bit Linux but only 32 bits on 64-bit Windows. Hence, using size_t will save a lot of coding and potential bugs, especially when running C code that performs address arithmetic on different architectures.

So the moral of this story is: "Use size_t and let your C compiler do the error-prone work of pointer arithmetic."

size_t is an unsigned integer data type. On systems using the GNU C Library, this will be unsigned int or unsigned long int. size_t is commonly used for array indexing and loop counting.
In general, if you are starting at 0 and going upward, always use an unsigned type, so that an overflow cannot take you into negative values. This is critically important: if your array bound is larger than the maximum value of your index type, the index overflows before the loop ends. For a signed int that is undefined behavior, and in practice the value often wraps to a negative number and you may get a segmentation fault (SIGSEGV). So, in general, never use int for a loop starting at 0 and going upwards. Use an unsigned type.
This is a platform-specific typedef. For example, on a particular machine it might be unsigned int or unsigned long. You should use this definition to make your code more portable.

size_t, or any unsigned type, is often seen as a loop variable, since loop variables are typically greater than or equal to 0.

When we use a size_t object, we have to make sure that in every context where it is used, including arithmetic, we want only non-negative values. For instance, a program that expects a size_t value to go below zero will give an unexpected result.
From my understanding, size_t is an unsigned integer whose bit size, on common flat-memory platforms, is large enough to hold a pointer of the native architecture. The C standard does not guarantee this; uintptr_t is the type meant to hold a pointer.