What is the difference between a word and a byte?

Posted on 2024-12-09 17:36:03

I've done some research.
A byte is 8 bits and a word is the smallest unit that can be addressed on memory. The exact length of a word varies. What I don't understand is what's the point of having a byte? Why not say 8 bits?

I asked a prof this question and he said most machines these days are byte-addressable, but what would that make a word?

Comments (14)

寂寞陪衬 2024-12-16 17:36:03

Byte: Today, a byte is almost always 8 bits. However, that wasn't always the case, and no standard dictates it. Since 8 bits is a convenient number to work with, it became the de facto standard.

Word: The natural size with which a processor handles data (the register size). The most common word sizes encountered today are 8, 16, 32 and 64 bits, but other sizes are possible. For example, there were a few 36-bit machines, and even 12-bit machines.

The byte is the smallest addressable unit for a CPU. If you want to set/clear single bits, you first need to fetch the corresponding byte from memory, mess with the bits and then write the byte back to memory.
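
The fetch, modify, write-back sequence described above can be sketched in C as follows. This is a minimal illustration, not code from the answer: the names `set_bit`, `clear_bit` and `test_bit` are hypothetical helpers, and bits are assumed to be numbered from the least significant bit of each byte.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Set one bit in a byte buffer: fetch the byte, modify it, write it back. */
void set_bit(unsigned char *buf, size_t bit_index)
{
    assert(buf != NULL);
    size_t byte_index = bit_index / CHAR_BIT;   /* the addressable byte holding the bit */
    unsigned char mask = (unsigned char)(1u << (bit_index % CHAR_BIT));
    unsigned char value = buf[byte_index];      /* fetch */
    value |= mask;                              /* modify */
    buf[byte_index] = value;                    /* write back */
}

/* Clear one bit using the same read-modify-write pattern. */
void clear_bit(unsigned char *buf, size_t bit_index)
{
    assert(buf != NULL);
    size_t byte_index = bit_index / CHAR_BIT;
    unsigned char mask = (unsigned char)(1u << (bit_index % CHAR_BIT));
    unsigned char value = buf[byte_index];
    value &= (unsigned char)~mask;
    buf[byte_index] = value;
}

/* Return 1 if the bit is set, 0 otherwise. */
int test_bit(const unsigned char *buf, size_t bit_index)
{
    assert(buf != NULL);
    return (buf[bit_index / CHAR_BIT] >> (bit_index % CHAR_BIT)) & 1u;
}
```

Even though the caller thinks in terms of a single bit, every access goes through a whole byte, because that is the smallest unit the memory interface exposes.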

By contrast, one definition for word is the biggest chunk of bits with which a processor can do processing (like addition and subtraction) at a time – typically the width of an integer register. That definition is a bit fuzzy, as some processors might have different register sizes for different tasks (integer vs. floating point processing for example) or are able to access fractions of a register. The word size is the maximum register size that the majority of operations work with.

There are also a few processors which have a different pointer size: for example, the 8086 is a 16-bit processor, which means its registers are 16 bits wide. But its pointers (addresses) are 20 bits wide and are calculated by combining two 16-bit registers (a segment and an offset).
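
As a concrete sketch of how the 8086 combines two 16-bit values into a 20-bit address: in real mode the segment is shifted left by 4 bits and the offset is added. The function name `physical_address_8086` is hypothetical, and the final mask models the 20 address lines of the original 8086, where addresses wrap around at 1 MiB.

```c
#include <stdint.h>

/* Real-mode 8086 address calculation: segment * 16 + offset, truncated to 20 bits. */
uint32_t physical_address_8086(uint16_t segment, uint16_t offset)
{
    uint32_t linear = ((uint32_t)segment << 4) + offset;
    return linear & 0xFFFFFu;   /* only 20 address lines, so wrap at 1 MiB */
}
```

For instance, segment 0x1234 with offset 0x0010 gives 0x12340 + 0x0010 = 0x12350. Many different segment:offset pairs map to the same physical address, which is one reason the pointer size and the register size are separate notions on this machine.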


In some manuals and APIs, the term "word" may be "stuck" on a former legacy size and might differ from what's the actual, current word size of a processor when the platform evolved to support larger register sizes. For example, the Intel and AMD x86 manuals still use "word" to mean 16 bits with DWORD (double-word, 32 bit) and QWORD (quad-word, 64 bit) as larger sizes. This is then reflected in some APIs, like Microsoft's WinAPI.

爱人如己 2024-12-16 17:36:03

What I don't understand is what's the point of having a byte? Why not say 8 bits?

Apart from the technical point that a byte isn't necessarily 8 bits from a historical perspective, the reason for having the term is simple human nature:

  • economy of effort (aka laziness) - it is easier to say "byte" rather than "eight bits"

  • tribalism - groups of people like to use jargon / a private language to set them apart from others.

Just go with the flow. You are not going to change 50+ years of accumulated IT terminology and cultural baggage by complaining about it.


The technically correct term to use when you mean "8 bits independent of the hardware architecture" is octet.

呆萌少年 2024-12-16 17:36:03

BYTE

I am trying to answer this question from C++ perspective.

The C++ standard defines ‘byte’ as “Addressable unit of data large enough to hold any member of the basic character set of the execution environment.”

What this means is that the byte consists of at least enough adjacent bits to accommodate the basic character set for the implementation. That is, the number of possible values must equal or exceed the number of distinct characters.
In the United States, the basic character sets are usually the ASCII and EBCDIC sets, each of which can be accommodated by 8 bits.
In addition, the C and C++ standards require CHAR_BIT to be at least 8, so a byte is guaranteed to have at least 8 bits.

In other words, a byte is the amount of memory required to store a single character.

If you want to verify the number of bits per byte in your C++ implementation, check the header <limits.h> (<climits> in C++). It should have an entry like the one below.

#define CHAR_BIT      8         /* number of bits in a char */
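
Rather than reading the header by hand, the value can be checked from code. The following is a small sketch: `CHAR_BIT` and `static_assert` are standard (C11 or later), while `bits_in` is a hypothetical helper introduced only for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Compile-time check: a conforming implementation has at least 8 bits per byte. */
static_assert(CHAR_BIT >= 8, "CHAR_BIT must be at least 8");

/* Number of bits occupied by an object that is size_in_bytes bytes long. */
size_t bits_in(size_t size_in_bytes)
{
    return size_in_bytes * CHAR_BIT;
}
```

Calling `bits_in(sizeof(int))` or `bits_in(sizeof(void *))` reports the widths on the platform at hand; the results differ between implementations, which is exactly why the standard talks about bytes rather than a fixed count of bits.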

WORD

A word is defined as a specific number of bits which can be processed together (i.e. in one operation) by the machine/system.
Alternatively, we can say that the word defines the amount of data that can be transferred between the CPU and RAM in a single operation.

The hardware registers in a computer are word sized.
The word size typically also defines the largest possible memory address (each memory address points to a byte-sized unit of memory).

Note: in C++ programs, a memory address points to a byte of memory and not to a word.

相守太难 2024-12-16 17:36:03

A word is the size of the registers in the processor. This means processor instructions such as add and mul operate on word-sized inputs.

But most modern architectures have memory that is addressable in 8-bit chunks, so it is convenient to use the word "byte".

入画浅相思 2024-12-16 17:36:03

Why not say 8 bits?

Because not all machines have 8-bit bytes. Since you tagged this C, look up CHAR_BIT in limits.h.

淑女气质 2024-12-16 17:36:03

It seems all the answers assume high level languages and mainly C/C++.

But the question is tagged "assembly" and in all assemblers I know (for 8bit, 16bit, 32bit and 64bit CPUs), the definitions are much more clear:

byte  = 8 bits
word  = 2 bytes
dword = 4 bytes = 2 words ("double word")
qword = 8 bytes = 2 dwords = 4 words ("quadruple word")
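
The sizes in the table above can be mirrored in C with fixed-width integer types. This is only a sketch: the type names `byte_t`, `word_t`, `dword_t`, `qword_t` and the helper `make_dword` are hypothetical, chosen to match the assembler terminology, and are not taken from any particular assembler or API.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

typedef uint8_t  byte_t;   /* 8 bits  */
typedef uint16_t word_t;   /* 16 bits */
typedef uint32_t dword_t;  /* 32 bits, "double word"    */
typedef uint64_t qword_t;  /* 64 bits, "quadruple word" */

/* Compile-time checks that the types have the widths listed in the table. */
static_assert(sizeof(byte_t)  * CHAR_BIT == 8,  "byte is 8 bits");
static_assert(sizeof(word_t)  * CHAR_BIT == 16, "word is 16 bits");
static_assert(sizeof(dword_t) * CHAR_BIT == 32, "dword is 32 bits");
static_assert(sizeof(qword_t) * CHAR_BIT == 64, "qword is 64 bits");

/* Build a dword from two words, high word first. */
dword_t make_dword(word_t high, word_t low)
{
    return ((dword_t)high << 16) | low;
}
```

For example, `make_dword(0x1234, 0x5678)` yields 0x12345678, which is the "dword = 2 words" relation expressed as an operation.
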
对岸观火 2024-12-16 17:36:03

In this context, a word is the unit that a machine uses when working with memory. For example, on a 32-bit machine the word is 32 bits long, and on a 64-bit machine it is 64 bits long. The word size determines the address space.

In programming (C/C++), the word is typically represented by a pointer-sized integer type such as intptr_t, which has the same width as a pointer and so abstracts away these details.

Some APIs might confuse you though, such as the Win32 API, because it has types such as WORD (16 bits) and DWORD (32 bits). The reason is that the API initially targeted 16-bit machines, then was ported to 32-bit machines, then to 64-bit machines. To store a pointer, you can use INT_PTR.
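
A short sketch of storing a pointer in an integer follows. It assumes the standard type `intptr_t` from `<stdint.h>`, which is the portable counterpart of the Win32 `INT_PTR` mentioned above; the standard makes `intptr_t` optional, although common platforms provide it. The function names are hypothetical.

```c
#include <stdint.h>

/* Convert a pointer to a pointer-sized signed integer. */
intptr_t pointer_to_integer(void *p)
{
    return (intptr_t)p;
}

/* Convert the integer back; the result compares equal to the original pointer. */
void *integer_to_pointer(intptr_t n)
{
    return (void *)n;
}
```

The round trip through `intptr_t` is what the standard guarantees for `void *`; the numeric value itself is implementation-defined, so code should not rely on what the integer looks like.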

暮凉 2024-12-16 17:36:03

The exact length of a word varies. What I don't understand is what's the point of having a byte? Why not say 8 bits?

Even though the length of a word varies, on all modern machines and even all older architectures that I'm familiar with, the word size is still a multiple of the byte size. So there is no particular downside to using "byte" over "8 bits" in relation to the variable word size.

Beyond that, here are some reasons to use byte (or octet [1]) over "8 bits":

  1. Larger units are just convenient to avoid very large or very small numbers: you might as well ask "why say 3 nanoseconds when you could say 0.000000003 seconds" or "why say 1 kilogram when you could say 1,000 grams", etc.
  2. Beyond the convenience, the unit of a byte is somehow as fundamental as 1 bit since many operations typically work not at the bit level, but at the byte level: addressing memory, allocating dynamic storage, reading from a file or socket, etc.
  3. Even if you were to adopt "8 bit" as a type of unit, so you could say "two 8-bits" instead of "two bytes", it would often be very confusing to have your new unit start with a number. For example, if someone said "one-hundred 8-bits" it could easily be interpreted as 108 bits, rather than 800 bits (one-hundred 8-bits is 100 times 8 bits).

[1] Although I'll consider a byte to be 8 bits for this answer, this isn't universally true: on older machines a byte may have a different size (such as 6 bits). Octet always means 8 bits, regardless of the machine (so this term is often used in defining network protocols). In modern usage, byte is overwhelmingly used as synonymous with 8 bits.

虫児飞 2024-12-16 17:36:03

A group of 8 bits is called a byte (except on certain architectures, where it is not :) ).

A word is a fixed-size group of bits that is handled as a unit by the instruction set and/or hardware of the processor. That means the size of a general-purpose register (which is generally more than a byte) is a word.

In C, a word most often corresponds to an integer => int

盗心人 2024-12-16 17:36:03

Whatever the terminology present in datasheets and compilers, a 'Byte' is eight bits. Let's not try to confuse enquirers and generalities with the more obscure exceptions, particularly as the word 'Byte' comes from the expression "By Eight". I've worked in the semiconductor/electronics industry for over thirty years and not once known 'Byte' used to express anything more than eight bits.

Reference: https://www.os-book.com/OS9/slide-dir/PPT-dir/ch1.ppt

The basic unit of computer storage is the bit. A bit can contain one of two values, 0 and 1. All other storage in a computer is based on collections of bits. Given enough bits, it is amazing how many things a computer can represent: numbers, letters, images, movies, sounds, documents, and programs, to name a few. A byte is 8 bits, and on most computers it is the smallest convenient chunk of storage. For example, most computers don't have an instruction to move a bit but do have one to move a byte. A less common term is word, which is a given computer architecture's native unit of data. A word is made up of one or more bytes. For example, a computer that has 64-bit registers and 64-bit memory addressing typically has 64-bit (8-byte) words. A computer executes many operations in its native word size rather than a byte at a time.

Computer storage, along with most computer throughput, is generally measured and manipulated in bytes and collections of bytes.

  • A kilobyte, or KB, is 1,024 bytes
  • a megabyte, or MB, is 1,024^2 bytes
  • a gigabyte, or GB, is 1,024^3 bytes
  • a terabyte, or TB, is 1,024^4 bytes
  • a petabyte, or PB, is 1,024^5 bytes

Computer manufacturers often round off these numbers and say that a megabyte is 1 million bytes and a gigabyte is 1 billion bytes. Networking measurements are an exception to this general rule; they are given in bits (because networks move data a bit at a time).
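
The powers of 1,024 quoted above are easy to compute, since 1,024 is 2^10. The sketch below uses a hypothetical helper, `binary_unit_bytes`, where power 1 corresponds to a kilobyte, 2 to a megabyte, and so on.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1,024^power bytes: 1 = kilobyte, 2 = megabyte, ..., 5 = petabyte. */
uint64_t binary_unit_bytes(unsigned power)
{
    assert(power <= 6);                  /* 1,024^6 = 2^60 still fits in 64 bits */
    return (uint64_t)1 << (10 * power);  /* 1,024^n equals 2^(10n) */
}
```

A binary megabyte is therefore 1,048,576 bytes, about 4.9% more than the rounded figure of 1,000,000 bytes, and the gap grows with each larger unit.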

偏闹i 2024-12-16 17:36:03

If a machine is byte-addressable and a word is the smallest unit that can be addressed on memory then I guess a word would be a byte!

暖阳 2024-12-16 17:36:03

The terms BYTE and WORD are relative to the size of the processor being referred to. The most common processors are/were 8-bit, 16-bit, 32-bit or 64-bit. These are the WORD lengths of the processor. Actually half of a WORD is a BYTE, whatever the numerical length is. Ready for this? Half of a BYTE is a NIBBLE.

恏ㄋ傷疤忘ㄋ疼 2024-12-16 17:36:03

In fact, in common usage, word has become synonymous with 16 bits, much like byte has with 8 bits. It can get a little confusing, since the "word size" on a 32-bit CPU is 32 bits, but when talking about a word of data, one would mean 16 bits. Microcontrollers with a 32-bit word size have taken to calling their instructions "longs" (supposedly to try to avoid the word/doubleword confusion).
