I'm interested in understanding how a computer allocates variables in physical memory versus files in virtual memory (such as on a hard drive), in terms of how the computer determines where to put data. It almost seems random in both storage types, but it's not, because the computer simply can't put data at a memory address, or at a sector of a hard drive, that is already occupied or allocated to another process. When I was studying how Norton's Speed Disk (a program that defragments files on hard drives) worked on my old W95 system, I noticed from the program's representation of the hard drive's data (a color-coded visual map of the different data types; e.g. swap files were always first, at the top) that it consisted of many files spread out all over the hard drive, with empty, unused areas in between. In addition, in some of these areas I saw what looked like a mix of data and empty space in a spotty pattern. I want to think it's random for that to happen. Likewise, when I was studying the memory addresses of a simple program I wrote in C, I noticed that each version of my program, recompiled after changes, showed different addresses for segments and offsets. I was expecting the computer to use the same addresses when I recompiled it. Sometimes the same addresses would be used; other times they were different. Again, I want to think the memory locations chosen for the program are random. I had thought that memory allocation and file writing were based on the first empty space available, written in a contiguous manner.
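For example, a stripped-down program like the following (the names are placeholders of my own) shows the kind of addresses I was comparing between recompiles; the code, data, stack, and heap each end up in their own region, and it's these printed numbers that kept changing on me:

```c
#include <stdio.h>
#include <stdlib.h>

int global_var;               /* data segment */

void some_function(void) {}   /* code (text) segment */

int main(void)
{
    int local_var;                             /* stack */
    int *heap_var = malloc(sizeof *heap_var);  /* heap */

    printf("code:  %p\n", (void *)some_function);
    printf("data:  %p\n", (void *)&global_var);
    printf("stack: %p\n", (void *)&local_var);
    printf("heap:  %p\n", (void *)heap_var);

    free(heap_var);
    return 0;
}
```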
So my question is: what is it in the inner logic of a common computer that decides where to write its data, in such a seemingly arbitrary manner, for either type of location (physical RAM or disk)? And what area of computer science (if not assembly language) would I need to study to explain this almost-random behavior?
Thanks in Advance
2 Answers
Something broader and directly from computer science would be a linked list. http://en.wikipedia.org/wiki/Linked_list
Imagine you have a linked list and simply add items to the end; these items might initially live linearly in memory, or on disk, or wherever. But then you remove an item in the middle of the list, say by having item number 7 point at item number 9, eliminating item number 8, and that leaves a hole where item 8 used to be. Whether it's memory allocation for mallocs, or virtual memory, or hard drive sector allocation, etc., how quickly you fragment your storage has to do with the algorithm you use for allocating the next item.
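A minimal sketch of that removal in C (my own toy code, not from any particular library): unlinking node 8 is a single pointer assignment, and the freed node becomes a hole at whatever address the allocator had put it.

```c
#include <stdio.h>
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Unlink the node *after* prev: one pointer assignment, and the
 * victim's storage becomes a free hole wherever the allocator put it. */
void remove_after(struct node *prev)
{
    struct node *victim = prev->next;
    if (victim == NULL)
        return;
    prev->next = victim->next;   /* e.g. node 7 now points at node 9 */
    free(victim);                /* node 8's storage is handed back */
}

int main(void)
{
    /* Build a list 1..9. */
    struct node *head = NULL, **tail = &head;
    for (int i = 1; i <= 9; i++) {
        *tail = malloc(sizeof **tail);
        (*tail)->value = i;
        (*tail)->next = NULL;
        tail = &(*tail)->next;
    }

    /* Find node 7 and drop node 8. */
    struct node *seven = head;
    while (seven->value != 7)
        seven = seven->next;
    remove_after(seven);

    for (struct node *p = head; p != NULL; p = p->next)
        printf("%d ", p->value);          /* prints: 1 2 3 4 5 6 7 9 */
    printf("\n");

    while (head != NULL) {                /* clean up */
        struct node *n = head->next;
        free(head);
        head = n;
    }
    return 0;
}
```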
File systems can and do use a linked-list type scheme to keep track of which sectors are tied to a single file. Using the linked list is fast and easy, but you have to deal with fragmentation. A much slower method would be to have no fragmentation, but to be constantly copying/moving files around to keep them on linear sectors.
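The FAT family of file systems works roughly this way: one table where each entry gives the next cluster of the file, or an end marker. A toy sketch in C (the table contents are invented purely for illustration):

```c
#include <stdio.h>

#define END_OF_CHAIN -1

/* Toy "file allocation table": next_cluster[c] is the cluster that
 * follows cluster c in its file, or END_OF_CHAIN. Values invented. */
int next_cluster[16] = {
    [2] = 9, [9] = 3, [3] = END_OF_CHAIN,  /* file A: clusters 2 -> 9 -> 3 */
    [5] = 6, [6] = END_OF_CHAIN,           /* file B: clusters 5 -> 6      */
};

int main(void)
{
    /* Reading file A means hopping through the chain, seeking as we go. */
    for (int c = 2; c != END_OF_CHAIN; c = next_cluster[c])
        printf("read cluster %d\n", c);
    return 0;
}
```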
malloc() allocation schemes and MMU allocation schemes also fall into this category. Basically, any time you take something, slice it up into fractions, and put a virtual interface in front of those fractions, you give the programmer/user the appearance that they are linear. malloc() (not counting the virtual memory provided via the MMU) goes the other way around: it hands out linear chunks built from those fractions to meet each allocation request, with an alloc/free scheme that attempts to keep as many large chunks available as possible, just in case. A bad malloc system is one where half of your memory is free, but the largest malloc() that works without an out-of-memory error is only a small fraction of that memory; say you have a gig free and can only allocate 4096 bytes.
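Here is a toy first-fit allocator in C that shows how that failure mode arises (the sizes and the first-fit policy are just for illustration; real allocators are far more sophisticated): after freeing every other block, half the space is free, yet no request larger than a single hole can succeed.

```c
#include <stdio.h>
#include <string.h>

/* Toy heap: 16 equal units, each marked used (1) or free (0).
 * First fit: a request for n units needs n *contiguous* free units. */
#define UNITS 16
static char used[UNITS];

/* Return the start of the first run of n free units, or -1. */
int alloc_units(int n)
{
    for (int start = 0; start + n <= UNITS; start++) {
        int run = 0;
        while (run < n && !used[start + run])
            run++;
        if (run == n) {
            memset(used + start, 1, n);
            return start;
        }
    }
    return -1;   /* plenty may be free in total, just not contiguous */
}

int main(void)
{
    /* Fill the heap with eight 2-unit blocks... */
    int block[8];
    for (int i = 0; i < 8; i++)
        block[i] = alloc_units(2);

    /* ...then free every other block: 8 of the 16 units are free now. */
    for (int i = 0; i < 8; i += 2)
        memset(used + block[i], 0, 2);

    /* Half the heap is free, but no 4-unit hole exists: prints -1. */
    printf("4-unit alloc -> %d\n", alloc_units(4));
    return 0;
}
```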
You should look at virtual memory, the TLB (translation lookaside buffer), and paging.
It is not trivial to implement virtual memory and paging. The performance of your whole system depends on it. If it's not done properly, your system will thrash.
It is early morning here so Wikipedia will have to do for now: https://en.m.wikipedia.org/wiki/Translation_lookaside_buffer
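To get a feel for the mechanism, here is a toy single-level page-table walk in C (the page size and table contents are invented for illustration); a real MMU does this in hardware and consults the TLB first, so most translations never touch the table at all:

```c
#include <stdio.h>
#include <stdint.h>

#define PAGE_SIZE  4096u   /* 2^12 bytes per page */
#define PAGE_SHIFT 12

/* Toy single-level page table: index = virtual page number,
 * value = physical frame number. The entries are invented. */
static uint32_t page_table[8] = { 5, 2, 7, 0, 3, 1, 6, 4 };

uint32_t translate(uint32_t vaddr)
{
    uint32_t vpn    = vaddr >> PAGE_SHIFT;      /* which virtual page */
    uint32_t offset = vaddr & (PAGE_SIZE - 1);  /* unchanged by paging */
    uint32_t frame  = page_table[vpn];          /* the TLB caches this */
    return (frame << PAGE_SHIFT) | offset;
}

int main(void)
{
    uint32_t v = 0x2ABC;  /* virtual page 2, offset 0xABC */
    printf("virtual 0x%X -> physical 0x%X\n", v, translate(v));  /* 0x7ABC */
    return 0;
}
```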
EDIT:
Those coloured spots you saw in your defrag were chunks on your HDD. Each chunk is of some specified size. Depending on how fragmented your HDD is, you might have portions of your HDD that look like this:
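(a stand-in pattern; each * is an occupied chunk, each - is free space)

```
**--*---**-*--***----*
```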
This (above) could be part of one application/file or multiple files; I will assume one file is split across those chunks to simplify my example. At the end of each * there is a pointer to the next location where the next * chunk is (this is called a linked list). The more fragmented your HDD (or memory) is, the more of these next-chunk pointers you will have. That in turn uses space for pointers instead of for data, and the result is more overhead when reading that data. If this is a file on disk, you will also be doing multiple seeks (which are bad, because they're slow) whenever your data is not grouped together (the principle of locality). When you run defrag, it moves chunks around and groups them together (as best it can).
The OS decides paging and virtual memory addressing (and such). The TLB is hardware (a cache) that aids this process: it caches the virtual-to-physical address mappings for fast lookup. The CPU consults the TLB through its MMU.