How can I tell whether a copy-on-write page is an actual copy?

Posted on 2024-10-07 22:05:42

When I create a copy-on-write mapping (MAP_PRIVATE) using mmap, some pages of this mapping are copied as soon as I write to specific addresses. At a certain point in my program I would like to figure out which pages have actually been copied. There is a call named mincore, but it only reports whether a page is in memory, which is not the same as whether the page has been copied.

Is there some way to figure out which pages have been copied?
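
To illustrate why mincore does not answer the question, here is a minimal sketch. The function name mincore_demo and the temporary file path are made up for this example. It maps a four-page file privately, writes to pages 0 and 2, only reads page 3, and then asks mincore about residency. Pages that were written (and therefore copied) and pages that were merely read should both come back as resident.

```c
#define _DEFAULT_SOURCE            /* for mincore() and mkstemp() on glibc */
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical demo: fills resident[i] with 1 if page i is in memory.
 * Returns 0 on success, -1 if the setup or mincore() failed. */
int mincore_demo(unsigned char resident[4])
{
    size_t ps = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = 4 * ps;
    char path[] = "/tmp/mincore_demo_XXXXXX";   /* assumed scratch location */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                               /* file disappears when closed */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    char *m = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd);
    if (m == MAP_FAILED)
        return -1;

    m[0] = 'a';                                 /* page 0: written, so copied */
    m[2 * ps] = 'b';                            /* page 2: written, so copied */
    volatile char c = m[3 * ps];                /* page 3: read only, not copied */
    (void)c;

    unsigned char vec[4];
    int rc = mincore(m, len, vec);
    for (int i = 0; i < 4; i++)
        resident[i] = (unsigned char)(vec[i] & 1);  /* only bit 0 is defined */

    munmap(m, len);
    return rc == 0 ? 0 : -1;
}
```

Pages 0, 2 and 3 all report as resident, so the result cannot distinguish a copied page from one that is still backed by the file. Page 1 may or may not be resident, depending on kernel read-ahead.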

Comments (6)

街角卖回忆 2024-10-14 22:05:43

Following MarkR's advice, I gave the pagemap and kpageflags interfaces a try. Below is a quick test that checks whether a page in memory is flagged 'SWAPBACKED', as the kernel calls it. One problem remains, of course: kpageflags is only accessible to root.

#define _GNU_SOURCE // for lseek64() and off64_t
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

int main(int argc, char* argv[])
{
  unsigned long long pagesize=getpagesize();
  assert(pagesize>0);
  int pagecount=4;
  int filesize=pagesize*pagecount;
  // Open the test file, creating it on first use
  int fd=open("test.dat", O_RDWR);
  if (fd<0)
    {
      fd=open("test.dat", O_CREAT|O_RDWR,S_IRUSR|S_IWUSR);
      printf("Created test.dat testfile\n");
    }
  assert(fd>=0); // 0 is a valid descriptor; only negative values signal failure
  int err=ftruncate(fd,filesize);
  assert(!err);

  // Private (copy-on-write) mapping of the whole file
  char* M=(char*)mmap(NULL, filesize, PROT_READ|PROT_WRITE, MAP_PRIVATE,fd,0);
  assert(M!=MAP_FAILED);
  printf("Successfully created private mapping\n");

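The test setup contains 4 pages. Pages 0 and 2 are dirty:

  strcpy(M,"I feel so dirty\n");
  strcpy(M+pagesize*2,"Christ on crutches\n");

Page 3 has been read from:

  volatile char t=M[pagesize*3]; // volatile so the read is not optimized away
  (void)t;

Page 1 is never accessed.

The pagemap file maps the process's virtual pages to physical page frames, whose flags can then be looked up in the global kpageflags file. See /usr/src/linux/Documentation/vm/pagemap.txt for details. Note that on kernels 4.2 and later the page frame number in pagemap reads as zero unless the process has CAP_SYS_ADMIN, so the whole test effectively needs root.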

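  int mapfd=open("/proc/self/pagemap",O_RDONLY);
  assert(mapfd>=0);
  // pagemap holds one 64-bit entry per virtual page, indexed by virtual page number
  unsigned long long target=((unsigned long)(void*)M)/pagesize;
  // The offset does not fit in an int, so keep it in a 64-bit type
  off64_t off=lseek64(mapfd, target*8, SEEK_SET);
  assert(off==(off64_t)(target*8));
  assert(sizeof(long long)==8);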

Here we read the pagemap entry, which contains the page frame number, for each of our virtual pages:

  unsigned long long page2pfn[pagecount];
  err=read(mapfd,page2pfn,sizeof(long long)*pagecount);
  if (err<0)
    perror("Reading pagemap");
  if(err!=pagecount*8)
    printf("Could only read %d bytes\n",err);

Now, for each virtual page that is present, we read the actual page flags from kpageflags:

  int pageflags=open("/proc/kpageflags",O_RDONLY);
  assert(pageflags>=0);
  for(int i = 0 ; i < pagecount; i++)
    {
      unsigned long long v2a=page2pfn[i];
      printf("Page: %d, pagemap entry %llx\n",i,page2pfn[i]);

      if(v2a&0x8000000000000000ULL) // Bit 63: is the virtual page present?
        {
        unsigned long long pfn=v2a&0x7fffffffffffffULL; // Bits 0-54: page frame number
        off64_t off=lseek64(pageflags,pfn*8,SEEK_SET);
        assert(off==(off64_t)(pfn*8));
        unsigned long long pf;
        err=read(pageflags,&pf,8);
        assert(err==8);
        // Bit 14 of a kpageflags entry is KPF_SWAPBACKED
        printf("pageflags are %llx with SWAPBACKED: %d\n",pf,(int)((pf>>14)&1));
        }
    }
}

All in all, I'm not particularly happy with this approach: it requires access to a file that we generally can't access, and it is bloody complicated (how about a simple kernel call to retrieve the page flags?).
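
A possible way around the root requirement, sketched here under assumptions rather than taken from the answer above: the kernel's pagemap documentation describes flag bits in each pagemap entry (bit 63 present, bit 62 swapped, and, since Linux 3.5, bit 61 "file page or shared-anon"), and these flags remain readable by an unprivileged process even when the frame number is hidden. For a private mapping of a regular on-disk file, a present page with bit 61 clear should therefore be an anonymous page, that is, a copy. The names cow_page_is_copied and cow_demo are hypothetical, and files on tmpfs are assumed to be out of scope.

```c
#define _DEFAULT_SOURCE            /* for pread() and mkstemp() on glibc */
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper for a MAP_PRIVATE file mapping in this process.
 * Returns 1 if the page holding addr is a private copy, 0 if it is still
 * file-backed or not mapped yet, -1 if pagemap could not be read. */
int cow_page_is_copied(const void *addr)
{
    long ps = sysconf(_SC_PAGESIZE);
    int fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0)
        return -1;

    uint64_t entry = 0;
    off_t off = (off_t)((uintptr_t)addr / (uintptr_t)ps) * (off_t)sizeof entry;
    ssize_t n = pread(fd, &entry, sizeof entry, off);
    close(fd);
    if (n != (ssize_t)sizeof entry)
        return -1;

    int present = (int)((entry >> 63) & 1);  /* bit 63: page is in RAM */
    int swapped = (int)((entry >> 62) & 1);  /* bit 62: page is in swap */
    int file    = (int)((entry >> 61) & 1);  /* bit 61: file page or shared-anon */

    if (swapped)
        return 1;  /* assumption: for a disk-backed file only copies are swapped */
    return (present && !file) ? 1 : 0;
}

/* Hypothetical demo mirroring the test above: pages 0 and 2 are written,
 * page 3 is read, page 1 is untouched. Returns 0 on success, -1 on error. */
int cow_demo(int copied[4])
{
    size_t ps = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = 4 * ps;
    char path[] = "/tmp/cow_demo_XXXXXX";    /* assumed scratch location */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }
    char *m = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd);
    if (m == MAP_FAILED)
        return -1;

    m[0] = 'a';
    m[2 * ps] = 'b';
    volatile char c = m[3 * ps];
    (void)c;

    int rc = 0;
    for (int i = 0; i < 4; i++) {
        copied[i] = cow_page_is_copied(m + (size_t)i * ps);
        if (copied[i] < 0)
            rc = -1;
    }
    munmap(m, len);
    return rc;
}
```

The result is a snapshot: a page can be copied or swapped right after the entry was read, so treat the answer as advisory unless writers are stopped while you scan.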

我要还你自由 2024-10-14 22:05:43

I usually use mprotect to set my tracked copy-on-write pages to read-only, then handle the resulting SIGSEGVs by marking the given page dirty and enabling writing.

It isn't ideal, but the overhead is quite manageable, and it can be combined with mincore and similar calls for more complicated optimizations, such as managing your working set size or approximating pointer information for pages you expect to be swapped out. This lets the runtime system cooperate with the kernel rather than fight it.
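
The mprotect approach described above could look roughly like the following sketch. All names (track_start, track_is_dirty, TRACK_MAX_PAGES) are invented for illustration, it tracks a single region only, and it calls mprotect from inside a signal handler, which works on Linux in practice but is not on the POSIX list of async-signal-safe functions.

```c
#define _DEFAULT_SOURCE            /* for sigaction() details and MAP_ANONYMOUS */
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define TRACK_MAX_PAGES 1024       /* arbitrary limit for this sketch */

static char *track_base;           /* start of the tracked region */
static size_t track_pages;         /* number of tracked pages */
static size_t track_pagesize;
static volatile unsigned char track_dirty[TRACK_MAX_PAGES];

/* SIGSEGV handler: record the first write to a page, then allow writes. */
static void track_on_segv(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t lo = (uintptr_t)track_base;
    if (addr < lo || addr >= lo + track_pages * track_pagesize) {
        /* Not ours: restore the default action so the retried access
         * terminates the process as a real crash would. */
        signal(SIGSEGV, SIG_DFL);
        return;
    }
    size_t idx = (addr - lo) / track_pagesize;
    track_dirty[idx] = 1;
    mprotect(track_base + idx * track_pagesize, track_pagesize,
             PROT_READ | PROT_WRITE);
}

/* Write-protect `pages` pages starting at the page-aligned address `base`
 * and start recording writes. Returns 0 on success, -1 on failure. */
int track_start(void *base, size_t pages)
{
    if (pages > TRACK_MAX_PAGES)
        return -1;
    track_pagesize = (size_t)sysconf(_SC_PAGESIZE);
    track_base = base;
    track_pages = pages;
    for (size_t i = 0; i < TRACK_MAX_PAGES; i++)
        track_dirty[i] = 0;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = track_on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    return mprotect(base, pages * track_pagesize, PROT_READ);
}

/* Returns 1 if the given page has been written since track_start(). */
int track_is_dirty(size_t page)
{
    return page < track_pages && track_dirty[page] ? 1 : 0;
}
```

Call track_start on the mapping right after mmap; from then on track_is_dirty(i) tells you whether page i has been written, which for a MAP_PRIVATE mapping means it has been copied. Each page costs one fault on its first write and nothing afterwards.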

自由如风 2024-10-14 22:05:43

It is not easy, but it is possible to determine this. To find out whether a page is a copy of another page (possibly another process's), you need to do the following (on recent-ish kernels):

  1. Read the entry in /proc/pid/pagemap for the appropriate pages in the process(es)
  2. Interrogate /proc/kpageflags

You can then determine whether two pages are actually the same page in memory.

It is fairly tricky to do, you need to be root, and whatever you do will probably have some race conditions in it, but it is possible.
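
The comparison step could be sketched as follows. The helper names are hypothetical; the bit layout (bit 63 present, bits 0-54 frame number) is taken from the kernel's pagemap documentation. Without sufficient privileges recent kernels report the frame number as zero, so the sketch treats a zero frame number as "unknown" rather than as a match.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#define PM_PRESENT  (1ULL << 63)          /* page is present in RAM */
#define PM_PFN_MASK ((1ULL << 55) - 1)    /* bits 0-54: page frame number */

/* Hypothetical helper: read the pagemap entry for vaddr in process pid.
 * Returns 0 on success, -1 on failure (for example, missing permissions). */
int read_pagemap_entry(pid_t pid, uintptr_t vaddr, uint64_t *entry)
{
    char path[64];
    snprintf(path, sizeof path, "/proc/%ld/pagemap", (long)pid);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    long ps = sysconf(_SC_PAGESIZE);
    off_t off = (off_t)(vaddr / (uintptr_t)ps) * (off_t)sizeof *entry;
    ssize_t n = pread(fd, entry, sizeof *entry, off);
    close(fd);
    return n == (ssize_t)sizeof *entry ? 0 : -1;
}

/* Returns 1 if both entries are present and name the same physical frame.
 * A frame number of 0 is treated as unknown and never matches. */
int same_physical_page(uint64_t a, uint64_t b)
{
    if (!(a & PM_PRESENT) || !(b & PM_PRESENT))
        return 0;
    uint64_t pfn_a = a & PM_PFN_MASK;
    uint64_t pfn_b = b & PM_PFN_MASK;
    return (pfn_a != 0 && pfn_a == pfn_b) ? 1 : 0;
}
```

Read one entry per process with read_pagemap_entry and pass both to same_physical_page. Because the two reads are not atomic, a page may be copied or reclaimed in between, which is the race mentioned above.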

绝不放开 2024-10-14 22:05:43

Copy-on-write is implemented using the memory protection scheme of the virtual memory hardware.

When a read-only page is written to, a page fault occurs. The page fault handler checks whether the page carries the copy-on-write flag: if so, a new page is allocated, the contents of the old page are copied into it, and the write is retried.

The new page is neither read-only nor copy-on-write; the link to the original page is completely broken.

So all you need to do is test the memory protection flags for the page.

On Windows, the API is GetWorkingSet; see the explanation at VirtualQueryEx. I don't know what the corresponding Linux API is.
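
The "link is broken" behaviour described above can be observed from user space on Linux with a small experiment. This is only an illustrative sketch with a made-up function name and scratch path: it writes through a MAP_PRIVATE mapping and then re-reads the file through the descriptor to confirm that the file still holds the old bytes while the mapping shows the new ones.

```c
#define _DEFAULT_SOURCE            /* for pread() and mkstemp() on glibc */
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical demo. Returns 1 if a write through a private mapping left
 * the file untouched while the mapping shows the new data, 0 if not,
 * -1 if the setup failed. */
int private_write_leaves_file_untouched(void)
{
    static const char original[] = "original";
    static const char modified[] = "modified";
    char path[] = "/tmp/cow_write_XXXXXX";     /* assumed scratch location */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);
    if (write(fd, original, sizeof original) != (ssize_t)sizeof original) {
        close(fd);
        return -1;
    }

    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    char *m = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (m == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* This write faults, and the kernel gives the process its own copy. */
    memcpy(m, modified, sizeof modified);

    char buf[sizeof original];
    ssize_t n = pread(fd, buf, sizeof buf, 0);
    int file_unchanged = n == (ssize_t)sizeof buf &&
                         memcmp(buf, original, sizeof original) == 0;
    int mapping_changed = memcmp(m, modified, sizeof modified) == 0;

    munmap(m, len);
    close(fd);
    return (file_unchanged && mapping_changed) ? 1 : 0;
}
```

This shows the effect of the copy but not which pages were copied; on Linux the protection of the mapping as a whole stays read-write throughout, so a per-page answer has to come from elsewhere.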

阳光的暖冬 2024-10-14 22:05:43

I gave an answer to someone with a similar goal and referenced a question similar to yours.

I think bmargulies' answer to that question fits what you need perfectly when the two ideas are combined.

唔猫 2024-10-14 22:05:43

I don't recall such an API being exported. Why do you want to do such a thing? (What is the root of the problem you're solving?)

You might want to take a look at /proc/[pid]/smaps, which provides somewhat detailed statistics on pages used, copied, and stored.

Again, why would you want to do that? If you're sure this approach is the only one (usually, virtual memory is used and forgotten about), you might want to consider writing a kernel module that handles such functionality.
