如何知道写时复制页面是否是实际副本？

发布于 2024-10-07 22:05:42 字数 164 浏览 10 评论 0原文

当我使用 mmap 创建写时复制映射（MAP_PRIVATE）时，一旦我写入特定地址，该映射的某些页面就会被复制。在我的程序中的某个时刻，我想弄清楚哪些页面实际上已被复制。有一个称为“mincore”的调用，但它仅报告页面是否在内存中，这与是否复制页面不同。

有什么方法可以找出哪些页面已被复制吗？

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

街角卖回忆 2024-10-14 22:05:43

很好，遵循 MarkR，我尝试过浏览 pagemap 和 kpageflags 界面。下面是一个快速测试，用于检查页面在调用时是否在内存中“交换”。当然还有一个问题，那就是 kpageflags 只能由 root 访问。

int main(int argc, char* argv[])
{
  unsigned long long pagesize=getpagesize();
  assert(pagesize>0);
  int pagecount=4;
  int filesize=pagesize*pagecount;
  int fd=open("test.dat", O_RDWR);
  if (fd<=0)
    {
      fd=open("test.dat", O_CREAT|O_RDWR,S_IRUSR|S_IWUSR);
      printf("Created test.dat testfile\n");
    }
  assert(fd);
  int err=ftruncate(fd,filesize);
  assert(!err);

  char* M=(char*)mmap(NULL, filesize, PROT_READ|PROT_WRITE, MAP_PRIVATE,fd,0);
  assert(M!=(char*)-1);
  assert(M);
  printf("Successfully create private mapping\n");

测试设置包含 4 页。页 0 和 2 是脏

  strcpy(M,"I feel so dirty\n");
  strcpy(M+pagesize*2,"Christ on crutches\n");

页 3 已被读取。

  char t=M[pagesize*3];

第 1 页将不会被访问

页面映射文件将进程的虚拟内存映射到实际页面，然后可以从全局 kpageflags 文件中检索这些页面。 读取文件 /usr/src/linux/Documentation/vm/pagemap.txt

  int mapfd=open("/proc/self/pagemap",O_RDONLY);
  assert(mapfd>0);
  unsigned long long target=((unsigned long)(void*)M)/pagesize;
  err=lseek64(mapfd, target*8, SEEK_SET);
  assert(err==target*8);
  assert(sizeof(long long)==8);

这里我们读取每个虚拟页面的页框编号

  unsigned long long page2pfn[pagecount];
  err=read(mapfd,page2pfn,sizeof(long long)*pagecount);
  if (err<0)
    perror("Reading pagemap");
  if(err!=pagecount*8)
    printf("Could only read %d bytes\n",err);

现在我们将读取每个虚拟页面的实际页框编号pageflags

  int pageflags=open("/proc/kpageflags",O_RDONLY);
  assert(pageflags>0);
  for(int i = 0 ; i < pagecount; i++)
    {
      unsigned long long v2a=page2pfn[i];
      printf("Page: %d, flag %llx\n",i,page2pfn[i]);

      if(v2a&0x8000000000000000LL) // Is the virtual page present ?
        {
        unsigned long long pfn=v2a&0x3fffffffffffffLL;
        err=lseek64(pageflags,pfn*8,SEEK_SET);
        assert(err==pfn*8);
        unsigned long long pf;
        err=read(pageflags,&pf,8);
        assert(err==8);
        printf("pageflags are %llx with SWAPBACKED: %d\n",pf,(pf>>14)&1);
        }
    }
}

总而言之，我对这种方法不是特别满意，因为它需要访问我们通常无法访问的文件，而且非常复杂（一个简单的内核调用来检索 pageflags 怎么样？）。

Good, following the advice of MarkR, I gave it a shot to go through the pagemap and kpageflags interface. Below a quick test to check whether a page is in memory 'SWAPBACKED' as it is called. One problem remains of course, which is the problem that kpageflags is only accessible to the root.

int main(int argc, char* argv[])
{
  unsigned long long pagesize=getpagesize();
  assert(pagesize>0);
  int pagecount=4;
  int filesize=pagesize*pagecount;
  int fd=open("test.dat", O_RDWR);
  if (fd<=0)
    {
      fd=open("test.dat", O_CREAT|O_RDWR,S_IRUSR|S_IWUSR);
      printf("Created test.dat testfile\n");
    }
  assert(fd);
  int err=ftruncate(fd,filesize);
  assert(!err);

  char* M=(char*)mmap(NULL, filesize, PROT_READ|PROT_WRITE, MAP_PRIVATE,fd,0);
  assert(M!=(char*)-1);
  assert(M);
  printf("Successfully create private mapping\n");

The test setup contains 4 pages. page 0 and 2 are dirty

  strcpy(M,"I feel so dirty\n");
  strcpy(M+pagesize*2,"Christ on crutches\n");

page 3 has been read from.

  char t=M[pagesize*3];

page 1 will not be accessed

The pagemap file maps the process its virtual memory to actual pages, which can then be retrieved from the global kpageflags file later on. Read the file /usr/src/linux/Documentation/vm/pagemap.txt

  int mapfd=open("/proc/self/pagemap",O_RDONLY);
  assert(mapfd>0);
  unsigned long long target=((unsigned long)(void*)M)/pagesize;
  err=lseek64(mapfd, target*8, SEEK_SET);
  assert(err==target*8);
  assert(sizeof(long long)==8);

Here we read the page frame numbers for each of our virtual pages

  unsigned long long page2pfn[pagecount];
  err=read(mapfd,page2pfn,sizeof(long long)*pagecount);
  if (err<0)
    perror("Reading pagemap");
  if(err!=pagecount*8)
    printf("Could only read %d bytes\n",err);

Now we are about to read for each virtual frame, the actual pageflags

  int pageflags=open("/proc/kpageflags",O_RDONLY);
  assert(pageflags>0);
  for(int i = 0 ; i < pagecount; i++)
    {
      unsigned long long v2a=page2pfn[i];
      printf("Page: %d, flag %llx\n",i,page2pfn[i]);

      if(v2a&0x8000000000000000LL) // Is the virtual page present ?
        {
        unsigned long long pfn=v2a&0x3fffffffffffffLL;
        err=lseek64(pageflags,pfn*8,SEEK_SET);
        assert(err==pfn*8);
        unsigned long long pf;
        err=read(pageflags,&pf,8);
        assert(err==8);
        printf("pageflags are %llx with SWAPBACKED: %d\n",pf,(pf>>14)&1);
        }
    }
}

All in all, I'm not particularly happy with this approach since it requires access to a file that we in general can't access and it is bloody complicated (how about a simple kernel call to retrieve the pageflags ?).

回复收藏 0 原文