Accessing an mmap region through a struct pointer

Posted on 2024-11-16 16:01:29

If I access a memory-mapped file through a pointer to a structure type that contains padding holes, the structure members may not line up with the intended data. For example:

#include <fcntl.h>
#include <stdio.h>      /* printf */
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>

typedef union{
    int a;
    char c[4];
}INT;

typedef struct{
    char type;
    INT data;
}RECORD;

int main(){
    int fd;
    RECORD *recPtr;
    fd = open("./f1", O_RDWR);
    if (fd == -1){
            printf("Open Failed!\n");
            return 1;   /* do not continue with an invalid descriptor */
    }
    /* sizeof yields a size_t, so print it with %zu */
    printf("Size of RECORD: %zu\n", sizeof(RECORD));
    recPtr = (RECORD *)mmap(0, 2*sizeof(RECORD), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (recPtr == MAP_FAILED){
            printf("Map Failed!\n");
            return 1;   /* do not dereference MAP_FAILED */
    }
    printf("type: %c, data: %c%c%c%c\n", recPtr->type, recPtr->data.c[0], recPtr->data.c[1], recPtr->data.c[2], recPtr->data.c[3]);
    return 0;
}

If the file "f1" contains the following data:

012345678

The above program prints:

Size of RECORD: 8
type: 0, data: 4567

since the characters 123 fall into the padding hole between type and data.
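
To see where the hole sits, the layout can be inspected with offsetof. The following is a minimal sketch, not part of the original program; the helper names record_hole_size and print_record_layout are made up for illustration, and the numbers it reports depend on the compiler and platform.

```c
#include <stddef.h>
#include <stdio.h>

typedef union {
    int a;
    char c[4];
} INT;

typedef struct {
    char type;
    INT data;
} RECORD;

/* Number of padding bytes the compiler placed between `type` and `data`. */
static size_t record_hole_size(void)
{
    return offsetof(RECORD, data) - (offsetof(RECORD, type) + sizeof(char));
}

/* Print where each member actually lives inside RECORD on this platform. */
static void print_record_layout(void)
{
    printf("offset of type: %zu\n", offsetof(RECORD, type));
    printf("offset of data: %zu\n", offsetof(RECORD, data));
    printf("padding bytes : %zu\n", record_hole_size());
    printf("sizeof(RECORD): %zu\n", sizeof(RECORD));
}
```

On a platform where int is 4 bytes and 4-byte aligned, data would sit at offset 4 with 3 padding bytes in front of it, which is consistent with the size of 8 and the skipped characters shown above.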

Is there a way to avoid this without using the #pragma pack directive and without changing the order of the members in the structure?

Comments (2)

请你别敷衍 2024-11-23 16:01:29

You basically have the following options:

  1. Accept the padding. This is fine (and the fastest option) as long as your data does not need to be portable across architectures.
  2. Use __attribute__((packed)) or similar to control padding inserted by the compiler (recommended, but requires that you use compiler extensions)
  3. Access the data manually at the byte level, without using structs. E.g.:

    char type;
    int data;
    
    memcpy(&type, ((char *)recPtr), 1);
    memcpy(&data, ((char *)recPtr) + 1, sizeof(data));
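
The byte-level option can be wrapped in a small helper so that the record stride in the file no longer depends on sizeof(RECORD). This is a rough sketch under an assumed file layout of one type byte followed by four data bytes with no padding; DISK_RECORD_SIZE, record_view and read_record are hypothetical names, not anything from the question.

```c
#include <stddef.h>
#include <string.h>

/* Assumed on-disk layout: 1 type byte followed by 4 data bytes, no padding. */
enum { DISK_RECORD_SIZE = 5 };

/* In-memory copy of one record; any padding inside it no longer matters. */
typedef struct {
    char type;
    char data[4];
} record_view;

/* Copy record number `index` out of the mapped bytes at `base`.
 * The caller must ensure the mapping holds at least (index + 1) records. */
static record_view read_record(const void *base, size_t index)
{
    const unsigned char *p =
        (const unsigned char *)base + index * DISK_RECORD_SIZE;
    record_view r;

    memcpy(&r.type, p, 1);                 /* byte 0 of the record */
    memcpy(r.data, p + 1, sizeof r.data);  /* bytes 1..4 of the record */
    return r;
}
```

With the mapping from the question, read_record(recPtr, 0) would yield type '0' and data bytes 1234 for the file shown there, because the offsets are computed by hand rather than taken from the struct layout.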
    
撑一把青伞 2024-11-23 16:01:29

Reading binary data directly into structures is a recipe for disaster. It means you're making assumptions about the structure of some input without verification; of course, you could check the structure for integrity afterwards. But more often than not you'll have to make architecture-dependent adjustments to the input data. Think little endian vs. big endian, different word lengths, packing rules, etc.
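
As an illustration of the endianness point, an integer can be assembled from individual bytes so that the result does not depend on the byte order of the machine running the code. This is a minimal sketch; load_le32 and load_be32 are illustrative names, and which of the two applies depends on how the file was written.

```c
#include <stdint.h>

/* Decode a 32-bit unsigned integer stored least-significant byte first. */
static uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Decode a 32-bit unsigned integer stored most-significant byte first. */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         | (uint32_t)p[3];
}
```

Because the shifts operate on values rather than on memory, the same source gives the same result on little-endian and big-endian hosts.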

To make a long story short: don't fall for the dark side and its seductive promise of quick hacks.

The only proper way to read a file is to read it octet by octet; you can read larger chunks into a buffer, of course, but you should then process them by looking at every single bit. If you worry about performance, you should read Volume 1 and what has been released so far of Volume 4 of "The Art of Computer Programming", which explains in depth how to process data streams efficiently without neglecting any data.
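
One way to read this advice is a parser that walks a byte buffer and checks its length before touching any field. The sketch below assumes a hypothetical wire format of one type byte followed by a 32-bit little-endian value; parsed_record, parse_record and WIRE_RECORD_SIZE are invented for the example and do not come from the question.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed wire format: 1 type byte + 4-byte little-endian value. */
enum { WIRE_RECORD_SIZE = 5 };

typedef struct {
    char type;
    uint32_t value;
} parsed_record;

/* Parse one record from buf[0..len).
 * Returns the number of bytes consumed, or 0 if the buffer is too short
 * to hold a complete record (in which case *out is left untouched). */
static size_t parse_record(const unsigned char *buf, size_t len,
                           parsed_record *out)
{
    if (len < WIRE_RECORD_SIZE)
        return 0;

    out->type = (char)buf[0];
    out->value = (uint32_t)buf[1]
               | ((uint32_t)buf[2] << 8)
               | ((uint32_t)buf[3] << 16)
               | ((uint32_t)buf[4] << 24);
    return WIRE_RECORD_SIZE;
}
```

A caller could loop over a mapped region or a read buffer by advancing its position by the returned count and stopping when 0 comes back, so a truncated file is detected instead of being read past its end.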

Or use Google's Protocol Buffers.
