Accessing an mmap region through a struct pointer
If I access a memory-mapped file through a pointer to a structure type that contains padding holes, the structure members may not line up with the correct bytes of the file. For example:
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>

typedef union {
    int a;
    char c[4];
} INT;

typedef struct {
    char type;
    INT data;
} RECORD;

int main(void) {
    int fd;
    RECORD *recPtr;

    fd = open("./f1", O_RDWR);
    if (fd == -1) {
        printf("Open Failed!\n");
        return 1;
    }
    /* sizeof yields a size_t, so print it with %zu */
    printf("Size of RECORD: %zu\n", sizeof(RECORD));
    /* Map enough bytes for two RECORDs, starting at file offset 0 */
    recPtr = (RECORD *)mmap(0, 2 * sizeof(RECORD), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (recPtr == MAP_FAILED) {
        printf("Map Failed!\n");
        return 1;
    }
    printf("type: %c, data: %c%c%c%c\n", recPtr->type, recPtr->data.c[0], recPtr->data.c[1], recPtr->data.c[2], recPtr->data.c[3]);
    return 0;
}
If the file "f1" contains the following data:
012345678
then the above program prints:
Size of RECORD: 8
type: 0, data: 4567
because the characters 123 fall into the padding hole between type and data.
Is there a way to avoid this without using the pragma pack directive and without changing the order of the members in the structure?
Comments (2)
You basically have the following options:
1. Use __attribute__((packed)) or a similar mechanism to control the padding inserted by the compiler (recommended, but requires compiler extensions).
2. Access the data manually at the byte level, without using structs.
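As a rough illustration of the byte-level option, here is a minimal sketch. It assumes the file stores each record as one type byte followed directly by four data bytes (5 bytes per record, no padding). The helper name load_record and the constant DISK_RECORD_SIZE are made up for this example and do not come from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same types as in the question. */
typedef union {
    int  a;
    char c[4];
} INT;

typedef struct {
    char type;
    INT  data;
} RECORD;

/* Assumed on-disk layout: 1 type byte followed directly by 4 data bytes. */
enum { DISK_RECORD_SIZE = 1 + 4 };

/* Hypothetical helper: copy record number `index` out of the raw bytes.
 * The in-memory RECORD keeps its normal padding; only the file offsets
 * are computed by hand, so the compiler's layout never touches the file. */
static RECORD load_record(const unsigned char *base, size_t index)
{
    assert(base != NULL);
    const unsigned char *p = base + index * DISK_RECORD_SIZE;
    RECORD rec;
    rec.type = (char)p[0];
    memcpy(rec.data.c, p + 1, sizeof rec.data.c);  /* bytes 1..4 of the record */
    return rec;
}
```

With the mapping from the question, the call would be something like load_record((const unsigned char *)addr, 0), where addr is the pointer returned by mmap and the mapped length is a multiple of DISK_RECORD_SIZE rather than of sizeof(RECORD). The int member is still filled in host byte order, so this only addresses padding, not endianness.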
Reading binary data directly into structures is a recipe for disaster. It means you're making assumptions about the layout of some input without verifying them; of course, you could check the structure for integrity afterwards. But more often than not you'll have to make architecture-dependent adjustments to the input data. Think little endian vs. big endian, different word lengths, packing rules, etc.
To make a long story short: don't fall for the dark side and its seductive promise of quick hacks.
The only proper way to read a file is to read it octet by octet; you can read larger chunks into a buffer, of course, but you should then process them by looking at every single bit. If you worry about performance, you should read Volume 1 and what has been released so far of Volume 4 of "The Art of Computer Programming", which explain in depth how to process data streams efficiently without neglecting any data.
Or use Google's Protocol Buffers.
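To make the octet-by-octet idea a bit more concrete, here is a small sketch of decoding a 32-bit integer from individual bytes, so that the result does not depend on the host's endianness or struct padding. The function names read_u32_le and read_u32_be are chosen for this example only, and which byte order a given file uses is something the file format has to define.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode 4 octets stored least-significant byte first (little endian). */
static uint32_t read_u32_le(const unsigned char *p)
{
    assert(p != NULL);
    return  (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Decode 4 octets stored most-significant byte first (big endian). */
static uint32_t read_u32_be(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         |  (uint32_t)p[3];
}
```

Because each octet is shifted into place explicitly, the same source gives the same value on little-endian and big-endian machines, which a plain cast to int * or a struct overlay does not guarantee.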