A few questions about using fwrite/fread with data structures
I'm using fwrite() and fread() for the first time to write some data structures to disk, and I have a couple of questions about best practices and proper ways of doing things.
What I'm writing to disk (so I can later read it back) is all user profiles inserted in a Graph structure. Each graph vertex is of the following type:
    typedef struct sUserProfile {
        char name[NAME_SZ];
        char address[ADDRESS_SZ];
        int socialNumber;
        char password[PASSWORD_SZ];
        HashTable *mailbox;
        short msgCount;
    } UserProfile;
And this is how I'm currently writing all the profiles to disk:
    void ioWriteNetworkState(SocialNetwork *social) {
        Vertex *currPtr = social->usersNetwork->vertices;
        UserProfile *user;
        FILE *fp = fopen("save/profiles.dat", "w");
        if(!fp) {
            perror("fopen");
            exit(EXIT_FAILURE);
        }
        fwrite(&(social->usersCount), sizeof(int), 1, fp);
        while(currPtr) {
            user = (UserProfile*)currPtr->value;
            fwrite(&(user->socialNumber), sizeof(int), 1, fp);
            fwrite(user->name, sizeof(char)*strlen(user->name), 1, fp);
            fwrite(user->address, sizeof(char)*strlen(user->address), 1, fp);
            fwrite(user->password, sizeof(char)*strlen(user->password), 1, fp);
            fwrite(&(user->msgCount), sizeof(short), 1, fp);
            break;
            currPtr = currPtr->next;
        }
        fclose(fp);
    }
Notes:
- The first fwrite() you see writes the total user count in the graph, so I know how much data I need to read back.
- The break is there for testing purposes. There are thousands of users and I'm still experimenting with the code.
My questions:
- After reading this, I decided to use fwrite() on each element instead of writing the whole structure. I also avoid writing the pointer to the mailbox, as I don't need to save that pointer. So, is this the way to go? Multiple fwrite() calls instead of a single one for the whole structure? Isn't that slower?
- How do I read back this content? I know I have to use fread(), but I don't know the size of the strings, because I used strlen() to write them. I could write the output of strlen() before writing the string, but is there any better way without extra writes?
Comments (3)
If your program needs to be at all portable, then you should not be writing ints and shorts to disk as blocks of memory: the data will be corrupted when you try to read them in on a computer with a different word size (e.g. 32-bit -> 64-bit) or a different byte order.
For strings, you can either write the length first, or include a terminator at the end.
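A minimal sketch of the length-first option, assuming a 2-byte length prefix written low byte first (so strings are limited to 65535 bytes). The helper names write_lp_string and read_lp_string are hypothetical, not part of the question's code:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write a string as a 2-byte length (low byte first)
   followed by its bytes, without the '\0'. Returns 0 on success, -1 on failure. */
int write_lp_string(FILE *fp, const char *s)
{
    assert(fp != NULL && s != NULL);
    size_t len = strlen(s);
    if (len > 0xFFFF)
        return -1;                      /* does not fit in the 2-byte prefix */
    if (fputc((int)(len & 0xFF), fp) == EOF ||
        fputc((int)((len >> 8) & 0xFF), fp) == EOF)
        return -1;
    if (len > 0 && fwrite(s, 1, len, fp) != len)
        return -1;
    return 0;
}

/* Hypothetical helper: read a string written by write_lp_string into buf,
   whose capacity cap includes room for the '\0'. Returns 0 or -1. */
int read_lp_string(FILE *fp, char *buf, size_t cap)
{
    assert(fp != NULL && buf != NULL);
    int lo = fgetc(fp);
    int hi = fgetc(fp);
    if (lo == EOF || hi == EOF)
        return -1;
    size_t len = (size_t)lo | ((size_t)hi << 8);
    if (len >= cap)
        return -1;                      /* would overflow the caller's buffer */
    if (len > 0 && fread(buf, 1, len, fp) != len)
        return -1;
    buf[len] = '\0';
    return 0;
}
```

With a binary layout like this, the file should be opened with "wb" and "rb" rather than "w" and "r", so that no newline translation happens on platforms that distinguish text and binary streams.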
The best way is usually to use a text-based format. For example, you could write each record as a separate line, with fields separated by a tab or a colon. (As a bonus, you no longer need to write a count of the number of records at the start of the file: just read in records until you hit end of file.)
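As a rough sketch of the one-record-per-line idea, assuming tab-separated fields that are non-empty and contain no tabs or newlines. ProfileRecord is a simplified, hypothetical stand-in for the question's UserProfile (the buffer sizes are assumptions and the password and mailbox fields are left out for brevity):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-in for UserProfile; field sizes are assumptions. */
typedef struct {
    char  name[64];
    char  address[128];
    int   socialNumber;
    short msgCount;
} ProfileRecord;

/* Write one record as a single tab-separated line. Returns 0 or -1. */
int write_record_text(FILE *fp, const ProfileRecord *r)
{
    assert(fp != NULL && r != NULL);
    int n = fprintf(fp, "%d\t%s\t%s\t%hd\n",
                    r->socialNumber, r->name, r->address, r->msgCount);
    return n < 0 ? -1 : 0;
}

/* Read one record. Returns 1 on success, 0 at end of file,
   -1 if the line does not have the expected four fields. */
int read_record_text(FILE *fp, ProfileRecord *r)
{
    char line[512];
    if (fgets(line, sizeof line, fp) == NULL)
        return 0;
    /* The widths 63 and 127 leave room for the '\0' in each buffer. */
    int n = sscanf(line, "%d\t%63[^\t]\t%127[^\t]\t%hd",
                   &r->socialNumber, r->name, r->address, &r->msgCount);
    return n == 4 ? 1 : -1;
}
```

Reading then becomes a loop that calls read_record_text until it returns 0, with no record count stored in the file.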
Edit: But if this is a class assignment you've been given, you probably don't need to worry about portability.
- Write '\0' terminators from the strings to disk to delimit them. Don't worry about efficiency when reading it back in; the slowest bit is the disk access.
- Or even fwrite() out the entire structure and fread() it all back in. Worried about that pointer? Overwrite it with a safe value when you read it in. Don't worry about the wasted space on disk (unless you've been asked to minimise disk usage).
If you do need to write non-negative ints to disk in a portable binary format, you could do it like this:
So:
If you need to encode negative numbers as well, you will need to reserve a bit for the sign somewhere.
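One possible shape for such a portable encoding of non-negative integers, as a sketch only: it assumes a fixed width of 4 bytes written least significant byte first, and the names write_u32_le and read_u32_le are hypothetical. Because the bytes are produced by shifting rather than by copying memory, the file layout does not depend on the machine's word size or byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write v as 4 bytes, least significant byte first.
   Returns 0 on success, -1 on a write error. */
int write_u32_le(FILE *fp, uint32_t v)
{
    assert(fp != NULL);
    for (int i = 0; i < 4; i++) {
        if (fputc((int)((v >> (8 * i)) & 0xFF), fp) == EOF)
            return -1;
    }
    return 0;
}

/* Hypothetical helper: read 4 bytes written by write_u32_le.
   Returns 0 on success, -1 if the file ends early. */
int read_u32_le(FILE *fp, uint32_t *out)
{
    assert(fp != NULL && out != NULL);
    uint32_t v = 0;
    for (int i = 0; i < 4; i++) {
        int c = fgetc(fp);
        if (c == EOF)
            return -1;
        v |= (uint32_t)c << (8 * i);
    }
    *out = v;
    return 0;
}
```

In the question's code, the user count and socialNumber could go through helpers like these instead of fwrite(&x, sizeof(int), 1, fp), provided the values are known to be non-negative and fit in 32 bits.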
You're right: as you're doing it now, there's no way to read back the content because you can't tell where one string ends and the next begins.
The advice you cite to avoid using fwrite() for structured data is good, but interpreting that advice to mean that you should fwrite() each element individually may not be the best solution.
I think you should consider using a different format for your file, instead of writing raw values with fwrite(). (For example, your files will not be portable to a machine with different byte order.)
Since it looks like most of your elements are strings and integers, have you considered a text-based format using fprintf() to write and fscanf() to read? One big advantage of a text-based format over an application-specific binary format is that you can view it with standard tools (for debugging, etc.).
Also, whatever format you choose, make sure you consider the possibility that you may need to add more fields in the future. At a minimum, that means you should include a version number in some kind of header, either for the file itself or for each individual entry. Even better, tag the individual fields (to allow for optional attributes), for example:
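A hedged sketch of what a versioned, tagged text format might look like. Everything here is an assumption for illustration: the "version=" header line, the key=value tags, the blank line that ends an entry, and the TaggedProfile type with its helper functions are all hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define FORMAT_VERSION 1   /* assumed version number for this sketch */

typedef struct {
    char name[64];
    int  socialNumber;
    int  msgCount;
} TaggedProfile;

/* Write the file header carrying the format version. */
int write_header(FILE *fp)
{
    assert(fp != NULL);
    return fprintf(fp, "version=%d\n", FORMAT_VERSION) < 0 ? -1 : 0;
}

/* Write one entry as key=value lines, ended by a blank line. */
int write_tagged(FILE *fp, const TaggedProfile *p)
{
    assert(fp != NULL && p != NULL);
    int n = fprintf(fp, "name=%s\nsocialNumber=%d\nmsgCount=%d\n\n",
                    p->name, p->socialNumber, p->msgCount);
    return n < 0 ? -1 : 0;
}

/* Read the header; returns the version number, or -1 if it is missing. */
int read_header(FILE *fp)
{
    char line[64];
    int version;
    if (fgets(line, sizeof line, fp) == NULL ||
        sscanf(line, "version=%d", &version) != 1)
        return -1;
    return version;
}

/* Read one entry. Returns 1 if an entry was read, 0 at end of file.
   Missing tags leave the field zeroed; unknown tags are ignored. */
int read_tagged(FILE *fp, TaggedProfile *p)
{
    char line[256];
    int seen = 0;
    memset(p, 0, sizeof *p);
    while (fgets(line, sizeof line, fp) != NULL) {
        line[strcspn(line, "\n")] = '\0';      /* strip the newline */
        if (line[0] == '\0') {
            if (seen)
                return 1;                      /* blank line ends the entry */
            continue;
        }
        char *eq = strchr(line, '=');
        if (eq == NULL)
            continue;                          /* skip malformed lines */
        *eq = '\0';
        const char *val = eq + 1;
        if (strcmp(line, "name") == 0)
            snprintf(p->name, sizeof p->name, "%s", val);
        else if (strcmp(line, "socialNumber") == 0)
            p->socialNumber = atoi(val);
        else if (strcmp(line, "msgCount") == 0)
            p->msgCount = atoi(val);
        seen = 1;
    }
    return seen;
}
```

Because the reader skips tags it does not recognise and leaves absent fields at zero, a field added in a later version does not break older readers, and the version number is available for cases where the meaning of an existing field changes.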
It is slower. Calling a function x times is slower than calling it once, where x > 1. If performance turns out to be a concern, you can use fwrite/fread with sizeof(structure) for regular use and write a portable serialized version for import/export. But check whether it really is a problem first. Most formats don't use binary data anymore, so you can tell that at least fread performance is not their main concern.
No, there isn't. The alternative is doing a fgetc(3)-based strlen.
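For illustration, the fgetc-based alternative could be sketched as follows, assuming each string is written to disk together with its '\0' terminator. The names write_cstring and read_cstring are hypothetical:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the string including its '\0' terminator.
   Returns 0 on success, -1 on a short write. */
int write_cstring(FILE *fp, const char *s)
{
    assert(fp != NULL && s != NULL);
    size_t n = strlen(s) + 1;           /* +1 so the terminator is stored too */
    return fwrite(s, 1, n, fp) == n ? 0 : -1;
}

/* Hypothetical helper: read bytes one at a time until a '\0' is found.
   Returns the string length, or -1 if the file ends before a terminator
   or the string does not fit in buf (capacity cap, terminator included). */
long read_cstring(FILE *fp, char *buf, size_t cap)
{
    assert(fp != NULL && buf != NULL);
    size_t i = 0;
    int c;
    while ((c = fgetc(fp)) != EOF) {
        if (i >= cap)
            return -1;                  /* no room left in the buffer */
        buf[i] = (char)c;
        if (c == '\0')
            return (long)i;             /* terminator found: i is the length */
        i++;
    }
    return -1;                          /* EOF before any terminator */
}
```

stdio buffers the stream, so reading byte by byte with fgetc does not mean one disk access per character.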