Managing changes to a memory-based data format
So I've been using a compact data type in C++, and saving it from memory or loading it from a file just involves copying the raw bytes in and out.
However, the obvious drawback is that adding or removing fields in the data gets messy. There are also problems with versioning: suppose you distribute a program that uses version A of the data, then the next day you make version B, and later on version C.
I suppose this could be solved by using something like XML or JSON. But suppose you can't do that for technical reasons.
What is the best way to handle this, apart from writing a different if case for each version (which would be pretty ugly, I'd imagine)?
Answers (5)
I don't know what your 'technical reasons' are, but if they involve speed or data size, then I would suggest Protocol Buffers as your solution. The format is explicitly designed to handle versioning. It will be slightly slower and slightly larger than simply dumping a struct, but only slightly, and it will be much more portable and handle versioning better.
An idea that comes from 3ds Max (if I remember correctly): divide the file into chunks, where each chunk has a header (a long, maybe) identifying it, plus a length. When reading, if you don't recognize the header, you skip to the next chunk using the length. This process applies recursively inside each chunk and ensures backward compatibility.
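The skipping step can be sketched roughly as follows. This is only an illustration of the idea, not the actual 3ds Max format: the chunk IDs, the `Record` struct, and the function names are all made up, the chunks are flat rather than nested, and the integers are written in host byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical chunk IDs, chosen only for this illustration.
constexpr std::uint32_t kChunkName = 1;
constexpr std::uint32_t kChunkScore = 2;

struct Record {
    std::string name;
    std::int32_t score = 0;
};

// Write one chunk: 4-byte ID, 4-byte payload length, then the payload bytes.
void writeChunk(std::ostream& out, std::uint32_t id, const void* data, std::uint32_t len) {
    assert(data != nullptr || len == 0);
    out.write(reinterpret_cast<const char*>(&id), sizeof id);
    out.write(reinterpret_cast<const char*>(&len), sizeof len);
    out.write(static_cast<const char*>(data), len);
}

// Read chunks until the stream ends, skipping any chunk this reader does not understand.
Record readRecord(std::istream& in) {
    Record r;
    std::uint32_t id = 0, len = 0;
    while (in.read(reinterpret_cast<char*>(&id), sizeof id) &&
           in.read(reinterpret_cast<char*>(&len), sizeof len)) {
        if (id == kChunkName) {
            r.name.resize(len);
            in.read(&r.name[0], len);
        } else if (id == kChunkScore && len == sizeof r.score) {
            in.read(reinterpret_cast<char*>(&r.score), sizeof r.score);
        } else {
            // Unknown ID (or unexpected size): the length tells us how far to jump.
            in.seekg(len, std::ios::cur);
        }
    }
    return r;
}
```

A newer writer can add chunk types freely; an older reader simply jumps over them. Nesting would mean calling the same loop again on a chunk's payload.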
If you go the "column-oriented" way, then you can add fields as you like.
Original struct, the old way:
Adding a field, the old way:
The new and improved way:
Adding a field, the new way:
The gist is that each field is its own file. The added benefit is that a reader only needs to read or write the fields it actually wants, so versioning becomes easier.
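As a rough sketch of the one-file-per-field idea (not the original code from this answer): each column below is written to its own stream, which in practice would be a separate `std::ofstream`/`std::ifstream` per field. The names `writeColumn` and `readColumn` and the fallback behaviour for a missing column are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <type_traits>
#include <vector>

// Write one column (every record's value for a single field) as raw bytes.
template <typename T>
void writeColumn(std::ostream& out, const std::vector<T>& values) {
    static_assert(std::is_trivially_copyable<T>::value, "raw dump needs a trivially copyable type");
    assert(out.good());
    out.write(reinterpret_cast<const char*>(values.data()),
              static_cast<std::streamsize>(values.size() * sizeof(T)));
}

// Read a column of `count` values. If the column is missing or too short
// (for example, the field was added after the data was written), every
// entry is set to `fallback` instead.
template <typename T>
std::vector<T> readColumn(std::istream& in, std::size_t count, T fallback) {
    static_assert(std::is_trivially_copyable<T>::value, "raw load needs a trivially copyable type");
    std::vector<T> values(count, fallback);
    const std::streamsize wanted = static_cast<std::streamsize>(count * sizeof(T));
    in.read(reinterpret_cast<char*>(values.data()), wanted);
    if (in.gcount() != wanted) {
        values.assign(count, fallback);
    }
    return values;
}
```

Adding a field then means adding one more column file; programs that don't know about it never open it, and programs that expect it fall back to a default when it is absent.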
We deal with this at my work. It's not the best, but here are some things you can do:
Add a header to all files, with the first field being "version" and the second field being "length". On load you can now deal with old versions appropriately.
If you can, make the rule "never delete data fields, always add fields at the end of the file". If you do this, then your loading code can load an old, shorter file by just reading the available data into the struct and leaving the trailing fields (the ones that weren't in the file) at their initialized default values. This falls apart when you start having arrays of structs; at that point you need to load the data manually.
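Both rules together might look something like the sketch below. The `Header`, `SettingsV1`, and `Settings` layouts and the `save`/`load` names are invented for illustration, and the code assumes the same compiler, padding, and byte order on both ends.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>
#include <type_traits>
#include <vector>

struct Header {
    std::uint32_t version;
    std::uint32_t length;  // payload size in bytes
};

// Hypothetical version 1 layout.
struct SettingsV1 {
    std::int32_t width;
    std::int32_t height;
};

// Hypothetical version 2 layout: fields are only ever appended.
struct Settings {
    std::int32_t width = 0;
    std::int32_t height = 0;
    std::int32_t depth = 24;  // added in v2; keeps this default when loading a v1 file
};

template <typename T>
void save(std::ostream& out, std::uint32_t version, const T& data) {
    static_assert(std::is_trivially_copyable<T>::value, "raw dump needs a trivially copyable type");
    assert(out.good());
    Header h{version, static_cast<std::uint32_t>(sizeof(T))};
    out.write(reinterpret_cast<const char*>(&h), sizeof h);
    out.write(reinterpret_cast<const char*>(&data), sizeof data);
}

// Load whatever the file has: a shorter (older) payload fills only the leading
// fields, a longer (newer) payload is truncated to the fields this build knows.
bool load(std::istream& in, Settings& out) {
    Header h{};
    if (!in.read(reinterpret_cast<char*>(&h), sizeof h)) return false;
    std::vector<char> payload(h.length);
    if (!in.read(payload.data(), h.length)) return false;
    Settings s;  // starts out with the defaults
    std::memcpy(&s, payload.data(), std::min<std::size_t>(payload.size(), sizeof s));
    out = s;
    return true;
}
```

The `version` field isn't needed for the append-only case here, but it is what lets you branch explicitly if a future change can't follow that rule.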
This is how I would hack at it. It's a "hacky" approach, but it could be extended to be more sophisticated.
To write a file with fixed-size blocks, the algorithm would be something like this -
And to read -
With heterogeneous blocks, you will need to track the offsets as you read.
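One possible reading of the fixed-size-block approach, sketched under my own assumptions (the original pseudocode is not shown here): the file starts with the block size and the block count, and the reader always consumes a whole block but copies only as many bytes as its own struct can hold. `PointV1`, `PointV2`, `writeBlocks`, and `readBlocks` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical old and new record layouts; the new one only appends a field.
struct PointV1 { std::int32_t x, y; };
struct PointV2 { std::int32_t x, y, z; };

// Write: block size, number of blocks, then each block as raw bytes.
template <typename T>
void writeBlocks(std::ostream& out, const std::vector<T>& items) {
    assert(out.good());
    const std::uint32_t blockSize = sizeof(T);
    const std::uint32_t count = static_cast<std::uint32_t>(items.size());
    out.write(reinterpret_cast<const char*>(&blockSize), sizeof blockSize);
    out.write(reinterpret_cast<const char*>(&count), sizeof count);
    for (const T& item : items) {
        out.write(reinterpret_cast<const char*>(&item), sizeof item);
    }
}

// Read: use the block size stored in the file, not sizeof(T), to step through it.
template <typename T>
std::vector<T> readBlocks(std::istream& in) {
    std::uint32_t blockSize = 0, count = 0;
    in.read(reinterpret_cast<char*>(&blockSize), sizeof blockSize);
    in.read(reinterpret_cast<char*>(&count), sizeof count);
    std::vector<T> items;
    if (!in) return items;
    std::vector<char> buffer(blockSize);
    for (std::uint32_t i = 0; i < count; ++i) {
        if (!in.read(buffer.data(), blockSize)) break;
        T item{};  // zero-initialized, so fields missing from the file stay zero
        std::memcpy(&item, buffer.data(), std::min<std::size_t>(blockSize, sizeof item));
        items.push_back(item);
    }
    return items;
}
```

Because the stride comes from the file, an old program can read a file with larger blocks and a new program can read a file with smaller ones. For heterogeneous blocks, each block would need its own stored size so the reader can keep track of offsets.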