What file format do you use for your application, and why?
I'm most interested in in-process (single user) solutions for large amounts of mutating object-oriented data, where any part of the data may change. Such systems generally suffer from these problems:
- Writing large files out from scratch is inefficient
- XML is too verbose
- SQL blobs aren't a good match
So how do you do it?
Comments (8)
O/R mapping, using one of the several out-of-the-box solutions available.
This depends on your requirements. Would you honestly use XML or SQL blobs for high-resolution pictures or audio?
Reading your question again: if you have a bunch of arbitrary objects you want to store in a file image, the way to get them in and out is copying and relocation. The out-copy can get help from the GC. The in-copy is really straightforward and mainly depends on the relocation routine.
If there were a requirement to work with very big files, I'd add some mechanism to that system for marking objects 'dirty', as well as recording where they actually lie in the file image.
There would also be a need to mark removed objects, unless you never remove anything.
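The dirty-marking idea can be sketched roughly as follows. This is only an illustration under assumptions that are not in the answer: the class name `ImageStore`, the append-only layout, and the use of `pickle` for the payload are all hypothetical choices, and the relocation of references between objects is left out entirely.

```python
import os
import pickle


class ImageStore:
    """Append-only file image with per-object dirty flags (hypothetical design)."""

    def __init__(self, path):
        self.path = path
        self.objects = {}      # object id -> live object in memory
        self.location = {}     # object id -> (offset, length) in the file image
        self.dirty = set()     # ids changed since the last flush
        self.removed = set()   # ids deleted since the last flush
        open(path, "ab").close()  # make sure the image file exists

    def put(self, oid, obj):
        """Add or change an object and mark it dirty."""
        self.objects[oid] = obj
        self.dirty.add(oid)
        self.removed.discard(oid)

    def remove(self, oid):
        """Mark an object as removed; its old bytes stay in the file as garbage."""
        self.objects.pop(oid, None)
        self.dirty.discard(oid)
        self.removed.add(oid)

    def flush(self):
        """Append only the dirty objects; return the number of bytes written."""
        offset = os.path.getsize(self.path)
        written = 0
        with open(self.path, "ab") as f:
            for oid in list(self.dirty):
                payload = pickle.dumps(self.objects[oid])
                f.write(payload)
                self.location[oid] = (offset, len(payload))
                offset += len(payload)
                written += len(payload)
        for oid in self.removed:
            self.location.pop(oid, None)
        self.dirty.clear()
        self.removed.clear()
        return written

    def load(self, oid):
        """Read one object back from its recorded position in the image."""
        offset, length = self.location[oid]
        with open(self.path, "rb") as f:
            f.seek(offset)
            return pickle.loads(f.read(length))
```

In this sketch a flush after a small change writes only the changed object, at the cost of leaving stale bytes behind. A real system would also need to persist the location index and compact the file occasionally.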
We mostly use binary data, unless it has to be human-readable (like settings and user preferences).
If you think XML is too verbose, have a look at JSON. I think it is a very good alternative.
"Writing large files out from scratch is inefficient" What? Few things are as fast as file I/O. Please provide some example or data to back up your assertion that file I/O is inefficient.
Most OO systems can serialize or pickle an object to a file. This is about the fastest I/O possible.
Also, most OO systems can convert objects to standard representations like XML or JSON or YAML.
JSON and YAML are less verbose and much easier to parse than XML.
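As a small illustration of the two routes mentioned here, the sketch below serializes the same object with Python's `pickle` and `json` modules. The `record` data and the helper names are made up for the example, and it says nothing about relative speed, which would need measuring on real data.

```python
import json
import pickle

# Hypothetical sample object; any mix of dicts, lists, strings and numbers works.
record = {"name": "widget", "tags": ["a", "b"], "size": 3}


def to_pickle(obj) -> bytes:
    """Binary serialization; Python-specific and not human-readable."""
    return pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)


def to_json(obj) -> bytes:
    """Text serialization in a standard format, written compactly."""
    return json.dumps(obj, separators=(",", ":")).encode("utf-8")


# Both representations round-trip this particular object unchanged.
restored_from_pickle = pickle.loads(to_pickle(record))
restored_from_json = json.loads(to_json(record))
```

Note that JSON only round-trips plain data types, while pickle handles most Python objects but should not be used on untrusted input.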
I use YAML for small-to-medium files; it is very easy to parse and save. JSON is a worthy alternative.
You could try serializing to XAML, rather than XML.
This can create smaller files and is much faster to read and write (serialize/deserialize).
Obviously, this depends on XAML being an option.
You need O/R mapping or an object database like db4o.
If it's a matter of a collection of relatively standalone objects, it's also possible to store each one in its own file and only write when the object is dirty. But obviously in more complex cases it can be a lot of work to keep the references straight and to avoid unintuitive directory structures, and this is really what the O/R mappers and object databases bring to the table.
As for XML being too verbose, that can often be solved with compression (e.g. XML in a zip file).
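To make the compression point concrete, here is a minimal sketch that stores an XML document inside a zip archive using Python's standard `zipfile` module. The document contents, the entry name `data.xml`, and the function names are assumptions for illustration only.

```python
import io
import xml.etree.ElementTree as ET
import zipfile


def build_sample_xml(count: int) -> bytes:
    """Build a repetitive XML document (hypothetical data) as UTF-8 bytes."""
    root = ET.Element("objects")
    for i in range(count):
        item = ET.SubElement(root, "object", id=str(i))
        ET.SubElement(item, "name").text = f"item-{i}"
        ET.SubElement(item, "value").text = str(i * 2)
    return ET.tostring(root, encoding="utf-8")


def zip_xml(xml_bytes: bytes, name: str = "data.xml") -> bytes:
    """Store the XML as a single deflate-compressed entry in an in-memory zip."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, xml_bytes)
    return buf.getvalue()


def unzip_xml(blob: bytes, name: str = "data.xml") -> bytes:
    """Read the XML entry back out of the zip archive."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        return zf.read(name)


raw = build_sample_xml(1000)
packed = zip_xml(raw)
print(len(raw), len(packed))  # sizes in bytes before and after compression
```

Repetitive tag names compress well, so the archive is usually much smaller than the raw XML. The trade-off is that any change still means rewriting the whole compressed entry.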
For large datasets I use structured binary files; nothing is more space and time efficient.
For structured text data I would use s-expressions (i.e. LAML) or, to cut down on the parentheses, LAML implemented with i-expressions.
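For the structured binary option, a fixed-size record layout might look something like the sketch below, written with Python's `struct` module. The `PTS1` magic value, the record fields, and the function names are invented for the example and are not taken from this answer.

```python
import struct

# Hypothetical layout: little-endian, no padding.
HEADER = struct.Struct("<4sI")  # 4-byte magic, unsigned record count
RECORD = struct.Struct("<Idd")  # unsigned id, two doubles (x, y)
MAGIC = b"PTS1"


def pack_points(points) -> bytes:
    """Serialize (id, x, y) tuples into a header followed by fixed-size records."""
    parts = [HEADER.pack(MAGIC, len(points))]
    parts += [RECORD.pack(pid, x, y) for pid, x, y in points]
    return b"".join(parts)


def unpack_points(blob) -> list:
    """Read every record back as an (id, x, y) tuple."""
    magic, count = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise ValueError("unexpected file magic")
    return [
        RECORD.unpack_from(blob, HEADER.size + i * RECORD.size)
        for i in range(count)
    ]


def update_point(blob: bytearray, index: int, pid: int, x: float, y: float) -> None:
    """Overwrite one record in place; its offset follows from the fixed size."""
    RECORD.pack_into(blob, HEADER.size + index * RECORD.size, pid, x, y)
```

Because every record has the same size, a single record can be located and rewritten without touching the rest of the file, which is one way to address the mutating-data concern in the question. Variable-length data would need an index or a different layout.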