Data size and disk access

Posted on 2024-12-07 06:30:55

Is there a benefit to aligning data to a certain size on storage? For example, if I have the option to use one byte to store information or 4 bytes, which is preferred (assuming that storage size doesn't matter, only optimization)?

I ask this question mostly because I know that it "matters" if you're talking about in-memory values (hence the reason why a .NET boolean is 4 bytes, for example, as per another question on this site).

I don't think it would matter, but I am using the .NET framework (C# specifically).


Comments (1)

善良天后 2024-12-14 06:30:55

If you need to be able to get to any particular record in a file, you'll either need some sort of index or a fixed record size - but that's for the whole record rather than each individual part of the record. I wouldn't usually go to any great lengths to align data on 4- or 8-byte boundaries within storage. Of course, if you read a record at a time, to an aligned location in memory, then you end up with aligned data to perform any conversions on... so it can all end up being intertwined to some extent - but the conversion is likely to be a one-off, rather than something you access frequently afterwards.
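To make the fixed-record-size point concrete, here is a minimal sketch in Python (chosen for brevity; the same idea applies to a C# BinaryReader/BinaryWriter). The record layout, the `RECORD` name and the helper functions are all hypothetical, invented for illustration. The fields are deliberately packed without padding: what makes random access possible is the fixed size of the whole record, not the alignment of each field.

```python
import io
import struct

# Hypothetical record layout (an assumption for illustration):
# 1-byte flag, 4-byte unsigned id, 8-byte float.
# "<" selects little-endian with no padding, so fields are NOT aligned.
RECORD = struct.Struct("<BId")


def write_records(stream, records):
    """Append each (flag, id, value) tuple as one fixed-size record."""
    for rec in records:
        stream.write(RECORD.pack(*rec))


def read_record(stream, index):
    """Seek straight to record `index` using only the fixed record size."""
    if index < 0:
        raise IndexError("record index must be non-negative")
    stream.seek(index * RECORD.size)
    data = stream.read(RECORD.size)
    if len(data) != RECORD.size:
        raise IndexError("record index out of range")
    return RECORD.unpack(data)


# An in-memory stream stands in for a file opened in binary mode.
buf = io.BytesIO()
write_records(buf, [(1, 100, 1.5), (0, 200, 2.5), (1, 300, 3.5)])
print(RECORD.size, read_record(buf, 2))
```

Once a record has been unpacked, its fields are ordinary in-memory values, so the unaligned on-disk layout costs one conversion per read rather than affecting every later access.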

Storage size matters for optimization of course - because reading less data from the disk will be cheaper than reading more (generally...).
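As a rough illustration of that size trade-off, the sketch below compares a packed layout with one that pads a 1-byte flag out to a 4-byte boundary. The two layouts and the record count are assumptions made up for the example, not anything from the question.

```python
import struct

# Hypothetical layouts: a 1-byte flag followed by a 4-byte unsigned integer.
packed = struct.Struct("<BI")    # no padding
padded = struct.Struct("<B3xI")  # three explicit pad bytes align the integer

COUNT = 1_000_000  # assumed number of records, for illustration only

packed_total = packed.size * COUNT
padded_total = padded.size * COUNT

print(f"packed: {packed.size} bytes/record, {packed_total} bytes total")
print(f"padded: {padded.size} bytes/record, {padded_total} bytes total")

# Both layouts round-trip the same values; only the bytes on disk differ.
assert packed.unpack(packed.pack(1, 42)) == padded.unpack(padded.pack(1, 42))
```

Whether the extra bytes are actually slower to read depends on the storage device and access pattern, so this shows only the size difference, not a timing result.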

Unless you have specific requirements like fixed record sizes, I would just try to design the storage so that it's as easy to use as possible. If you have specific areas of concern for performance, you should profile those. For example, it may be more efficient to use UTF-16 to encode strings than UTF-8, as the encoding and decoding should require less work... even though it'll take more space. You should test these rather than making any assumptions though. Note that where you're loading the storage format from will make a big difference - over the network, from a mechanical disk, from a solid state drive... these will have different performance characteristics, which probably make it hard to design something which is fastest for all cases.
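In the spirit of "test rather than assume", a small measurement harness might look like the following sketch. The `measure` helper and the sample text are hypothetical, and the numbers depend on the runtime: Python's string internals differ from .NET's, so this shows only the shape of such a test and says nothing about which encoding is faster in C#.

```python
import timeit


def measure(text, encoding, repeats=200):
    """Return (encoded size in bytes, seconds for `repeats` round trips)."""
    encoded = text.encode(encoding)
    seconds = timeit.timeit(
        lambda: text.encode(encoding).decode(encoding),
        number=repeats,
    )
    return len(encoded), seconds


# Assumed sample data; real measurements should use representative input.
sample = "record name, 記録, données " * 1000

for encoding in ("utf-8", "utf-16-le"):
    size, seconds = measure(sample, encoding)
    print(f"{encoding}: {size} bytes, {seconds:.4f} s")
```

Running the same comparison against the real storage medium (network, mechanical disk, solid state drive) matters just as much as the encoding cost itself.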
