Array serialization performance problem

Posted on 2024-09-25 08:45:15

In my Windows Mobile (.NET Compact Framework) application I use a big array to store the application's primary data. This is a dataset of potentially hundreds of objects. Each object has about 10 properties and two arrays of its own, each holding about 25 other objects with about 5 properties apiece.

To save this array on the mobile device, I simply serialize the entire array. This works well most of the time and it's very, very easy.

However, in our test cases we only ever worked with a handful of objects, about 50 to 75 at most. Our client has had cases where users had several hundred of these objects, up to 1000. In those cases the serialization is quite slow; it can take up to a minute.

The actual problem is that every save writes the entire array, even though usually only a couple of the objects have actually changed. The basic flow is something like this:

  • Load the entire array from storage, say 400 objects;
  • Change a few properties of 1 object;
  • Save the entire array back to storage, the full 400 objects;
  • Change a couple more properties of that same object;
  • Save again;
  • Change the final properties;
  • Save again;
  • Same with any subsequent objects...

This normally wouldn't be a problem if saving happened only occasionally, but the data is saved at several intermediate steps. This is to make sure all data is persisted and none is lost (for example, when the battery dies).
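
The flow above can be sketched roughly as follows. This is only an illustration in Python, not the application's code: `pickle` stands in for whatever serializer the app actually uses (the format isn't stated), and `make_dataset` and `save_all` are hypothetical helpers that mimic the object shape described in the question.

```python
import pickle


def make_dataset(count):
    """Build `count` records shaped roughly like the ones described above."""
    def child(i):
        # A child object with about 5 properties.
        return {"p%d" % k: i * k for k in range(5)}

    return [
        {
            "id": i,
            "props": {"p%d" % k: "value" for k in range(10)},  # ~10 properties
            "children_a": [child(j) for j in range(25)],        # first sub-array
            "children_b": [child(j) for j in range(25)],        # second sub-array
        }
        for i in range(count)
    ]


def save_all(dataset):
    """Serialize the whole array and return how many bytes would be written."""
    return len(pickle.dumps(dataset))


dataset = make_dataset(400)
bytes_written = 0
for step in range(3):                       # three intermediate saves
    dataset[0]["props"]["p0"] = "edit %d" % step
    bytes_written += save_all(dataset)      # all 400 objects, every time

# Size of the single object that actually changed, for comparison.
one_object = len(pickle.dumps(dataset[0]))
print(bytes_written, one_object)
```

The point of the comparison is that the amount written grows with the size of the whole array, not with the size of the change.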

How can I solve this?

Comments (1)

过度放纵 2024-10-02 08:45:15

So to be clear, let me make sure I understand your scenario:

  • What you have is some form of serialized array (you've not stated the format as XML, binary or other) as your data store?
  • And if one property changes, you rewrite the entire array, even if there are 1000 objects with sub-objects?
  • And you're writing to Flash, not just RAM?
  • And for a full "save" you're doing the write operation a few times?
  • And for some reason, you're finding that this is slow and it gets slower the larger the data set gets?

The answer is actually fairly simple. This is completely expected behavior based on how you're doing this. Why would you use this mechanism for a data store, especially for large, frequently changing items? This is a classic example of a poor design decision. When a property changes, you should be changing just that property in the store and serialized arrays simply are not well suited to this.
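
As a minimal sketch of "change just that property in the store": the example below uses Python's built-in `sqlite3` module purely as a stand-in for whatever embedded database engine is available on the device. The `items` table, its columns, and the `set_property` helper are assumptions made for illustration, not anything from the original application.

```python
import sqlite3

# Columns that callers are allowed to update (hypothetical schema).
COLUMNS = ("name", "status")


def open_store(path=":memory:"):
    """Open the store and create the table on first use."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, status TEXT NOT NULL)"
    )
    return conn


def set_property(conn, item_id, column, value):
    """Persist one property of one object; every other row is left alone."""
    if column not in COLUMNS:
        raise ValueError("unknown column: %r" % column)
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE items SET %s = ? WHERE id = ?" % column, (value, item_id)
        )
    return cur.rowcount


conn = open_store()
with conn:
    conn.executemany(
        "INSERT INTO items (id, name, status) VALUES (?, ?, ?)",
        [(i, "item %d" % i, "new") for i in range(400)],
    )

# Only the row for object 7 is rewritten, not all 400.
changed = set_property(conn, 7, "status", "edited")
```

Each intermediate save then costs one small committed update, so the data is still durable after every step without rewriting the whole data set.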

You should be using an actual database engine, whether it's an RDBMS or an object database, but something that's doing way, way less writing to the storage medium. If you need the data as an array for transfer to the PC/Server, that's fine - create a mechanism to extract from the store and put it into an array for that purpose.
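
A rough sketch of that extraction step, again with Python's `sqlite3` standing in for the real engine. The `items` schema, the `export_as_array` helper, and the choice of JSON as the transfer format are all assumptions for the sake of the example.

```python
import json
import sqlite3


def export_as_array(conn):
    """Read every stored object back into a plain list for transfer."""
    rows = conn.execute(
        "SELECT id, name, status FROM items ORDER BY id"
    ).fetchall()
    return [dict(row) for row in rows]


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can be converted to dicts by column name
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, status TEXT)"
)
with conn:
    conn.executemany(
        "INSERT INTO items (id, name, status) VALUES (?, ?, ?)",
        [(1, "first", "new"), (2, "second", "edited"), (3, "third", "new")],
    )

records = export_as_array(conn)   # the array form, built only when needed
payload = json.dumps(records)     # what would be sent to the PC/server
```

The array only exists at transfer time here; day-to-day saves never have to touch it.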
