How big is the performance penalty (if any) of the AsReference qualifier?

Posted 2024-11-26 11:45:21


I have to decide whether I want to shave an extra 5K off a 550K total by qualifying a property with AsReference. After all, 5K is only a fraction of the total - less than 1%. Still, if the performance penalty is minuscule - why not?

Thanks.

Clarification

Using AsReference really reduces the size if there are actually shared references. My question is about the performance or bluntly put - the speed.


Comments (1)

执妄 2024-12-03 11:45:21


It will depend on the model obviously, and serialization and deserialization will be different here. For moderately sized models the performance overhead will be minimal, except of course it will typically have less actual serialization to do (assuming there is a reasonable amount of repeated object instances marked AsReference; if there are none at all then the overhead, though minimal, is wasted). And if the reference means we avoid re-serializing a large branch of data (maybe a sub-collection etc) then we can get some very nice savings for both CPU and bandwidth.

Any cost here is felt purely by serialization, since the problematic part is checking whether we have seen the object before. During deserialization it is just plucking items from a list by index, so very fast.

Also note that I'm assuming DynamicType is disabled here, as that is a separate concern (again, impact is minimised).

Re storage; currently a flat list is maintained, and checked for referential equality. I would like to have used a hashtable/dictionary lookup, but I have concerns about types that override GetHashCode()/Equals, and sadly it is not possible to access the original object.GetHashCode() instance method. This means that for very large numbers of members marked AsReference (and here I mean many many thousands of objects in the graph) it may slowly degrade (the lookup will be O(N) for a growing list of length N). Changing this to a hash lookup would make it O(1) at each lookup.
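The flat-list-versus-hash trade-off described above is language-agnostic, so here is a hedged sketch in Python (not protobuf-net's actual code): a linear scan using reference equality, contrasted with an identity-keyed dictionary that sidesteps any user-overridden equality, much as a hash keyed on object identity would in .NET:

```python
# Illustrative sketch only -- not protobuf-net's implementation.
# A flat list checked by reference equality costs O(N) per lookup;
# an identity-keyed dict is O(1) on average.

def find_ref_linear(seen, obj):
    """Scan a flat list for the same instance (reference equality).

    Returns the existing index, or records the object and returns None.
    """
    for index, candidate in enumerate(seen):
        if candidate is obj:        # identity check, like ReferenceEquals
            return index
    seen.append(obj)
    return None

def find_ref_hashed(seen, obj):
    """Look up by identity key, ignoring any overridden __eq__/__hash__."""
    key = id(obj)                   # identity key, like an identity hash
    if key in seen:
        return seen[key]
    seen[key] = len(seen)
    return None

a, b = object(), object()
linear, hashed = [], {}
assert find_ref_linear(linear, a) is None   # first sighting: recorded
assert find_ref_linear(linear, a) == 0      # second sighting: found at index 0
assert find_ref_hashed(hashed, a) is None
assert find_ref_hashed(hashed, b) is None
assert find_ref_hashed(hashed, a) == 0      # repeat resolves to its slot
```

Keying on object identity rather than the type's own equality is what makes the hash approach safe even when GetHashCode()/Equals are overridden, which is exactly the concern raised above.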

Thinking aloud, we could possibly do something when we can prove that the type doesn't override (although this involves more reflection, which is itself a pain), or we could just trust the user not to make a mess of GetHashCode() etc - and use their definition of equality to mean equality in the graph. I'm open to thoughts here, but currently referential equality is used as the safest and simplest option.

For actual numbers: it depends a lot on your model and size; since you have a handy model and know the size with/without AsReference, you are presumably in a good position to wrap that in a Stopwatch or similar (preferably serializing to something like MemoryStream so you aren't including disk/IO costs in the timing).
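The measurement loop suggested above is simple to set up. The original context is .NET (Stopwatch plus MemoryStream); as a hedged analogue, this Python sketch uses pickle and io.BytesIO as stand-ins (pickle, like AsReference, emits a shared object only once and references it thereafter):

```python
import io
import pickle
import time

def time_serialize(obj, rounds=20):
    """Average serialization time, writing to an in-memory buffer
    so disk/IO costs don't skew the timing."""
    start = time.perf_counter()
    for _ in range(rounds):
        buf = io.BytesIO()          # analogue of MemoryStream
        pickle.dump(obj, buf)
    return (time.perf_counter() - start) / rounds

shared = list(range(1000))
# Graph with 50 references to the same sub-collection vs. 50 distinct copies.
with_sharing = [shared] * 50
without_sharing = [list(range(1000)) for _ in range(50)]

t_shared = time_serialize(with_sharing)
t_copies = time_serialize(without_sharing)
print(f"shared refs: {t_shared * 1e3:.3f} ms/round, "
      f"copies: {t_copies * 1e3:.3f} ms/round")
```

Comparing the two timings (and the two payload sizes) for your real model is the most reliable way to decide whether the roughly 1% size saving justifies the lookup cost.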
