Does having a large number of properties in an entity affect datastore read/write performance?
I have a couple of entities with 40 to 50 properties each. None of these properties are indexed; only the key is. These entities are part of a larger entity group tree structure and are always retrieved by key. I am using Objectify to work with entities on BigTable.
I want to know whether there is any performance impact in reading or writing an entity with a large number of properties from/to BigTable.
Since these large entities are only fetched by their keys and never participate in any query, I was wondering whether I should serialize the entity POJO and store it as a blob. It is pretty straightforward to do this in Objectify using the @Serialized annotation. I understand that by serializing my entity and storing it as a blob, I render the blob totally opaque to any other program or non-Java code, but this is not a concern.
I have yet to benchmark the performance difference, but before doing so, I want to know if anybody has done this before or has any advice/opinion to share.
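For readers who want to see what "store as a blob" amounts to, here is a minimal sketch using plain `java.io` serialization. It does not use Objectify or the datastore at all; as far as I understand, `@Serialized` relies on standard Java serialization for the annotated field, but treat that as an assumption. `Profile` and `BlobSketch` are hypothetical names, and the three fields stand in for the 40 to 50 real properties.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for one of the wide entities from the question.
class Profile implements Serializable {
    private static final long serialVersionUID = 1L;

    String name;
    int age;
    String city;
    // Imagine roughly 40 more fields here.

    Profile(String name, int age, String city) {
        this.name = name;
        this.age = age;
        this.city = city;
    }
}

class BlobSketch {
    // Serialize any Serializable object into a byte array (the "blob").
    static byte[] toBlob(Serializable value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    // Rebuild the object from the blob; the caller casts to the expected type.
    static Object fromBlob(byte[] blob) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(blob))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Profile original = new Profile("Ada", 36, "London");
        byte[] blob = toBlob(original);
        Profile copy = (Profile) fromBlob(blob);
        System.out.println(blob.length + " bytes, name=" + copy.name);
    }
}
```

Timing `toBlob` and `fromBlob` on a realistic object, and comparing `blob.length` with the size of the plain properties, is one rough way to start the benchmark mentioned above.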
Comments (2)
There is always some overhead that grows with the number of properties, and serializing won't help much, as it just moves the processing from one point to another.
I have entities with up to 25 properties, and I fetch them by key on almost every request. The performance difference is negligible for me, hardly +/- 1 ms. Performance problems normally occur in queries. The number of unindexed properties won't count for much in performance, whereas indexed properties can significantly delay a put because the indexes have to be updated.
If you must, you can split the properties across multiple tables (entity kinds) if you are not going to need them all at once.
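The split suggested in this answer could look roughly like the sketch below. It assumes a hypothetical wide user entity divided into frequently read and rarely read halves that share one id; the in-memory maps only stand in for two entity kinds and are not a real datastore or Objectify API. All class and field names are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical "hot" half: properties needed on almost every request.
class UserCore {
    final String name;
    final String email;

    UserCore(String name, String email) {
        this.name = name;
        this.email = email;
    }
}

// Hypothetical "cold" half: properties needed only occasionally.
class UserDetails {
    final String biography;
    final String preferences;

    UserDetails(String biography, String preferences) {
        this.biography = biography;
        this.preferences = preferences;
    }
}

class SplitStoreSketch {
    // In-memory maps stand in for two entity kinds keyed by the same id.
    private final Map<Long, UserCore> coreKind = new HashMap<>();
    private final Map<Long, UserDetails> detailsKind = new HashMap<>();

    // Write both halves under the same id.
    void put(long id, UserCore core, UserDetails details) {
        coreKind.put(id, core);
        detailsKind.put(id, details);
    }

    // Common path: load only the small, frequently used half.
    UserCore getCore(long id) {
        return coreKind.get(id);
    }

    // Rare path: load the remaining properties only when needed.
    UserDetails getDetails(long id) {
        return detailsKind.get(id);
    }
}
```

The trade-off is an extra get whenever both halves are needed, so this only pays off if most requests touch the small half alone.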
Going purely by what little I know of how it works, I'd say having a bunch of unindexed properties wouldn't be any different from having the whole thing serialized.