Caching best practices: monolithic cached data vs. fine-grained cached data

Posted 2024-11-02 23:02:14


In a distributed caching scenario, is it generally advised to use or avoid monolithic objects stored in cache?

I'm working with a service backed by an EAV schema, so we're putting caching in place to minimize the perceived performance deficit imposed by EAV when retrieving all primary records and respective attribute collections from the database. We will prime the cache on service startup.

We don't have particularly frequent calls for all products -- clients call for differentials after they first populate their local cache with the object map. To perform that differential, the distributed cache will need to reflect changes made to individual records in the database on an arbitrary basis, and surface those changes when clients ask for a differential.
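To make that contract concrete, here is a minimal sketch of a differential computed over the cached collection. It assumes each record carries a last-modified timestamp; the ProductRecord shape and the LastModifiedUtc field are hypothetical illustrations, not our actual EAV-backed schema.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record shape; assumes a LastModifiedUtc stamp exists so
// differentials can be computed (the real EAV-backed shape differs).
[Serializable] // AppFabric's default serializer requires serializable types
public sealed class ProductRecord
{
    public string Id { get; set; }
    public DateTime LastModifiedUtc { get; set; }
}

public static class Differential
{
    // Given the full cached collection, return only the records changed
    // since the client's last sync point.
    public static List<ProductRecord> Since(
        Dictionary<string, ProductRecord> all, DateTime sinceUtc)
    {
        return all.Values
                  .Where(r => r.LastModifiedUtc > sinceUtc)
                  .ToList();
    }
}
```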

First thought was to use a List or Dictionary to store the records in the distributed cache -- get the whole collection, manipulate or search it in-memory locally, put the whole collection back into the cache. Later thinking, however, led to the idea of populating the cache with individual records, each keyed in a way that makes them individually retrievable from, and updatable to, the cache. This led to wondering which method would be more performant when it comes to updating all data.
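For illustration, here is a rough sketch of the two layouts against the AppFabric DataCache client, reusing the hypothetical ProductRecord above. The "products:all" and "product:{id}" key names are invented for this sketch, not anything prescribed by AppFabric.

```csharp
using System.Collections.Generic;
using Microsoft.ApplicationServer.Caching; // AppFabric cache client

public static class CacheLayouts
{
    // Layout 1: monolithic -- a single key holds the whole collection.
    public static void PutMonolith(DataCache cache,
        Dictionary<string, ProductRecord> all)
    {
        cache.Put("products:all", all); // one big serialized payload
    }

    public static Dictionary<string, ProductRecord> GetMonolith(DataCache cache)
    {
        return (Dictionary<string, ProductRecord>)cache.Get("products:all");
    }

    // Layout 2: granular -- one key per record, individually updatable.
    public static void PutRecord(DataCache cache, ProductRecord record)
    {
        cache.Put("product:" + record.Id, record);
    }

    public static ProductRecord GetRecord(DataCache cache, string id)
    {
        return (ProductRecord)cache.Get("product:" + id);
    }
}
```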

We're using Windows Server AppFabric, so we have a BulkGet operation available to us. I don't believe there's any notion of a bulk update, however.
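A sketch of what that asymmetry looks like in practice: BulkGet batches reads into one round trip, while updates fall back to one Put per key. This again reuses the hypothetical types and key scheme from above.

```csharp
using System.Collections.Generic;
using Microsoft.ApplicationServer.Caching;

public static class BulkOps
{
    // Reads: BulkGet fetches many keys in a single network round trip.
    public static Dictionary<string, ProductRecord> GetMany(
        DataCache cache, IEnumerable<string> ids)
    {
        var keys = new List<string>();
        foreach (var id in ids)
            keys.Add("product:" + id);

        var result = new Dictionary<string, ProductRecord>();
        foreach (KeyValuePair<string, object> kv in cache.BulkGet(keys))
        {
            if (kv.Value != null) // missing keys come back as null values
                result[kv.Key] = (ProductRecord)kv.Value;
        }
        return result;
    }

    // Writes: with no bulk Put in the API, a "bulk update" degrades to
    // one round trip per record -- the latency the answer below calls out.
    public static void PutMany(DataCache cache, IEnumerable<ProductRecord> records)
    {
        foreach (var record in records)
            cache.Put("product:" + record.Id, record);
    }
}
```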

Is there prevailing thinking as to distributed cache object size? If we had more requests for all items, I would have concerns about network bandwidth, but, for now at least, demand for all items should be fairly minimal.

And yes, we're going to test and profile each method, but I'm wondering if there's anything outside the current scope of thinking to consider here.

Comments (1)

夏日落 2024-11-09 23:02:14


So in our scenario, it appears that monolithic cache objects are going to be preferred. With big fat pipes in the datacenter, it takes virtually no perceptible time for ~30 MB of serialized product data to cross the wire. Using a Dictionary<TKey, TValue>, we are able to quickly find products in the collection to return or update an individual item.

With thousands of individual entities in the cache, each well under 1 MB, bulk operations simply take too long -- too much overhead and latency in the network operations.

Edit: we're now considering maintaining both the individual entities and the monolithic collection of entities, because with the monolith alone, retrieving an individual entity appears to become a fairly expensive process against a production dataset.
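A rough sketch of what that hybrid write path might look like, using the hypothetical types and key names from the question above. Note that the read-modify-write on the monolith is not atomic; in production you'd want to guard it with AppFabric's optimistic concurrency (GetCacheItem and the version-taking Put overload).

```csharp
using System.Collections.Generic;
using Microsoft.ApplicationServer.Caching;

public static class HybridWriter
{
    // Hybrid layout: write the individual entity for cheap point reads,
    // then refresh it inside the monolithic collection so full fetches
    // and differentials stay consistent.
    public static void UpdateRecord(DataCache cache, ProductRecord record)
    {
        cache.Put("product:" + record.Id, record);

        // Not atomic: a concurrent writer could interleave here. AppFabric's
        // item versions would guard this read-modify-write in production.
        var all = (Dictionary<string, ProductRecord>)cache.Get("products:all");
        if (all != null)
        {
            all[record.Id] = record;
            cache.Put("products:all", all);
        }
    }
}
```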
