Partitioning in-memory data across service instances

Posted 2025-01-15 16:07:51


Recently, in a system design interview, I was asked a question in which cities were divided into zones, with data available for around 100 zones. An API took the zone ID as input and returned all the restaurants for that zone in the response. The required response time for the API was 50 ms, so the zone data was kept in memory to avoid delays.

If the zone data is approximately 25 GB, then scaling the service to, say, 5 instances would need 125 GB of RAM.

Now the requirement is to run 5 instances but use only 25 GB of RAM in total, with the data split across the instances.

I believe that to achieve this we would need a second application acting as a config manager, tracking which instance holds which zone data. Each instance could fetch the zones it should track from the config manager on startup. What I am not able to figure out is how we redirect a request for a zone to the correct instance holding its data, especially if we use Kubernetes. Also, if an instance holding part of the data restarts, how do we track which zone data it was holding?
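To make the routing question concrete, here is a minimal sketch of one possible scheme, assuming a fixed count of 5 instances with hypothetical stable addresses `zone-api-0.zone-api` … `zone-api-4.zone-api` (e.g. a headless Service in front of a StatefulSet, or addresses fetched from the config-manager service described above): each instance hashes the zone ID to find the owner and either answers from local memory or proxies to the owning peer.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Map;

// Sketch only: deterministic ownership by hashing the zone ID, with a proxy
// hop to the owning instance when this instance does not hold the zone.
public class ZoneRouter {
    private static final int INSTANCE_COUNT = 5;      // assumed fixed for this sketch
    private final int myIndex;                        // this instance's ordinal, e.g. parsed from the pod name
    private final Map<String, String> localZones;     // zone ID -> serialized restaurant list held in memory
    private final HttpClient http = HttpClient.newHttpClient();

    public ZoneRouter(int myIndex, Map<String, String> localZones) {
        this.myIndex = myIndex;
        this.localZones = localZones;
    }

    /** Same zone ID always maps to the same instance index. */
    static int ownerOf(String zoneId) {
        return Math.floorMod(zoneId.hashCode(), INSTANCE_COUNT);
    }

    /** Serve from local memory if we own the zone, otherwise forward to the owner. */
    public String restaurantsFor(String zoneId) throws Exception {
        int owner = ownerOf(zoneId);
        if (owner == myIndex) {
            return localZones.get(zoneId);             // in-memory lookup, no network hop
        }
        URI peer = URI.create("http://zone-api-" + owner + ".zone-api:8080/zones/" + zoneId);
        HttpRequest request = HttpRequest.newBuilder(peer).GET().build();
        return http.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```

The extra hop for non-owned zones costs latency, so an alternative is to push the same hash into a gateway or smart client so the first request already lands on the owning instance; the answers below discuss off-the-shelf ways to get this behaviour.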


Comments (2)

倚栏听风 2025-01-22 16:07:51


Splitting a dataset over several nodes: that sounds like sharding.

In-memory: the interviewer might be asking about Redis or something similar.

Maybe this: https://redis.io/topics/partitioning#different-implementations-of-partitioning

Redis Cluster might fit -- keep in mind that when the docs mention "client-side partitioning", the client is a Redis client library loaded by your backends, which in turn respond to HTTP client/end-user requests.
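To make "client-side partitioning" concrete, here is a minimal sketch of the idea, not of Redis Cluster's actual algorithm (which maps keys to 16384 hash slots via CRC16): the backend hashes the zone ID to pick which Redis node owns it, so the 25 GB is split across nodes rather than copied onto each. The node addresses and the CRC32 choice are illustrative assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.zip.CRC32;

// Client-side partitioning sketch: the backend's "client" decides which node
// owns a key before issuing the query to that node.
public class ClientSidePartitioner {
    private final List<String> nodeAddresses; // e.g. ["redis-0:6379", ..., "redis-4:6379"] (hypothetical)

    public ClientSidePartitioner(List<String> nodeAddresses) {
        this.nodeAddresses = nodeAddresses;
    }

    /** Every caller maps the same zone ID to the same node, so data is split, not duplicated. */
    public String nodeFor(String zoneId) {
        CRC32 crc = new CRC32();
        crc.update(zoneId.getBytes(StandardCharsets.UTF_8));
        return nodeAddresses.get((int) (crc.getValue() % nodeAddresses.size()));
    }
}
```

The actual GET for the zone's restaurants would then be issued against `nodeFor(zoneId)` through whichever Redis client library the backend uses; with Redis Cluster proper, a cluster-aware client library does this slot mapping for you.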


Answering your comment: in that case, I'm not sure what they were looking for.

Comparing Java hashmaps to a Redis cluster isn't completely fair, considering one is bound to your JVM while the other is actually distributed/sharded, which implies at least inter-process communication and most likely network/non-local queries.

Then again, if the question is about scaling an ever-growing JVM, at some point we need to address the elephant in the room: how do you guarantee data consistency and proper replication/sharding, and what do you do when a member goes down?

A distributed hashmap, using Hazelcast, may be more relevant. Some (Hazelcast) would argue it is safer under heavy write load; others report that migrating from Hazelcast to Redis helped them improve service reliability. I don't have enough of a Java background myself, so I wouldn't know.
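For a rough idea of what the Hazelcast route could look like (a sketch only, assuming Hazelcast 5.x package names and an illustrative map name): each service instance embeds a Hazelcast member, and the distributed map's entries are partitioned across the members, so five instances together hold roughly one copy of the 25 GB rather than five.

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.map.IMap;

import java.util.List;

public class ZoneStore {
    public static void main(String[] args) {
        // Each service instance starts an embedded Hazelcast member; members
        // discover each other and form a cluster that partitions map entries.
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();

        // Distributed map: entries are spread across members, not duplicated on each one.
        IMap<String, List<String>> restaurantsByZone = hz.getMap("restaurants-by-zone");

        // A get() is answered locally when this member owns the key's partition;
        // otherwise Hazelcast fetches the value from the owning member over the network.
        List<String> restaurants = restaurantsByZone.get("zone-42");
        System.out.println(restaurants);
    }
}
```

The remote-partition case adds a network hop, which is worth keeping in mind against the 50 ms response budget from the question.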

As a general rule, when asked about Java, you could argue that speed and reliability depend heavily on your developers' understanding of what they are doing, which, in Java, leaves a large margin for error. Then again, if they're asking such questions, they probably have some good devs on their payroll.

Whereas distributed databases (in-memory or on-disk, SQL or NoSQL) are quite a complicated topic that you would need to master (on top of Java) to get right.

两个我 2025-01-22 16:07:51


The broad approach they're describing was characterized by Adya in 2019 as a LInK store: Linked In-memory Key-value stores allow application objects supporting rich operations to be sharded across a cluster of instances.

I would tend to approach this by implementing a stateful application using Akka, along the following lines (disclaimer: as of this writing I am employed by Lightbend, which employs the majority of the developers of Akka and offers support and consulting services to clients using Akka; as my SO history indicates, I was taking the same approach for years before I was employed by Lightbend).

  • Akka Cluster to allow a set of JVMs running an application to form a cluster in a peer-to-peer manner and manage/track changes in the membership (including detecting instances which have crashed or are isolated by a network partition)

  • Akka Cluster Sharding to allow stateful objects keyed by ID to be distributed approximately evenly across a cluster and rebalanced in response to membership changes

These stateful objects are implemented as actors: they update their state in response to messages and, since they process messages one at a time, need no elaborate synchronization.

Cluster sharding implies that the actor responsible for an ID might, over time, run on different instances, so it also implies some persistence of the zone's state outside the cluster. For simplicity*, when the actor responsible for a given zone starts, it initializes itself from a datastore (could be S3, Dynamo, Cassandra, or whatever): after this its state is in memory, so reads can be served directly from the actor's state instead of going to an underlying datastore.
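A hedged sketch of what such a sharded zone entity could look like with Akka Typed's Java API (Akka 2.6+ packages assumed; `ZoneRepository.loadRestaurants` is a hypothetical loader for whichever backing store is chosen):

```java
import akka.actor.typed.ActorRef;
import akka.actor.typed.ActorSystem;
import akka.actor.typed.Behavior;
import akka.actor.typed.javadsl.Behaviors;
import akka.cluster.sharding.typed.javadsl.ClusterSharding;
import akka.cluster.sharding.typed.javadsl.Entity;
import akka.cluster.sharding.typed.javadsl.EntityTypeKey;

import java.util.List;

public class ZoneEntity {
    // Messages the zone actor understands.
    public interface Command {}

    public static final class GetRestaurants implements Command {
        public final ActorRef<List<String>> replyTo;
        public GetRestaurants(ActorRef<List<String>> replyTo) { this.replyTo = replyTo; }
    }

    public static final EntityTypeKey<Command> TYPE_KEY =
        EntityTypeKey.create(Command.class, "Zone");

    // On start, load this zone's data from the backing store (hypothetical helper),
    // then serve every read from the in-memory list.
    public static Behavior<Command> create(String zoneId) {
        return Behaviors.setup(context -> {
            List<String> restaurants = ZoneRepository.loadRestaurants(zoneId); // assumed helper, not an Akka API
            return Behaviors.receive(Command.class)
                .onMessage(GetRestaurants.class, msg -> {
                    msg.replyTo.tell(restaurants); // in-memory read, no datastore hit
                    return Behaviors.same();
                })
                .build();
        });
    }

    // Register the entity type with cluster sharding on every node at startup;
    // sharding decides which node hosts which zone and rebalances on membership changes.
    public static void init(ActorSystem<?> system) {
        ClusterSharding.get(system).init(
            Entity.of(TYPE_KEY, entityContext -> create(entityContext.getEntityId())));
    }
}
```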

Because all writes are directed through cluster sharding, the in-memory representation is, by definition, kept in sync with the writes. To some extent, we can say that the application is the cache: the backing datastore only exists to allow the cache to survive operational issues (and because the datastore only needs to be read in response to issues of that sort, we can optimize it for writes over reads).

Cluster sharding relies on a conflict-free replicated data type (CRDT) to broadcast changes in the shard allocation to the nodes of the cluster. This allows any instance to handle an HTTP request for any shard: it simply forwards a representation of the important parts of the request as a message to the shard, which will deliver it to the correct actor.
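Continuing the hedged sketch above, the request path from any instance could look like this: the HTTP layer never needs to know which node currently hosts the zone; it asks the sharded entity and Akka routes the message, locally or over the network, to wherever that zone's actor lives.

```java
import akka.actor.typed.ActorSystem;
import akka.cluster.sharding.typed.javadsl.ClusterSharding;
import akka.cluster.sharding.typed.javadsl.EntityRef;

import java.time.Duration;
import java.util.List;
import java.util.concurrent.CompletionStage;

public class ZoneQueries {
    private final ClusterSharding sharding;

    public ZoneQueries(ActorSystem<?> system) {
        this.sharding = ClusterSharding.get(system);
    }

    // Callable from any instance's HTTP handler: sharding resolves which node
    // hosts this zone's actor and delivers the ask there.
    public CompletionStage<List<String>> restaurantsFor(String zoneId) {
        EntityRef<ZoneEntity.Command> zone = sharding.entityRefFor(ZoneEntity.TYPE_KEY, zoneId);
        return zone.ask(ZoneEntity.GetRestaurants::new, Duration.ofSeconds(1)); // timeout: tune to the latency budget
    }
}
```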

From Kubernetes' perspective, the instances are stateless: no StatefulSet or similar is needed. The pods can query the Kubernetes API to find the other pods and attempt to join the cluster.

*: I have a fairly strong prior that event sourcing would be a better persistence approach, but I'll set that aside for now.
