Strimzi Kafka using local node storage

Posted 2025-01-11 05:23:09

I am running Kafka on Kubernetes (deployed on Azure) using Strimzi for a development environment, and I would prefer to use the internal Kubernetes node storage. If I use persistent-claim or jbod, it creates standard disks on Azure storage. However, I would prefer the internal node storage, as I have 16 GB available there. I do not want to use ephemeral storage, because I want the data to persist at least on the Kubernetes nodes. The following is my deployment.yml:

apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: kafka-cluster
spec:
  kafka:
    version: 3.1.0
    replicas: 2
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: tls
        port: 9093
        type: internal
        tls: true
      - name: external
        type: loadbalancer
        tls: false
        port: 9094
      
    config:
      offsets.topic.replication.factor: 2
      transaction.state.log.replication.factor: 2
      transaction.state.log.min.isr: 2
      default.replication.factor: 2
      min.insync.replicas: 2
      inter.broker.protocol.version: "3.1"
    storage:
      type: persistent-claim
      size: 2Gi
      deleteClaim: false
  zookeeper:
    replicas: 2
    storage:
      type: persistent-claim
      size: 2Gi
      deleteClaim: false
  entityOperator:
    topicOperator: {}
    userOperator: {}




Comments (1)

云柯 2025-01-18 05:23:09


The persistent-claim storage as you use it will provision the storage using the default storage class, which in your case, I guess, creates standard storage.

You have two options for using the local disk space of the worker nodes:

  • You can use the ephemeral type storage. But keep in mind that this is like a temporary directory: it will be lost in every rolling update. Also, if you delete all the pods at the same time, for example, you will lose all data. As such, it is recommended only for short-lived clusters in CI, maybe some short development work, etc., but certainly not for anything where you need reliability.
  • You can use Local Persistent Volumes, which are persistent volumes bound to a particular node. These are persistent, so the pods will re-use the volume between restarts and rolling updates. However, this binds the pod to the particular worker node the storage is on, so you cannot easily reschedule it to another worker node. But apart from that limitation, it is something that can (unlike ephemeral storage) be used with reliability and availability when done right. Local persistent volumes are normally provisioned through a StorageClass as well, so the Kafka custom resource in Strimzi will still use the persistent-claim type storage, just with a different storage class (see the sketch after this list).
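
As an illustration, here is a minimal sketch of such a setup: a StorageClass with no dynamic provisioner plus a hand-created local PersistentVolume. The names local-storage, kafka-local-pv-0, the path /mnt/kafka-data, and the node name aks-nodepool1-12345 are all assumptions to be adjusted to your cluster:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage            # hypothetical name, referenced by the PV and the PVCs
provisioner: kubernetes.io/no-provisioner   # no dynamic provisioning; PVs are created by hand
volumeBindingMode: WaitForFirstConsumer     # delay binding until a pod is scheduled
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: kafka-local-pv-0
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/kafka-data        # must already exist on the node
  nodeAffinity:                  # pins the volume (and whichever pod claims it) to one node
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - aks-nodepool1-12345   # hypothetical node name

With volumeBindingMode: WaitForFirstConsumer, the PersistentVolumeClaims that Strimzi creates stay pending until the broker pods are scheduled, so the scheduler can place each pod on a node that actually has a matching local volume. You would need one such PV per Kafka broker (and per ZooKeeper node).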

You should really think about what exactly you want to use and why. From my experience, local persistent volumes are a great option when:

  • You run on bare-metal / on-premise clusters, where good shared block storage is often not available
  • You require maximum performance (local storage does not depend on the network, so it can often be faster)

But in public clouds with good support for high-quality networked block storage, such as Amazon EBS volumes and their Azure or Google counterparts, local storage often brings more problems than advantages because of how it binds your Kafka brokers to a particular worker node.

Some more details about local persistent volumes can be found here: https://kubernetes.io/docs/concepts/storage/volumes/#local ... there are also different provisioners which can help you use them. I'm not sure if Azure supports anything out of the box.
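
If you do go down this route, the only change in the Kafka custom resource from your question is the storage class. A minimal sketch, assuming the hypothetical local-storage class from the example above:

storage:
  type: persistent-claim
  size: 2Gi
  deleteClaim: false
  class: local-storage    # use the local StorageClass instead of the Azure default

The same class field would go under the zookeeper storage section as well, so that ZooKeeper also uses local volumes.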

Sidenote: 2Gi of space is very small for Kafka. I'm not sure how much you will be able to do before running out of disk space. Even 16Gi would be quite small. If you know what you are doing, then fine. But if not, you should be careful.
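
If you do stay on such small volumes, capping log retention in the broker config keeps disk usage bounded. A sketch of two settings that could go under spec.kafka.config in the resource above; the exact limits are assumptions that depend on your partition count and workload:

config:
  log.retention.bytes: 1073741824     # ~1 GiB retained per partition before old segments are deleted
  log.segment.bytes: 268435456        # smaller segments (~256 MiB) so retention can reclaim space sooner

Note that log.retention.bytes applies per partition, so total usage is roughly that limit multiplied by the number of partition replicas hosted on the broker.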
