Setup and Caveat

Published 2024-10-10 00:30:32

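We will keep the configuration simple and run a single broker that auto-creates topics with one partition. Note that this is a contrived setup. Any production deployment would have multiple brokers and many more partitions, but for simplicity we will use one.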

Plan

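Start a Kafka 0.9 broker.
For each client, we will produce and consume one million 100-byte messages.
We will use a producer acks setting of 1, which means only the leader (we only have one broker anyway) needs to ack each message. Increasing this setting ensures your data is not lost when a broker fails, but it slows down producing.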
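
As a rough sketch of what "acks of 1" looks like per client, the dictionary below collects the setting name each library uses for producer acknowledgements, to the best of my understanding of their documentation. The names `ACKS` and `producer_acks_settings` are hypothetical and only for illustration; check the setting names against the client versions you actually install.

```python
# Hypothetical helper names; not part of the original notebook.
ACKS = 1  # leader-only acknowledgement

producer_acks_settings = {
    # kafka-python: KafkaProducer(bootstrap_servers=..., acks=1)
    "kafka-python": {"acks": ACKS},
    # pykafka: topic.get_producer(required_acks=1)
    "pykafka": {"required_acks": ACKS},
    # confluent-kafka (librdkafka): topic-level 'request.required.acks',
    # assumed here to be passed through 'default.topic.config'
    "confluent-kafka": {
        "default.topic.config": {"request.required.acks": ACKS}
    },
}

for client, settings in producer_acks_settings.items():
    print(client, settings)
```

Keeping the value in one place makes it easy to rerun the whole benchmark with a stricter setting and compare the cost.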

I am using the following versions:

pykafka 2.3.1
kafka-python 1.1.1
confluent-kafka-python 0.9.1

I am running these tests on a MacBook Pro with a 2.2 GHz i7, using Vagrant.

Caveat

Like all benchmarks, take this one with a grain of salt. A single broker on a local machine is hardly a production deployment. All settings are largely left at their defaults. Also, the amount of file caching the broker does really helps client consumption speed. To counter this, we will rerun each consumption test to compensate for caching of recently accessed data.

Even with all the normal stipulations, I hope you find this informative.
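
The rerun idea from the caveat can be sketched as a small timing helper. This is only an illustration, not the notebook's own code: `time_runs` and `fake_consume` are hypothetical names, and `fake_consume` stands in for a real consumer benchmark function.

```python
import time

def time_runs(fn, runs=2):
    """Call fn() `runs` times and return the elapsed seconds of each call."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return timings

def fake_consume():
    # Hypothetical stand-in for reading one million messages from a topic.
    sum(range(100000))

first, second = time_runs(fake_consume)
print("first run: {0:.4f}s, second run: {1:.4f}s".format(first, second))
```

Comparing the two timings shows how much of the first run's cost came from data that was not yet in the broker's file cache.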

Installing the clients can be complicated by their C extensions. We are big fans of conda and maintain Python 3 Linux builds of some of these clients here, with recipes here.

The easiest way to install the clients is with conda:

conda create -n kafka-benchmark python=3 ipython jupyter pandas seaborn -y
source activate kafka-benchmark
conda install -c activisiongamescience confluent-kafka pykafka -y # will also get librdkafka
pip install kafka-python # pure python version is easy to install

If you want to run this notebook, you can find the full repository here.

Included in this repo is a docker-compose file that will spin up a single Kafka 0.9 broker and a ZooKeeper instance locally. We can shell out and start it with docker-compose.

!docker-compose up -d

Creating network "pythonkafkabenchmark_default" with the default driver
Creating pythonkafkabenchmark_zookeeper_1
Creating pythonkafkabenchmark_kafka_1
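
The containers may need a few seconds before the broker accepts connections. The sketch below, which is not part of the original notebook, polls a TCP port until it opens. `wait_for_port` is a hypothetical helper, and an open port only suggests the broker process is listening, not that it is fully ready to serve requests.

```python
import socket
import time

def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, else False."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Opening and immediately closing a connection is enough
            # to tell that something is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

In the notebook this could be called as `wait_for_port('localhost', 9092)` before running the first producer test.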

msg_count = 1000000
msg_size = 100
msg_payload = ('kafkatest' * 20).encode()[:msg_size]
print(msg_payload)
print(len(msg_payload))

b'kafkatestkafkatestkafkatestkafkatestkafkatestkafkatestkafkatestkafkatestkafkatestkafkatestkafkatestk'
100

bootstrap_servers = 'localhost:9092' # change if your brokers live elsewhere

import time

producer_timings = {}
consumer_timings = {}

def calculate_thoughput(timing, n_messages=1000000, msg_size=100):
    print("Processed {0} messages in {1:.2f} seconds".format(n_messages, timing))
    print("{0:.2f} MB/s".format((msg_size * n_messages) / timing / (1024*1024)))
    print("{0:.2f} Msgs/s".format(n_messages / timing))
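
To make the arithmetic in calculate_thoughput concrete, here is a small worked example. The 10-second timing is a made-up value chosen for illustration, not a measured result, and `throughput` is a hypothetical variant that returns the numbers instead of printing them.

```python
def throughput(timing, n_messages=1000000, msg_size=100):
    """Return (MB/s, msgs/s) using the same arithmetic as calculate_thoughput."""
    mb_per_s = (msg_size * n_messages) / timing / (1024 * 1024)
    msgs_per_s = n_messages / timing
    return mb_per_s, msgs_per_s

# Hypothetical timing: one million 100-byte messages in 10 seconds.
mb_per_s, msgs_per_s = throughput(10.0)
print("{0:.2f} MB/s".format(mb_per_s))      # 9.54 MB/s
print("{0:.2f} Msgs/s".format(msgs_per_s))  # 100000.00 Msgs/s
```

Note that MB here means 1024 * 1024 bytes, so 100,000,000 bytes over 10 seconds comes out as about 9.54 MB/s rather than 10.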
