How to split a Spark DataFrame when sending to Kafka?
I am using the following statement to write my DataFrame to Kafka.
dataFrame.write.format("kafka")
.options(options).save()
Unfortunately, the code above writes the huge DataFrame to Kafka as a single message, which causes multiple issues. I want the DataFrame to be split into multiple messages when it is sent to Kafka, but I also don't want to send one message per record. I want to send it in chunks that my Kafka server can accept without any server-side configuration changes. Please don't suggest splitting by row number; I can do that myself. Is there any built-in option in Spark to split the data while sending it to Kafka?
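To illustrate the kind of chunked send being asked about: as far as I know, Spark's Kafka sink has no built-in chunking option, so any grouping has to happen before the write. Below is a minimal sketch of one workaround in Scala: batch the rows into fixed-size groups and send each group as a single JSON-array message. The names here (writeInChunks, chunkSize, brokers) are illustrative, not part of any Spark API.

import org.apache.spark.sql.DataFrame

// Sketch: group rows into chunks of `chunkSize` and write each chunk
// as one Kafka message whose value is a JSON array of the rows.
def writeInChunks(df: DataFrame, chunkSize: Int, topic: String, brokers: String): Unit = {
  val spark = df.sparkSession
  import spark.implicits._

  df.toJSON.rdd                                             // one JSON string per row
    .zipWithIndex()                                         // stable index per row
    .map { case (json, idx) => (idx / chunkSize, json) }    // assign a chunk id
    .groupByKey()                                           // collect the rows of each chunk
    .map { case (_, rows) => rows.mkString("[", ",", "]") } // one JSON array per chunk
    .toDF("value")                                          // the Kafka sink reads the `value` column
    .write
    .format("kafka")
    .option("kafka.bootstrap.servers", brokers)
    .option("topic", topic)
    .save()
}

// Illustrative call; topic and broker address are placeholders.
// writeInChunks(dataFrame, chunkSize = 1000, topic = "events", brokers = "localhost:9092")

Note that groupByKey shuffles all rows of a chunk to a single task, so this is only reasonable for moderate chunk sizes; the point is that the grouping happens in the DataFrame/RDD before the write, since format("kafka") itself just writes whatever rows it is given.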