A unified connector for the DataStream/Table API

Posted 2025-01-21 01:32:11


I am writing a simple connector (source/sink) for Flink 1.14.4 which mostly wraps the official Kafka connector and automatically sets up custom serializers/deserializers. I'm a bit confused about the current state of the new source/sink interfaces introduced in FLIP-27 and FLIP-143. Is it currently possible to write truly unified connectors (that is, connectors that work across different APIs, such as DataStream/Table)? Looking at the code of the current Kafka connector, I see it comes in both legacy and new flavours, but AFAIK the connector for the Table API still relies on the legacy API only. Also, reading the official documentation:

it seems that the new interfaces still cannot be used with the Table API. To make things worse, I find it very confusing that only sources are mentioned in the DataStream section, which already describes the new approach:

but nothing is said regarding sinks. Overall, I think this leaves users not knowing very well how to approach the creation of custom connectors as of today. In particular, I would expect an equivalent section for the DataStream API, i.e., one covering the creation of user-defined sources & sinks, like the one given above for the Table API.
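For context on what "automatically sets up custom serializers/deserializers" might mean concretely, here is a minimal, self-contained sketch of the kind of serde logic such a wrapper would pre-configure. `MyEvent`, `MyEventCodec`, and the wire format (4-byte id followed by a UTF-8 payload) are all hypothetical names invented for illustration; in the real connector this logic would sit inside a Flink `DeserializationSchema`/`SerializationSchema` handed to the wrapped Kafka builder.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical codec: 4-byte big-endian id, then UTF-8 payload.
// Not a Flink API; only illustrates the serde a wrapper would preset.
final class MyEventCodec {
    // Hypothetical event type carried through Kafka.
    record MyEvent(int id, String payload) {}

    static byte[] serialize(MyEvent e) {
        byte[] body = e.payload().getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + body.length)
                .putInt(e.id())
                .put(body)
                .array();
    }

    static MyEvent deserialize(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        int id = buf.getInt();
        byte[] body = new byte[buf.remaining()];
        buf.get(body);
        return new MyEvent(id, new String(body, StandardCharsets.UTF_8));
    }
}
```

A wrapper connector would then expose a builder that always installs this codec, so callers never have to pass a schema themselves.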


没有心的人 2025-01-28 01:32:11


The unified Source and Sink APIs (FLIP-27 and FLIP-143) were created to provide a single connector interface that works for both bounded (batch) and unbounded (streaming) data.

Both interfaces allow you to build a source/sink that can be used in either the DataStream or the Table/SQL API. That's already the case for FileSystem, Kafka and Pulsar (as of Flink 1.15, which will be released shortly).

You're absolutely right that the current documentation doesn't make this clear. At the moment, the Flink community is working on externalizing the connectors (moving each of them from the Flink repository to their own individual repository) and overhauling the documentation and guides on how to write a connector.
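To make the "one connector, two APIs" point concrete, here is a sketch of how the unified Kafka connector is reached from both sides. It assumes Flink 1.15 with `flink-connector-kafka` on the classpath, a local broker at `localhost:9092`, and a topic named `events` (both placeholders); it is not runnable without that environment.

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

final class UnifiedKafkaExample {
    public static void main(String[] args) {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // DataStream API: the new FLIP-27 KafkaSource.
        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("localhost:9092")  // placeholder broker
                .setTopics("events")                     // placeholder topic
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();
        env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source")
           .print();

        // Table/SQL API: the 'kafka' connector, which in Flink 1.15
        // is backed by the same unified source/sink implementation.
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);
        tEnv.executeSql(
            "CREATE TABLE events (msg STRING) WITH (" +
            " 'connector' = 'kafka'," +
            " 'topic' = 'events'," +
            " 'properties.bootstrap.servers' = 'localhost:9092'," +
            " 'format' = 'raw'," +
            " 'scan.startup.mode' = 'earliest-offset')");
    }
}
```

The point of the sketch is that both entry points, `env.fromSource(...)` and the `'connector' = 'kafka'` table option, end up on the same unified implementation rather than on separate legacy code paths.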
