我在此处按照一个示例: https://github.com/apache/flink/flink/blob/master/master/master/flink-streaming-java/src/src/main/java/java/java/org/org/org/org/apache/flink/streaming一下/pi/functions/source/Source/statefulSequencesource.java
我正在尝试使用jdbc连接来构建一个源,该连接扩展了RichParallEffunction并实现了CheckPointEdfunction,因为我希望能够从我的源表中保存我的水印重新启动。
在使用Docker本地测试时,我可以调用我的Run()方法,并从源数据库中读取数据,但是我不确定SnapShotState和Initializestate方法实际上是在哪里被调用的。我在那些应该基于第一个启动/恢复的方法设置我的水印值的逻辑 - 我只是从未看到访问它,并且不确定我是否应该在外部调用该方法?
感谢提前的帮助!
I am following an example here: https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/StatefulSequenceSource.java
I am trying to build a source using a jdbc connection which extends RichParallelFunction and implements CheckpointedFunction, as I would like to be able to save my watermark from my source tables in the case of restart.
When testing locally with docker, I can call my run() method just fine and read data from my source database, but I am not sure where the snapshotState and initializeState methods actually get called. I have logic in those methods that should be setting the value of my watermark based on first startup/recovery - I just never see that being accessed, and not sure if I should be calling the methods externally?
Thanks for the help in advance!
发布评论
评论(1)
当需要(执行检查点或保存点时)时,该方法将由Flink框架调用。
参见 和
The methods will be called by the Flink framework when it needs to (when performing a checkpoint or a save point).
See https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/datastream/fault-tolerance/state/#using-operator-state and https://nightlies.apache.org/flink/flink-docs-stable/api/java/org/apache/flink/streaming/api/checkpoint/CheckpointedFunction.html