火花流应用程序不断重置kafka偏移量
我有一个火花流应用程序,该应用程序在带有4个节点的Spark群集上运行。几天前,该应用程序不断重置kafka偏移量,并且在设置自动偏移重置时不再获取kafka数据,
这是日志:
22/06/28 21:39:38 INFO AppInfoParser: Kafka version : 2.0.0
18|stream | 22/06/28 21:39:38 INFO AppInfoParser: Kafka commitId : 3402a8361b734732
18|stream | 22/06/28 21:39:39 INFO Metadata: Cluster ID: 3cAbAp6-QNyO1cKEc1dtUA
18|stream | 22/06/28 21:39:39 INFO AbstractCoordinator: [Consumer clientId=consumer-1, groupId=testid1] Discovered group coordinator xxx.xxx.xxx.xxx:9092 (id: 2147483647 rack: null)
18|stream | 22/06/28 21:39:39 INFO ConsumerCoordinator: [Consumer clientId=consumer-1, groupId=testid1] Revoking previously assigned partitions []
18|stream | 22/06/28 21:39:39 INFO AbstractCoordinator: [Consumer clientId=consumer-1, groupId=testid1] (Re-)joining group
18|stream | 22/06/28 21:39:39 INFO AbstractCoordinator: [Consumer clientId=consumer-1, groupId=testid1] Successfully joined group with generation 9042
18|stream | 22/06/28 21:39:39 INFO ConsumerCoordinator: [Consumer clientId=consumer-1, groupId=testid1] Setting newly assigned partitions [applog-15, applog-14, applog-13, applog-12, applog-11, applog-10, applog-9, new_apploglog-0, applog-8, applog-7, applog-6, applog-5, applog-4, applog-3, applog-2, applog-1, applog-0]
18|stream | 22/06/28 21:39:39 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition new_apploglog-0 to offset 16767946.
18|stream | 22/06/28 21:39:39 INFO RecurringTimer: Started timer for JobGenerator at time 1656452400000
18|stream | 22/06/28 21:39:39 INFO JobGenerator: Started JobGenerator at 1656452400000 ms
18|stream | 22/06/28 21:39:39 INFO JobScheduler: Started JobScheduler
18|stream | 22/06/28 21:39:39 INFO StreamingContext: StreamingContext started
18|stream | 22/06/28 21:39:40 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (xxx.xxx.xxx.xxx:46588) with ID 2
18|stream | 22/06/28 21:39:40 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (xxx.xxx.xxx.xxx:48860) with ID 3
18|stream | 22/06/28 21:39:40 INFO BlockManagerMasterEndpoint: Registering block manager xxx.xxx.xxx.xxx:35981 with 4.6 GB RAM, BlockManagerId(2, xxx.xxx.xxx.xxx, 35981, None)
18|stream | 22/06/28 21:39:40 INFO BlockManagerMasterEndpoint: Registering block manager xxx.xxx.xxx.xxx:40001 with 4.6 GB RAM, BlockManagerId(3, xxx.xxx.xxx.xxx, 40001, None)
18|stream | 22/06/28 21:39:40 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (xxx.xxx.xxx.xxx:39858) with ID 1
18|stream | 22/06/28 21:39:40 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (xxx.xxx.xxx.xxx:57696) with ID 0
18|stream | 22/06/28 21:39:40 INFO BlockManagerMasterEndpoint: Registering block manager xxx.xxx.xxx.xxx:44765 with 4.6 GB RAM, BlockManagerId(1, xxx.xxx.xxx.xxx, 44765, None)
18|stream | 22/06/28 21:39:40 INFO BlockManagerMasterEndpoint: Registering block manager xxx.xxx.xxx.xxx:46661 with 4.6 GB RAM, BlockManagerId(0, xxx.xxx.xxx.xxx, 46661, None)
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-15 to offset 285007408.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-14 to offset 285006512.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-13 to offset 285006673.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-12 to offset 285006392.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-11 to offset 285006399.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-10 to offset 285006961.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-9 to offset 285007334.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition new_apploglog-0 to offset 16838546.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-8 to offset 285007057.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-7 to offset 285005614.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-6 to offset 285007348.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-5 to offset 285004512.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-4 to offset 285005570.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-3 to offset 285008145.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-2 to offset 285007214.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-1 to offset 285007686.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-0 to offset 316632614.
18|stream | 22/06/28 21:40:00 INFO JobScheduler: Added jobs for time 1656452400000 ms
18|stream | 22/06/28 21:40:00 INFO JobScheduler: Starting job streaming job 1656452400000 ms.0 from job set of time 1656452400000 ms
18|stream | 22/06/28 21:40:00 INFO SparkContext: Starting job: collect at Main.scala:76
18|stream | 22/06/28 21:40:00 INFO DAGScheduler: Registering RDD 1 (repartition at Main.scala:69)
18|stream | 22/06/28 21:40:00 INFO DAGScheduler: Got job 0 (collect at Main.scala:76) with 16 output partitions
18|stream | 22/06/28 21:40:00 INFO DAGScheduler: Final stage: ResultStage 1 (collect at Main.scala:76)
18|stream | 22/06/28 21:40:00 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 0)
18|stream | 22/06/28 21:40:00 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 0)
18|stream | 22/06/28 21:40:00 INFO DAGScheduler: Submitting ShuffleMapStage 0 (MapPartitionsRDD[1] at repartition at Main.scala:69), which has no missing parents
18|stream | 22/06/28 21:40:00 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 4.9 KB, free 5.2 GB)
18|stream | 22/06/28 21:40:00 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 3.1 KB, free 5.2 GB)
18|stream | 22/06/28 21:40:00 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on xxx.xxx.xxx.xxx:41399 (size: 3.1 KB, free: 5.2 GB)
18|stream | 22/06/28 21:40:00 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1161
18|stream | 22/06/28 21:40:00 INFO DAGScheduler: Submitting 17 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[1] at repartition at Main.scala:69) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14))
18|stream | 22/06/28 21:40:00 INFO TaskSchedulerImpl: Adding task set 0.0 with 17 tasks
18|stream | 22/06/28 21:40:00 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 0, xxx.xxx.xxx.xxx, executor 0, partition 1, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:00 INFO TaskSetManager: Starting task 8.0 in stage 0.0 (TID 1, xxx.xxx.xxx.xxx, executor 1, partition 8, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:00 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 2, xxx.xxx.xxx.xxx, executor 2, partition 3, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:00 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 3, xxx.xxx.xxx.xxx, executor 3, partition 0, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:01 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on xxx.xxx.xxx.xxx:44765 (size: 3.1 KB, free: 4.6 GB)
18|stream | 22/06/28 21:40:01 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on xxx.xxx.xxx.xxx:40001 (size: 3.1 KB, free: 4.6 GB)
18|stream | 22/06/28 21:40:01 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on xxx.xxx.xxx.xxx:35981 (size: 3.1 KB, free: 4.6 GB)
18|stream | 22/06/28 21:40:01 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on xxx.xxx.xxx.xxx:46661 (size: 3.1 KB, free: 4.6 GB)
18|stream | 22/06/28 21:40:02 INFO TaskSetManager: Starting task 10.0 in stage 0.0 (TID 4, xxx.xxx.xxx.xxx, executor 1, partition 10, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Starting task 4.0 in stage 0.0 (TID 5, xxx.xxx.xxx.xxx, executor 3, partition 4, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Finished task 8.0 in stage 0.0 (TID 1) in 2181 ms on xxx.xxx.xxx.xxx (executor 1) (1/17)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 3) in 2187 ms on xxx.xxx.xxx.xxx (executor 3) (2/17)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Starting task 7.0 in stage 0.0 (TID 6, xxx.xxx.xxx.xxx, executor 2, partition 7, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Finished task 3.0 in stage 0.0 (TID 2) in 2463 ms on xxx.xxx.xxx.xxx (executor 2) (3/17)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Starting task 5.0 in stage 0.0 (TID 7, xxx.xxx.xxx.xxx, executor 3, partition 5, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Finished task 4.0 in stage 0.0 (TID 5) in 343 ms on xxx.xxx.xxx.xxx (executor 3) (4/17)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Starting task 13.0 in stage 0.0 (TID 8, xxx.xxx.xxx.xxx, executor 1, partition 13, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Finished task 10.0 in stage 0.0 (TID 4) in 389 ms on xxx.xxx.xxx.xxx (executor 1) (5/17)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 9, xxx.xxx.xxx.xxx, executor 0, partition 2, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 0) in 2773 ms on xxx.xxx.xxx.xxx (executor 0) (6/17)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Starting task 11.0 in stage 0.0 (TID 10, xxx.xxx.xxx.xxx, executor 2, partition 11, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Finished task 7.0 in stage 0.0 (TID 6) in 403 ms on xxx.xxx.xxx.xxx (executor 2) (7/17)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Starting task 9.0 in stage 0.0 (TID 11, xxx.xxx.xxx.xxx, executor 3, partition 9, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Finished task 5.0 in stage 0.0 (TID 7) in 362 ms on xxx.xxx.xxx.xxx (executor 3) (8/17)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Starting task 15.0 in stage 0.0 (TID 12, xxx.xxx.xxx.xxx, executor 1, partition 15, PROCESS_LOCAL, 7746 bytes)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Finished task 13.0 in stage 0.0 (TID 8) in 369 ms on xxx.xxx.xxx.xxx (executor 1) (9/17)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Starting task 16.0 in stage 0.0 (TID 13, xxx.xxx.xxx.xxx, executor 1, partition 16, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Finished task 15.0 in stage 0.0 (TID 12) in 146 ms on xxx.xxx.xxx.xxx (executor 1) (10/17)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Finished task 9.0 in stage 0.0 (TID 11) in 247 ms on xxx.xxx.xxx.xxx (executor 3) (11/17)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Starting task 6.0 in stage 0.0 (TID 14, xxx.xxx.xxx.xxx, executor 0, partition 6, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Finished task 2.0 in stage 0.0 (TID 9) in 382 ms on xxx.xxx.xxx.xxx (executor 0) (12/17)
18|stream | 22/06/28 21:40:04 INFO TaskSetManager: Starting task 14.0 in stage 0.0 (TID 15, xxx.xxx.xxx.xxx, executor 2, partition 14, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:04 INFO TaskSetManager: Finished task 11.0 in stage 0.0 (TID 10) in 337 ms on xxx.xxx.xxx.xxx (executor 2) (13/17)
18|stream | 22/06/28 21:40:04 INFO TaskSetManager: Finished task 16.0 in stage 0.0 (TID 13) in 331 ms on xxx.xxx.xxx.xxx (executor 1) (14/17)
18|stream | 22/06/28 21:40:04 INFO TaskSetManager: Starting task 12.0 in stage 0.0 (TID 16, xxx.xxx.xxx.xxx, executor 0, partition 12, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:04 INFO TaskSetManager: Finished task 6.0 in stage 0.0 (TID 14) in 303 ms on xxx.xxx.xxx.xxx (executor 0) (15/17)
18|stream | 22/06/28 21:40:04 INFO TaskSetManager: Finished task 14.0 in stage 0.0 (TID 15) in 271 ms on xxx.xxx.xxx.xxx (executor 2) (16/17)
18|stream | 22/06/28 21:40:04 INFO TaskSetManager: Finished task 12.0 in stage 0.0 (TID 16) in 291 ms on xxx.xxx.xxx.xxx (executor 0) (17/17)
18|stream | 22/06/28 21:40:04 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
18|stream | 22/06/28 21:40:04 INFO DAGScheduler: ShuffleMapStage 0 (repartition at Main.scala:69) finished in 4.222 s
18|stream | 22/06/28 21:40:04 INFO DAGScheduler: looking for newly runnable stages
18|stream | 22/06/28 21:40:04 INFO DAGScheduler: running: Set()
18|stream | 22/06/28 21:40:04 INFO DAGScheduler: waiting: Set(ResultStage 1)
18|stream | 22/06/28 21:40:04 INFO DAGScheduler: failed: Set()
18|stream | 22/06/28 21:40:04 INFO DAGScheduler: Submitting ResultStage 1 (MapPartitionsRDD[6] at map at Main.scala:73), which has no missing parents
18|stream | 22/06/28 21:40:04 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 4.8 KB, free 5.2 GB)
18|stream | 22/06/28 21:40:04 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.7 KB, free 5.2 GB)
18|stream | 22/06/28 21:40:04 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on xxx.xxx.xxx.xxx:41399 (size: 2.7 KB, free: 5.2 GB)
18|stream | 22/06/28 21:40:04 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1161
18|stream | 22/06/28 21:40:04 INFO DAGScheduler: Submitting 16 missing tasks from ResultStage 1 (MapPartitionsRDD[6] at map at Main.scala:73) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14))
18|stream | 22/06/28 21:40:04 INFO TaskSchedulerImpl: Adding task set 1.0 with 16 tasks
18|stream | 22/06/28 21:40:04 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 17, xxx.xxx.xxx.xxx, executor 2, partition 0, NODE_LOCAL, 7942 bytes)
18|stream | 22/06/28 21:40:04 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 18, xxx.xxx.xxx.xxx, executor 3, partition 1, NODE_LOCAL, 7942 bytes)
18|stream | 22/06/28 21:40:04 INFO TaskSetManager: Starting task 2.0 in stage 1.0 (TID 19, xxx.xxx.xxx.xxx, executor 1, partition 2, NODE_LOCAL, 7942 bytes)
18|stream | 22/06/28 21:40:04 INFO TaskSetManager: Starting task 3.0 in stage 1.0 (TID 20, xxx.xxx.xxx.xxx, executor 0, partition 3, NODE_LOCAL, 7942 bytes)
18|stream | 22/06/28 21:40:04 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on xxx.xxx.xxx.xxx (size: 2.7 KB, free: 4.6 GB)
18|stream | 22/06/28 21:40:04 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on xxx.xxx.xxx.xxx:44765 (size: 2.7 KB, free: 4.6 GB)
18|stream | 22/06/28 21:40:04 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on xxx.xxx.xxx.xxx:40001 (size: 2.7 KB, free: 4.6 GB)
18|stream | 22/06/28 21:40:04 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on xxx.xxx.xxx.xxx:46661 (size: 2.7 KB, free: 4.6 GB)
18|stream | 22/06/28 21:40:04 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to xxx.xxx.xxx.xxx:48860
18|stream | 22/06/28 21:40:04 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to xxx.xxx.xxx.xxx:39858
18|stream | 22/06/28 21:40:04 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to xxx.xxx.xxx.xxx:46588
18|stream | 22/06/28 21:40:04 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to xxx.xxx.xxx.xxx:57696
18|stream | 22/06/28 21:40:32 INFO BlockManagerInfo: Removed broadcast_0_piece0 on xxx.xxx.xxx.xxx:41399 in memory (size: 3.1 KB, free: 5.2 GB)
18|stream | 22/06/28 21:40:32 INFO BlockManagerInfo: Removed broadcast_0_piece0 on xxx.xxx.xxx.xxx:44765 in memory (size: 3.1 KB, free: 4.6 GB)
18|stream | 22/06/28 21:40:32 INFO BlockManagerInfo: Removed broadcast_0_piece0 on xxx.xxx.xxx.xxx:35981 in memory (size: 3.1 KB, free: 4.6 GB)
18|stream | 22/06/28 21:40:32 INFO BlockManagerInfo: Removed broadcast_0_piece0 on xxx.xxx.xxx.xxx:46661 in memory (size: 3.1 KB, free: 4.6 GB)
18|stream | 22/06/28 21:40:32 INFO BlockManagerInfo: Removed broadcast_0_piece0 on xxx.xxx.xxx.xxx:40001 in memory (size: 3.1 KB, free: 4.6 GB)
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-15 to offset 285007532.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-14 to offset 285006636.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-13 to offset 285006799.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-12 to offset 285006518.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-11 to offset 285006525.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-10 to offset 285007087.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-9 to offset 285007459.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition new_apploglog-0 to offset 16838553.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-8 to offset 285007182.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-7 to offset 285005739.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-6 to offset 285007471.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-5 to offset 285004635.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-4 to offset 285005693.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-3 to offset 285008268.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-2 to offset 285007337.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-1 to offset 285007810.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-0 to offset 316632738.
18|stream | 22/06/28 21:42:00 INFO JobScheduler: Added jobs for time 1656452520000 ms
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-15 to offset 285007665.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-14 to offset 285006770.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-13 to offset 285006931.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-12 to offset 285006650.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-11 to offset 285006657.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-10 to offset 285007219.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-9 to offset 285007591.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition new_apploglog-0 to offset 16838556.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-8 to offset 285007314.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-7 to offset 285005871.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-6 to offset 285007603.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-5 to offset 285004767.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-4 to offset 285005825.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-3 to offset 285008400.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-2 to offset 285007469.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-1 to offset 285007942.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-0 to offset 316632870.
18|stream | 22/06/28 21:44:00 INFO JobScheduler: Added jobs for time 1656452640000 ms
重置kafka偏移量,即使没有错误,也会永远重复重复。
我采取了这些操作来解决问题,但没有任何帮助:
- 将kafka倒置到最早或最新的
- 删除消费者组,并创建新的问题,
- 我什至更改了主题,但没有改变,所以我猜这是来自Spark Cluster,但我可以在同一集群上加载带有pyspark shell的Kafka数据
:
该应用程序正常工作约3年!
最近我们进行了服务器迁移,我们的一些资源已减少
其他非流域作业在火花集群上运行而没有任何问题
我缺少什么吗?
I have a spark streaming app that runs on spark cluster with 4 node. A few days ago the app keeps resetting Kafka offset and does not fetch Kafka data anymore while the AUTO OFFSET RESET is set,
this is the log:
22/06/28 21:39:38 INFO AppInfoParser: Kafka version : 2.0.0
18|stream | 22/06/28 21:39:38 INFO AppInfoParser: Kafka commitId : 3402a8361b734732
18|stream | 22/06/28 21:39:39 INFO Metadata: Cluster ID: 3cAbAp6-QNyO1cKEc1dtUA
18|stream | 22/06/28 21:39:39 INFO AbstractCoordinator: [Consumer clientId=consumer-1, groupId=testid1] Discovered group coordinator xxx.xxx.xxx.xxx:9092 (id: 2147483647 rack: null)
18|stream | 22/06/28 21:39:39 INFO ConsumerCoordinator: [Consumer clientId=consumer-1, groupId=testid1] Revoking previously assigned partitions []
18|stream | 22/06/28 21:39:39 INFO AbstractCoordinator: [Consumer clientId=consumer-1, groupId=testid1] (Re-)joining group
18|stream | 22/06/28 21:39:39 INFO AbstractCoordinator: [Consumer clientId=consumer-1, groupId=testid1] Successfully joined group with generation 9042
18|stream | 22/06/28 21:39:39 INFO ConsumerCoordinator: [Consumer clientId=consumer-1, groupId=testid1] Setting newly assigned partitions [applog-15, applog-14, applog-13, applog-12, applog-11, applog-10, applog-9, new_apploglog-0, applog-8, applog-7, applog-6, applog-5, applog-4, applog-3, applog-2, applog-1, applog-0]
18|stream | 22/06/28 21:39:39 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition new_apploglog-0 to offset 16767946.
18|stream | 22/06/28 21:39:39 INFO RecurringTimer: Started timer for JobGenerator at time 1656452400000
18|stream | 22/06/28 21:39:39 INFO JobGenerator: Started JobGenerator at 1656452400000 ms
18|stream | 22/06/28 21:39:39 INFO JobScheduler: Started JobScheduler
18|stream | 22/06/28 21:39:39 INFO StreamingContext: StreamingContext started
18|stream | 22/06/28 21:39:40 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (xxx.xxx.xxx.xxx:46588) with ID 2
18|stream | 22/06/28 21:39:40 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (xxx.xxx.xxx.xxx:48860) with ID 3
18|stream | 22/06/28 21:39:40 INFO BlockManagerMasterEndpoint: Registering block manager xxx.xxx.xxx.xxx:35981 with 4.6 GB RAM, BlockManagerId(2, xxx.xxx.xxx.xxx, 35981, None)
18|stream | 22/06/28 21:39:40 INFO BlockManagerMasterEndpoint: Registering block manager xxx.xxx.xxx.xxx:40001 with 4.6 GB RAM, BlockManagerId(3, xxx.xxx.xxx.xxx, 40001, None)
18|stream | 22/06/28 21:39:40 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (xxx.xxx.xxx.xxx:39858) with ID 1
18|stream | 22/06/28 21:39:40 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (xxx.xxx.xxx.xxx:57696) with ID 0
18|stream | 22/06/28 21:39:40 INFO BlockManagerMasterEndpoint: Registering block manager xxx.xxx.xxx.xxx:44765 with 4.6 GB RAM, BlockManagerId(1, xxx.xxx.xxx.xxx, 44765, None)
18|stream | 22/06/28 21:39:40 INFO BlockManagerMasterEndpoint: Registering block manager xxx.xxx.xxx.xxx:46661 with 4.6 GB RAM, BlockManagerId(0, xxx.xxx.xxx.xxx, 46661, None)
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-15 to offset 285007408.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-14 to offset 285006512.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-13 to offset 285006673.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-12 to offset 285006392.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-11 to offset 285006399.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-10 to offset 285006961.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-9 to offset 285007334.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition new_apploglog-0 to offset 16838546.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-8 to offset 285007057.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-7 to offset 285005614.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-6 to offset 285007348.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-5 to offset 285004512.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-4 to offset 285005570.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-3 to offset 285008145.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-2 to offset 285007214.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-1 to offset 285007686.
18|stream | 22/06/28 21:40:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-0 to offset 316632614.
18|stream | 22/06/28 21:40:00 INFO JobScheduler: Added jobs for time 1656452400000 ms
18|stream | 22/06/28 21:40:00 INFO JobScheduler: Starting job streaming job 1656452400000 ms.0 from job set of time 1656452400000 ms
18|stream | 22/06/28 21:40:00 INFO SparkContext: Starting job: collect at Main.scala:76
18|stream | 22/06/28 21:40:00 INFO DAGScheduler: Registering RDD 1 (repartition at Main.scala:69)
18|stream | 22/06/28 21:40:00 INFO DAGScheduler: Got job 0 (collect at Main.scala:76) with 16 output partitions
18|stream | 22/06/28 21:40:00 INFO DAGScheduler: Final stage: ResultStage 1 (collect at Main.scala:76)
18|stream | 22/06/28 21:40:00 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 0)
18|stream | 22/06/28 21:40:00 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 0)
18|stream | 22/06/28 21:40:00 INFO DAGScheduler: Submitting ShuffleMapStage 0 (MapPartitionsRDD[1] at repartition at Main.scala:69), which has no missing parents
18|stream | 22/06/28 21:40:00 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 4.9 KB, free 5.2 GB)
18|stream | 22/06/28 21:40:00 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 3.1 KB, free 5.2 GB)
18|stream | 22/06/28 21:40:00 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on xxx.xxx.xxx.xxx:41399 (size: 3.1 KB, free: 5.2 GB)
18|stream | 22/06/28 21:40:00 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1161
18|stream | 22/06/28 21:40:00 INFO DAGScheduler: Submitting 17 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[1] at repartition at Main.scala:69) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14))
18|stream | 22/06/28 21:40:00 INFO TaskSchedulerImpl: Adding task set 0.0 with 17 tasks
18|stream | 22/06/28 21:40:00 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 0, xxx.xxx.xxx.xxx, executor 0, partition 1, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:00 INFO TaskSetManager: Starting task 8.0 in stage 0.0 (TID 1, xxx.xxx.xxx.xxx, executor 1, partition 8, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:00 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 2, xxx.xxx.xxx.xxx, executor 2, partition 3, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:00 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 3, xxx.xxx.xxx.xxx, executor 3, partition 0, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:01 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on xxx.xxx.xxx.xxx:44765 (size: 3.1 KB, free: 4.6 GB)
18|stream | 22/06/28 21:40:01 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on xxx.xxx.xxx.xxx:40001 (size: 3.1 KB, free: 4.6 GB)
18|stream | 22/06/28 21:40:01 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on xxx.xxx.xxx.xxx:35981 (size: 3.1 KB, free: 4.6 GB)
18|stream | 22/06/28 21:40:01 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on xxx.xxx.xxx.xxx:46661 (size: 3.1 KB, free: 4.6 GB)
18|stream | 22/06/28 21:40:02 INFO TaskSetManager: Starting task 10.0 in stage 0.0 (TID 4, xxx.xxx.xxx.xxx, executor 1, partition 10, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Starting task 4.0 in stage 0.0 (TID 5, xxx.xxx.xxx.xxx, executor 3, partition 4, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Finished task 8.0 in stage 0.0 (TID 1) in 2181 ms on xxx.xxx.xxx.xxx (executor 1) (1/17)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 3) in 2187 ms on xxx.xxx.xxx.xxx (executor 3) (2/17)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Starting task 7.0 in stage 0.0 (TID 6, xxx.xxx.xxx.xxx, executor 2, partition 7, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Finished task 3.0 in stage 0.0 (TID 2) in 2463 ms on xxx.xxx.xxx.xxx (executor 2) (3/17)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Starting task 5.0 in stage 0.0 (TID 7, xxx.xxx.xxx.xxx, executor 3, partition 5, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Finished task 4.0 in stage 0.0 (TID 5) in 343 ms on xxx.xxx.xxx.xxx (executor 3) (4/17)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Starting task 13.0 in stage 0.0 (TID 8, xxx.xxx.xxx.xxx, executor 1, partition 13, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Finished task 10.0 in stage 0.0 (TID 4) in 389 ms on xxx.xxx.xxx.xxx (executor 1) (5/17)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 9, xxx.xxx.xxx.xxx, executor 0, partition 2, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 0) in 2773 ms on xxx.xxx.xxx.xxx (executor 0) (6/17)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Starting task 11.0 in stage 0.0 (TID 10, xxx.xxx.xxx.xxx, executor 2, partition 11, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Finished task 7.0 in stage 0.0 (TID 6) in 403 ms on xxx.xxx.xxx.xxx (executor 2) (7/17)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Starting task 9.0 in stage 0.0 (TID 11, xxx.xxx.xxx.xxx, executor 3, partition 9, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Finished task 5.0 in stage 0.0 (TID 7) in 362 ms on xxx.xxx.xxx.xxx (executor 3) (8/17)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Starting task 15.0 in stage 0.0 (TID 12, xxx.xxx.xxx.xxx, executor 1, partition 15, PROCESS_LOCAL, 7746 bytes)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Finished task 13.0 in stage 0.0 (TID 8) in 369 ms on xxx.xxx.xxx.xxx (executor 1) (9/17)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Starting task 16.0 in stage 0.0 (TID 13, xxx.xxx.xxx.xxx, executor 1, partition 16, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Finished task 15.0 in stage 0.0 (TID 12) in 146 ms on xxx.xxx.xxx.xxx (executor 1) (10/17)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Finished task 9.0 in stage 0.0 (TID 11) in 247 ms on xxx.xxx.xxx.xxx (executor 3) (11/17)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Starting task 6.0 in stage 0.0 (TID 14, xxx.xxx.xxx.xxx, executor 0, partition 6, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:03 INFO TaskSetManager: Finished task 2.0 in stage 0.0 (TID 9) in 382 ms on xxx.xxx.xxx.xxx (executor 0) (12/17)
18|stream | 22/06/28 21:40:04 INFO TaskSetManager: Starting task 14.0 in stage 0.0 (TID 15, xxx.xxx.xxx.xxx, executor 2, partition 14, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:04 INFO TaskSetManager: Finished task 11.0 in stage 0.0 (TID 10) in 337 ms on xxx.xxx.xxx.xxx (executor 2) (13/17)
18|stream | 22/06/28 21:40:04 INFO TaskSetManager: Finished task 16.0 in stage 0.0 (TID 13) in 331 ms on xxx.xxx.xxx.xxx (executor 1) (14/17)
18|stream | 22/06/28 21:40:04 INFO TaskSetManager: Starting task 12.0 in stage 0.0 (TID 16, xxx.xxx.xxx.xxx, executor 0, partition 12, PROCESS_LOCAL, 7748 bytes)
18|stream | 22/06/28 21:40:04 INFO TaskSetManager: Finished task 6.0 in stage 0.0 (TID 14) in 303 ms on xxx.xxx.xxx.xxx (executor 0) (15/17)
18|stream | 22/06/28 21:40:04 INFO TaskSetManager: Finished task 14.0 in stage 0.0 (TID 15) in 271 ms on xxx.xxx.xxx.xxx (executor 2) (16/17)
18|stream | 22/06/28 21:40:04 INFO TaskSetManager: Finished task 12.0 in stage 0.0 (TID 16) in 291 ms on xxx.xxx.xxx.xxx (executor 0) (17/17)
18|stream | 22/06/28 21:40:04 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
18|stream | 22/06/28 21:40:04 INFO DAGScheduler: ShuffleMapStage 0 (repartition at Main.scala:69) finished in 4.222 s
18|stream | 22/06/28 21:40:04 INFO DAGScheduler: looking for newly runnable stages
18|stream | 22/06/28 21:40:04 INFO DAGScheduler: running: Set()
18|stream | 22/06/28 21:40:04 INFO DAGScheduler: waiting: Set(ResultStage 1)
18|stream | 22/06/28 21:40:04 INFO DAGScheduler: failed: Set()
18|stream | 22/06/28 21:40:04 INFO DAGScheduler: Submitting ResultStage 1 (MapPartitionsRDD[6] at map at Main.scala:73), which has no missing parents
18|stream | 22/06/28 21:40:04 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 4.8 KB, free 5.2 GB)
18|stream | 22/06/28 21:40:04 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.7 KB, free 5.2 GB)
18|stream | 22/06/28 21:40:04 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on xxx.xxx.xxx.xxx:41399 (size: 2.7 KB, free: 5.2 GB)
18|stream | 22/06/28 21:40:04 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1161
18|stream | 22/06/28 21:40:04 INFO DAGScheduler: Submitting 16 missing tasks from ResultStage 1 (MapPartitionsRDD[6] at map at Main.scala:73) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14))
18|stream | 22/06/28 21:40:04 INFO TaskSchedulerImpl: Adding task set 1.0 with 16 tasks
18|stream | 22/06/28 21:40:04 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 17, xxx.xxx.xxx.xxx, executor 2, partition 0, NODE_LOCAL, 7942 bytes)
18|stream | 22/06/28 21:40:04 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 18, xxx.xxx.xxx.xxx, executor 3, partition 1, NODE_LOCAL, 7942 bytes)
18|stream | 22/06/28 21:40:04 INFO TaskSetManager: Starting task 2.0 in stage 1.0 (TID 19, xxx.xxx.xxx.xxx, executor 1, partition 2, NODE_LOCAL, 7942 bytes)
18|stream | 22/06/28 21:40:04 INFO TaskSetManager: Starting task 3.0 in stage 1.0 (TID 20, xxx.xxx.xxx.xxx, executor 0, partition 3, NODE_LOCAL, 7942 bytes)
18|stream | 22/06/28 21:40:04 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on xxx.xxx.xxx.xxx (size: 2.7 KB, free: 4.6 GB)
18|stream | 22/06/28 21:40:04 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on xxx.xxx.xxx.xxx:44765 (size: 2.7 KB, free: 4.6 GB)
18|stream | 22/06/28 21:40:04 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on xxx.xxx.xxx.xxx:40001 (size: 2.7 KB, free: 4.6 GB)
18|stream | 22/06/28 21:40:04 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on xxx.xxx.xxx.xxx:46661 (size: 2.7 KB, free: 4.6 GB)
18|stream | 22/06/28 21:40:04 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to xxx.xxx.xxx.xxx:48860
18|stream | 22/06/28 21:40:04 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to xxx.xxx.xxx.xxx:39858
18|stream | 22/06/28 21:40:04 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to xxx.xxx.xxx.xxx:46588
18|stream | 22/06/28 21:40:04 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to xxx.xxx.xxx.xxx:57696
18|stream | 22/06/28 21:40:32 INFO BlockManagerInfo: Removed broadcast_0_piece0 on xxx.xxx.xxx.xxx:41399 in memory (size: 3.1 KB, free: 5.2 GB)
18|stream | 22/06/28 21:40:32 INFO BlockManagerInfo: Removed broadcast_0_piece0 on xxx.xxx.xxx.xxx:44765 in memory (size: 3.1 KB, free: 4.6 GB)
18|stream | 22/06/28 21:40:32 INFO BlockManagerInfo: Removed broadcast_0_piece0 on xxx.xxx.xxx.xxx:35981 in memory (size: 3.1 KB, free: 4.6 GB)
18|stream | 22/06/28 21:40:32 INFO BlockManagerInfo: Removed broadcast_0_piece0 on xxx.xxx.xxx.xxx:46661 in memory (size: 3.1 KB, free: 4.6 GB)
18|stream | 22/06/28 21:40:32 INFO BlockManagerInfo: Removed broadcast_0_piece0 on xxx.xxx.xxx.xxx:40001 in memory (size: 3.1 KB, free: 4.6 GB)
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-15 to offset 285007532.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-14 to offset 285006636.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-13 to offset 285006799.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-12 to offset 285006518.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-11 to offset 285006525.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-10 to offset 285007087.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-9 to offset 285007459.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition new_apploglog-0 to offset 16838553.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-8 to offset 285007182.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-7 to offset 285005739.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-6 to offset 285007471.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-5 to offset 285004635.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-4 to offset 285005693.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-3 to offset 285008268.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-2 to offset 285007337.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-1 to offset 285007810.
18|stream | 22/06/28 21:42:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-0 to offset 316632738.
18|stream | 22/06/28 21:42:00 INFO JobScheduler: Added jobs for time 1656452520000 ms
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-15 to offset 285007665.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-14 to offset 285006770.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-13 to offset 285006931.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-12 to offset 285006650.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-11 to offset 285006657.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-10 to offset 285007219.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-9 to offset 285007591.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition new_apploglog-0 to offset 16838556.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-8 to offset 285007314.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-7 to offset 285005871.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-6 to offset 285007603.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-5 to offset 285004767.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-4 to offset 285005825.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-3 to offset 285008400.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-2 to offset 285007469.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-1 to offset 285007942.
18|stream | 22/06/28 21:44:00 INFO Fetcher: [Consumer clientId=consumer-1, groupId=testid1] Resetting offset for partition applog-0 to offset 316632870.
18|stream | 22/06/28 21:44:00 INFO JobScheduler: Added jobs for time 1656452640000 ms
resetting Kafka offset repeats forever without even an error.
I did these actions to solve the problem but it did not any help:
- reset Kafka offset to earliest or latest
- delete consumer group and create new one
- I even changed the topic but nothing changed, so I guessed it was from spark cluster, but I can load Kafka data with Pyspark shell on the same cluster
notes:
the app was working OK about 3 years!
recently we had server migration and some of our resources has been decreased
other non-streaming jobs run on spark cluster without any issue
Is there anything that I'm missing?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
您能在升级之前确认自动偏移重置中设置的值是什么?另外,您可以在执行以下命令的该特定主题上检查针对消费者组的偏移。
另外,检查最近是否有任何KAFKA升级WRT代理配置更改。
由于任何毒药或消费者行为的变化,这种行为也有边缘案例场景。因为,当auto.offset.reset属性有效地启动时,有很多因素。
从DOC中的一个情况下,
有一个边缘情况可能导致数据丢失,因此在可重试的例子中未重新传递消息。这种情况适用于一个尚未记录任何当前偏移(或已删除偏移)的新消费者组。
Can you confirm what was the value set in AUTO OFFSET RESET before the upgrade? Also you can inspect the offset against the consumer group on that particular topic executing the following command.
Also, check for any KAFKA Upgrade w.r.t broker configuration changes recently.
There are edge case scenarios also for this behaviour due to any poison pills or changes in the consumer behaviour. Because, there are many factors when auto.offset.reset property will kicks in efficiently.
One such case from the doc,
There is an edge case that could result in data loss, whereby a message is not redelivered in a retryable exception scenario. This scenario applies to a new consumer group that is yet to have recorded any current offset (or the offset has been deleted).
它来自我的Redis服务器。迁移后,它变得不稳定,有时它无法正常工作。因此,我重新启动了服务,一切都很好。
It was from my redis server. After migration it became unstable and sometimes it was not working correctly. so I restarted the service and everything worked fine.