Can we process a whole chunk together in a Spring Batch processor?



I have a scenario where there are millions of records in an employee staging table, and I need to enrich the values and store them in an employee final table. I am currently using chunk processing with a chunk size of 10,000.

In the processor, where the enriching has to happen, I want to collect all the employee IDs for the chunk and make one API call per chunk to enrich the values, instead of a million calls for a million records.

I am observing that my reader, which extends RepositoryItemReader (I am using JPA, hence RepositoryItemReader), does not return a List, so processing happens once per item even with a chunk size of 10,000.

Can we get the whole List from the reader and process it at once?
Or is there another approach? I can't realistically make one API call per record.

import java.util.HashMap;
import java.util.Map;

import org.springframework.batch.item.data.RepositoryItemReader;
import org.springframework.data.domain.Sort;

public class EmployeeStagingReader extends RepositoryItemReader<EmployeeStaging> {
    public EmployeeStagingReader(EmployeeStagingRepository repo) {
        super();
        this.setRepository(repo);
        // Page through the staging table via the repository's findAll method.
        this.setMethodName("findAll");
        final Map<String, Sort.Direction> sorts = new HashMap<>();
        sorts.put("ID", Sort.Direction.ASC);
        this.setSort(sorts);
    }
}


import java.util.List;
import org.springframework.batch.item.ItemProcessor;

public class EmployeeProcessor implements ItemProcessor<List<EmployeeStaging>, List<EmployeeFinal>> {
    @Override
    public List<EmployeeFinal> process(List<EmployeeStaging> employees) throws Exception {
        // Want to transform the list of staging employee records into a
        // list of EmployeeFinal records here.
        return null;
    }
}


@Bean
public Step step1() {
    return this.stepBuilderFactory.get("step1")
                .<List<EmployeeStaging>, List<EmployeeFinal>>chunk(1000)
                .reader(employeeStagingReader())
                .processor(employeeProcessor())
                .writer(employeeFinalWriter())
                .build();
}
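
For context beyond the original post: with the Spring Batch 4 style stepBuilderFactory configuration shown above, the writer, unlike the processor, already receives the whole chunk as a List. So one alternative is to do the per-chunk enrichment inside a custom ItemWriter. A minimal sketch, assuming a hypothetical EnrichmentClient with an enrichByIds batch method and a hypothetical EmployeeFinalRepository JPA repository (none of these names come from the question):

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

import org.springframework.batch.item.ItemWriter;

public class EnrichingEmployeeWriter implements ItemWriter<EmployeeStaging> {

    private final EnrichmentClient client;           // hypothetical API client
    private final EmployeeFinalRepository finalRepo; // hypothetical JPA repository

    public EnrichingEmployeeWriter(EnrichmentClient client, EmployeeFinalRepository finalRepo) {
        this.client = client;
        this.finalRepo = finalRepo;
    }

    @Override
    public void write(List<? extends EmployeeStaging> chunk) {
        // Collect the IDs of the whole chunk and enrich them with one API call.
        List<Long> ids = chunk.stream()
                .map(EmployeeStaging::getId)
                .collect(Collectors.toList());
        Map<Long, String> enrichedValues = client.enrichByIds(ids); // assumed signature

        // Map each staging record plus its enriched value to the final entity.
        List<EmployeeFinal> finals = chunk.stream()
                .map(e -> toFinal(e, enrichedValues.get(e.getId())))
                .collect(Collectors.toList());
        finalRepo.saveAll(finals);
    }

    private EmployeeFinal toFinal(EmployeeStaging staging, String enrichedValue) {
        // The transformation logic from the question's processor would go here.
        return new EmployeeFinal(/* ... */);
    }
}

With this, the step could be declared as <EmployeeStaging, EmployeeStaging>chunk(10000) with no processor at all: the reader still emits one item at a time, but the writer sees the full chunk.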

Comments (1)

不即不离 2025-01-16 02:20:42


Try this approach:

  1. Create a Tasklet step before the chunk-based persistence step.
  2. In this Tasklet step, group the employee IDs, make the API calls in batches, and update the enriched values back into the staging table itself. That way you reduce the number of API calls to one per batch.
  3. In the last step, just read the data and write it using the chunk-based approach. A processor is no longer required; a minimal sketch follows.
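
A minimal sketch of that flow, assuming a hypothetical EnrichmentClient (the same assumed enrichByIds batch call as above), a JdbcTemplate, and guessed table and column names; none of these come from the original answer:

import java.util.List;
import java.util.Map;

import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.jdbc.core.JdbcTemplate;

public class EnrichStagingTasklet implements Tasklet {

    private static final int BATCH_SIZE = 10_000;

    private final JdbcTemplate jdbcTemplate;
    private final EnrichmentClient client; // hypothetical API client

    public EnrichStagingTasklet(JdbcTemplate jdbcTemplate, EnrichmentClient client) {
        this.jdbcTemplate = jdbcTemplate;
        this.client = client;
    }

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) {
        long lastId = 0;
        List<Long> ids;
        do {
            // Page through the staging IDs in keyset order (table/column names assumed).
            ids = jdbcTemplate.queryForList(
                    "SELECT id FROM employee_staging WHERE id > ? ORDER BY id LIMIT ?",
                    Long.class, lastId, BATCH_SIZE);
            if (ids.isEmpty()) {
                break;
            }
            // One API call per batch of IDs instead of one per record.
            Map<Long, String> enriched = client.enrichByIds(ids);
            enriched.forEach((id, value) -> jdbcTemplate.update(
                    "UPDATE employee_staging SET enriched_value = ? WHERE id = ?",
                    value, id));
            lastId = ids.get(ids.size() - 1);
        } while (ids.size() == BATCH_SIZE);
        return RepeatStatus.FINISHED;
    }
}

The job then becomes this Tasklet step followed by a plain read-and-write chunk step over the already-enriched staging rows, with no processor in between.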