Handling threads for batch jobs in a servlet environment

Posted on 2024-12-05 02:40:08

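I have a Spring MVC and Hibernate web application backed by a Postgres 9 database. An admin user can send a request to process nearly 200,000 records (each record is collected from various tables via joins). Such an operation is requested weekly or monthly (or whenever the data reaches a limit of around 100,000 to 200,000 records). On the database side, I have implemented batching correctly.

Problem: Such a long-running request holds a server thread, and normal users suffer as a result.
Requirement: A long response time for this request is not an issue. What we need is that other users do not suffer because of this time-consuming process.
My solution:
Implement a thread pool using Spring's TaskExecutor abstraction. I can initialize the pool with 5 or 6 threads and split the 200,000 records into smaller chunks, say of 1,000 records each, and then queue those chunks. To give normal users faster access to the database, I could also make each runnable sleep for 2 or 3 seconds. The advantage I see in this approach is that, instead of executing one huge database-interacting request in a single shot, the work is spread asynchronously over a longer time, so it behaves like several normal user requests.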
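To make the chunking idea concrete, here is a minimal sketch. It uses a plain `java.util.concurrent` fixed thread pool rather than a Spring `TaskExecutor` so that it stays self-contained; the class name `ChunkedBatchSketch`, the `partition` and `run` helpers, and the placeholder "work" are all hypothetical and stand in for the real batch processing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ChunkedBatchSketch {

    // Splits the items into consecutive chunks of at most chunkSize elements.
    static <T> List<List<T>> partition(List<T> items, int chunkSize) {
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < items.size(); i += chunkSize) {
            int end = Math.min(i + chunkSize, items.size());
            chunks.add(new ArrayList<>(items.subList(i, end)));
        }
        return chunks;
    }

    // Queues one task per chunk on a small pool; each task pauses after its work.
    // Returns the number of records processed.
    static int run(List<Integer> ids, int chunkSize, int threads, long pauseMillis)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger processed = new AtomicInteger();
        for (List<Integer> chunk : partition(ids, chunkSize)) {
            pool.execute(() -> {
                // Placeholder for the real batch work on this chunk
                processed.addAndGet(chunk.size());
                try {
                    Thread.sleep(pauseMillis); // leave room for normal user requests
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
        return processed.get();
    }
}
```

With the numbers from the question, this would be called roughly as `ChunkedBatchSketch.run(ids, 1000, 5, 2000)`, which queues 200 chunks on 5 threads with a 2 second pause after each chunk.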

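Could experienced people please give their opinion on this? I have also read about implementing the same behavior with message-oriented middleware such as JMS/AMQP, or with Quartz scheduling. But frankly, I think internally they do the same thing, i.e. create a thread pool and queue the jobs. So why not use Spring's task executors instead of adding a whole new infrastructure to my web application just for this feature?

Please share your views on this, and let me know if there is a better way to do it. Once again: the time needed to process all the records is not a concern. What is required is that normal users accessing the web application during that time are not affected in any way.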


爱人如己 2024-12-12 02:40:08

You can parallelize the tasks and wait for all of them to finish before returning from the call. For this, you want to use ExecutorCompletionService, which has been part of the Java standard library (java.util.concurrent) since Java 5.

In short, you obtain an executor (for example through your container's service locator) and wrap it in an ExecutorCompletionService:

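// executor is an existing java.util.concurrent.Executor, e.g. a thread pool
ExecutorCompletionService<List<MyResult>> queue = new ExecutorCompletionService<List<MyResult>>(executor);

// do this in a loop, counting the submitted tasks
queue.submit(aCallable);
submitted++;

// after looping, call take() once per submitted task
for (int i = 0; i < submitted; i++) {
    List<MyResult> result = queue.take().get(); // take() blocks until the next task completes
}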
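For reference, a self-contained sketch of the same pattern might look like the following. The class `CompletionServiceSketch`, the `processChunk` placeholder, and the use of `Executors.newFixedThreadPool` are assumptions for illustration; in a container you would normally reuse a managed executor instead of creating one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class CompletionServiceSketch {

    // Hypothetical stand-in for the real per-chunk work (e.g. a batched DB update).
    static List<String> processChunk(List<Integer> chunk) {
        List<String> out = new ArrayList<>();
        for (Integer id : chunk) {
            out.add("processed-" + id);
        }
        return out;
    }

    // Submits one task per chunk and waits until every task has completed.
    static List<String> processAll(List<List<Integer>> chunks, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            ExecutorCompletionService<List<String>> queue =
                    new ExecutorCompletionService<>(executor);
            for (List<Integer> chunk : chunks) {
                queue.submit(() -> processChunk(chunk));
            }
            List<String> results = new ArrayList<>();
            // take() returns the next finished task, so call it once per submitted task
            for (int i = 0; i < chunks.size(); i++) {
                results.addAll(queue.take().get());
            }
            return results;
        } finally {
            executor.shutdown();
        }
    }
}
```

Results arrive in completion order, not submission order, so the combined list is not guaranteed to follow the order of the input chunks.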

If you do not want to wait, you can process the jobs in the background without blocking the current thread, but then you will need some mechanism to inform the client when the job has finished. That can be done through JMS or, if you have an Ajax client, the client can poll for updates.
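The polling variant could be sketched roughly as below: the request handler starts the job, returns an id immediately, and a second endpoint reports the status for that id. `BackgroundJobTracker`, its `Status` enum, and the single-thread executor are hypothetical names and choices made for this sketch, not part of any framework.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class BackgroundJobTracker {
    enum Status { RUNNING, DONE, FAILED }

    // One worker thread, so at most one heavy job runs at a time
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private final Map<UUID, Status> statuses = new ConcurrentHashMap<>();

    // Starts the job in the background and returns immediately with an id to poll.
    UUID start(Runnable job) {
        UUID id = UUID.randomUUID();
        statuses.put(id, Status.RUNNING);
        executor.execute(() -> {
            try {
                job.run();
                statuses.put(id, Status.DONE);
            } catch (RuntimeException e) {
                statuses.put(id, Status.FAILED);
            }
        });
        return id;
    }

    // Called by the polling endpoint; returns null for an unknown id.
    Status statusOf(UUID id) {
        return statuses.get(id);
    }

    void shutdown() {
        executor.shutdown();
    }
}
```

The Ajax client would keep the returned id and call the status endpoint every few seconds until it sees `DONE` or `FAILED`.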

Quartz also has a job scheduling mechanism, but the Java standard library already provides a way to do this.

EDIT:
I might have misunderstood the question. If you do not want a faster response but rather want to throttle the CPU, use this approach:

You can make an inner class like the PollingThread below, where the batches (each holding the java.util.UUID of every job) and the number of PollingThreads are defined in the outer class. It keeps running until it is interrupted, and the sleep interval can be tuned to keep your CPUs free to handle other requests.

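    // Inner class: incomingList (a LinkedHashSet<UUID>), seconds and
    // processJobs(...) are defined in the outer class.
    class PollingThread implements Runnable {
        @SuppressWarnings("unchecked")
        public void run() {
            Thread.currentThread().setName("MyPollingThread");
            while (!Thread.currentThread().isInterrupted()) {
                LinkedHashSet<UUID> list = null;
                // Take a snapshot of the pending jobs and clear the shared set
                synchronized (incomingList) {
                    if (!incomingList.isEmpty()) {
                        list = (LinkedHashSet<UUID>) incomingList.clone();
                        incomingList.clear();
                    }
                }
                try {
                    if (list != null && !list.isEmpty()) {
                        processJobs(list);
                    }
                } catch (Exception e) {
                    // Log and continue so one failed batch does not kill the poller
                    e.printStackTrace();
                }
                // Sleep between batches to leave capacity for other requests
                try {
                    Thread.sleep(seconds * 1000L);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag so the loop condition ends the thread
                    Thread.currentThread().interrupt();
                }
            }
        }
    }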
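As an alternative to a hand-written polling loop, the same throttling could probably be expressed with a `ScheduledExecutorService` and a `BlockingQueue`. This is a different technique from the PollingThread above, shown only as a sketch; `ScheduledPoller`, `enqueue`, and the `processor` callback are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

class ScheduledPoller {
    private final BlockingQueue<UUID> incoming = new LinkedBlockingQueue<>();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    // Every delayMillis, drains up to batchSize queued ids and hands them to the processor.
    void start(Consumer<List<UUID>> processor, int batchSize, long delayMillis) {
        scheduler.scheduleWithFixedDelay(() -> {
            List<UUID> batch = new ArrayList<>();
            incoming.drainTo(batch, batchSize);
            if (!batch.isEmpty()) {
                try {
                    processor.accept(batch);
                } catch (RuntimeException e) {
                    // An uncaught exception would cancel all future runs, so log and continue
                    e.printStackTrace();
                }
            }
        }, 0, delayMillis, TimeUnit.MILLISECONDS);
    }

    // Adds a job id to the queue of pending work.
    void enqueue(UUID jobId) {
        incoming.add(jobId);
    }

    // Stops the scheduler and interrupts a batch that is currently running.
    void stop() {
        scheduler.shutdownNow();
    }
}
```

Here the batch size and the delay between runs are the two knobs that control how much load the background work puts on the CPU and the database.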
