Best way to architect an Azure worker role that processes data from about 10 queues

Posted on 2024-09-30 13:17:24

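I have one worker role that puts data into about 10 queues that need to be processed. The volume is high: roughly 10-100 messages per second are enqueued across the various queues.

The queues hold different kinds of data and are processed separately. One queue in particular is very active.

The way I have it set up now, I have a separate worker role that spawns 10 different threads, each executing a method that runs a while(true){get message from queue and process it} loop. Whenever the data in a queue gets backed up, we simply launch more of these processes to speed up processing of that queue. Also, since one queue is more active, I launch several threads pointing at the same method to process data from that queue.

However, I am seeing high CPU utilization on the deployment: it is at or near 100% almost constantly.

I am wondering whether this is because of thread starvation. Or is it because access to the queues is RESTful, so the threads end up blocking each other while making connections and slow things down? Or is it because I use: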

while(true)
{
   var message = get message from queue;
   if(message != null)
   {
       //process message
   }
}

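and that loop gets executed too fast?

Each time a message is processed, it is also saved to Azure Table Storage or the DB, so it might be the saving of this data that is eating up the CPU.

In effect, it has been really hard to debug the high CPU load. So my question is: are there general architecture changes I can make that would help alleviate and prevent any such issues? (For example, instead of using while(true), using a different type of polling, although I'd imagine it comes to the same thing in the end for that example.)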
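
To make the "different type of polling" idea concrete, here is a minimal sketch in Python (chosen only for brevity, and not the code running in the deployment). It assumes a hypothetical get_message callable that returns None when the queue is empty, as in the pseudocode above. The names poll_with_backoff, process, should_stop and the delay values are all illustrative assumptions. The only change from the tight loop is that an empty poll sleeps, and the sleep grows while the queue stays empty.

```python
import time


def poll_with_backoff(get_message, process, should_stop,
                      min_delay=0.1, max_delay=5.0, sleep=time.sleep):
    """Poll a queue, backing off exponentially while it stays empty.

    get_message, process and should_stop are hypothetical callables
    supplied by the caller; get_message must return None on an empty queue.
    """
    delay = min_delay
    while not should_stop():
        message = get_message()
        if message is not None:
            process(message)
            # Work arrived, so go back to polling eagerly.
            delay = min_delay
        else:
            # Queue was empty: give up the CPU instead of spinning,
            # and wait longer next time, up to max_delay.
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

Whether this actually lowers the CPU load depends on how often the queues are empty. If a queue always has messages, the loop never sleeps and behaves like the original one.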

Maybe simply spawning new threads with new Thread() is not the best way to go.
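
One possible alternative to creating threads by hand is a single bounded thread pool shared by all queues, with more workers assigned to the busy queue. The sketch below is in Python and rests on several assumptions: an in-memory queue.Queue stands in for each Azure queue (the real queues are polled over REST and have no blocking get), and drain, start_workers and workers_per_queue are hypothetical names introduced for illustration only.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


def drain(q: queue.Queue, handler, stop: threading.Event,
          idle_wait: float = 0.05) -> int:
    """Worker loop: wait on the queue with a timeout instead of spinning."""
    handled = 0
    while not stop.is_set():
        try:
            # Blocks for up to idle_wait seconds, so an empty queue
            # costs almost no CPU.
            message = q.get(timeout=idle_wait)
        except queue.Empty:
            continue
        handler(message)
        handled += 1
    return handled


def start_workers(pool: ThreadPoolExecutor, queues, handler,
                  stop: threading.Event, workers_per_queue):
    """Submit one drain loop per configured worker to a shared pool.

    Queues missing from workers_per_queue get a single worker.
    """
    futures = []
    for name, q in queues.items():
        for _ in range(workers_per_queue.get(name, 1)):
            futures.append(pool.submit(drain, q, handler, stop))
    return futures
```

With a pool created as ThreadPoolExecutor(max_workers=N), the total number of threads is capped at N no matter how many queues back up, and setting the stop event lets every loop exit cleanly. The pool size should be at least the number of drain loops submitted, otherwise the extra loops wait until a running one finishes.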

