以多线程方式处理项目列表的最简单方法
我有一段代码,可以打开数据读取器并为每条记录(包含 url)下载和下载数据。 处理该页面。
使其成为多线程的最简单方法是什么,例如,有 10 个插槽可用于同时下载和处理页面,并且当插槽可用时,正在读取下一行等。
我无法使用 WebClient .DownloadDataAsync
这是我尝试做的事情,但它没有起作用(即“worker”从未运行):
using (IDataReader dr = q.ExecuteReader())
{
ThreadPool.SetMaxThreads(10, 10);
int workerThreads = 0;
int completionPortThreads = 0;
while (dr.Read())
{
do
{
ThreadPool.GetAvailableThreads(out workerThreads, out completionPortThreads);
if (workerThreads == 0)
{
Thread.Sleep(100);
}
} while (workerThreads == 0);
Database.Log l = new Database.Log();
l.Load(dr);
ThreadPool.QueueUserWorkItem(delegate(object threadContext)
{
Database.Log log = threadContext as Database.Log;
Scraper scraper = new Scraper();
dc.Product p = scraper.GetProduct(log, log.Url, true);
ManualResetEvent done = new ManualResetEvent(false);
done.Set();
}, l);
}
}
I've got a piece of code that opens a data reader and for each record (which contains a url) downloads & processes that page.
What's the simplest way to make it multi-threaded so that, let's say, there are 10 slots which can be used to download and process pages in simultaneousy, and as slots become available next rows are being read etc.
I can't use WebClient.DownloadDataAsync
Here's what i have tried to do, but it hasn't worked (i.e. the "worker" is never ran):
using (IDataReader dr = q.ExecuteReader())
{
ThreadPool.SetMaxThreads(10, 10);
int workerThreads = 0;
int completionPortThreads = 0;
while (dr.Read())
{
do
{
ThreadPool.GetAvailableThreads(out workerThreads, out completionPortThreads);
if (workerThreads == 0)
{
Thread.Sleep(100);
}
} while (workerThreads == 0);
Database.Log l = new Database.Log();
l.Load(dr);
ThreadPool.QueueUserWorkItem(delegate(object threadContext)
{
Database.Log log = threadContext as Database.Log;
Scraper scraper = new Scraper();
dc.Product p = scraper.GetProduct(log, log.Url, true);
ManualResetEvent done = new ManualResetEvent(false);
done.Set();
}, l);
}
}
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
您通常不需要使用最大线程(我相信它默认为每个进程 25 个工作线程,1000 个 IO)。 您可以考虑设置“最小线程数”,以确保始终有足够的可用线程数。
您也不需要调用 GetAvailableThreads。 您可以开始调用 QueueUserWorkItem 并让它完成所有工作。 您可以通过简单地调用 QueueUserWorkItem 来重现您的问题吗?
您还可以查看 并行任务库,它具有辅助方法使此类事情更易于管理和更容易。
You do not normally need to play with the Max threads (I believe it defaults to something like 25 per proc for worker, 1000 for IO). You might consider setting the Min threads to ensure you have a nice number always available.
You don't need to call GetAvailableThreads either. You can just start calling QueueUserWorkItem and let it do all the work. Can you repro your problem by simply calling QueueUserWorkItem?
You could also look into the Parallel Task Library, which has helper methods to make this kind of stuff more manageable and easier.