How do I collect data from multiple threads in Python?

Posted on 2024-11-01 03:07:13

I want to use multiple threads in Python to calculate pixel values for an image that is constructed at the end, but I'm having trouble figuring out how to get each thread's result back and collected. Here's the setup:

A Queue.Queue() object is created, as well as a threading.Thread subclass:

import Queue      # Python 2 name; the module is called queue in Python 3
import threading

q = Queue.Queue()
class myThread(threading.Thread):
  def __init__(self, queue):
    self.queue = queue
    threading.Thread.__init__(self)
  def run(self):
    while True: # loop forever
      task = self.queue.get()
      rs = self.do_work(task) # I've got the result; now what to do with it?
      self.queue.task_done()

The idea is to collect pixel data for a 500x500 image, initially as a list of 250,000 (500x500) elements, which will eventually be made into an image with PIL:

pixels = array.array('B', pixels).tostring()
im = Image.fromstring('L', size, pixels)
im.show()

So I fill the queue with tasks for each pixel, and spawn a pool of threads:

for i in range(5):
  t = myThread(q)
  t.setDaemon(True)
  t.start()
for y in range(500):
  for x in range(500):
    q.put({'x':x, 'y':y})
q.join()

So how do I get all the data collected? I think it would be a bad idea to pass the 250,000-element list to every thread, both because of the size of the array being passed and because each thread would then be missing the data from the other threads.

EDIT:
For those of you wondering whether it's worth doing this in a multi-threaded manner at all: the work done for each image coordinate consists of several Perlin noise functions. It generates a 2D Perlin noise array of points (a 5x5 grid), plus several octaves (10x10, 20x20, and 40x40 grids), and calculates the pixel values between those points. So for each pixel in the final image, it has to do three math operations per octave (average the X points around the given point, average the Y points around the given point, and average those averages), and then take a weighted average of the octave results.

On my 8-core Mac, the Python process uses 1 thread and 100% of the processor when running. I know I have 8 cores, and I have seen other processes report 400-600% processor use, which shows they are taking advantage of the other cores. I was hoping this Python script could do the same.

Comments (3)

飘然心甜 2024-11-08 03:07:13

CPython has a global interpreter lock (the GIL) that allows only one thread to execute Python bytecode at a time. This makes it difficult to speed up CPU-bound work like yours with threads.

But, despair not! The kind developers have given us the multiprocessing module. Replace threading with multiprocessing (to use multiprocessing.Process and multiprocessing.Queue instead) and voila, your application is a multiprocess application.

As for your question: you want a second queue going in the other direction, from the workers back to the main program, to carry the results.
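
A minimal sketch of this suggestion, assuming Python 3. The function `compute_pixel` is a hypothetical placeholder for the real noise calculation, and `worker` and `render` are illustrative names that do not appear in the question. Tasks go out on one `multiprocessing.Queue`, results come back on another, and a `None` sentinel tells each worker to stop:

```python
import multiprocessing


def compute_pixel(x, y):
    """Hypothetical stand-in for the real per-pixel noise calculation."""
    return (x * 7 + y * 13) % 256


def worker(tasks, results):
    """Take (x, y) tasks until a None sentinel arrives; emit (x, y, value)."""
    while True:
        task = tasks.get()
        if task is None:  # sentinel: no more work for this worker
            break
        x, y = task
        results.put((x, y, compute_pixel(x, y)))


def render(width, height, workers=4):
    """Return a flat, row-major list of pixel values computed in processes."""
    tasks = multiprocessing.Queue()
    results = multiprocessing.Queue()
    procs = [multiprocessing.Process(target=worker, args=(tasks, results))
             for _ in range(workers)]
    for p in procs:
        p.start()
    for y in range(height):
        for x in range(width):
            tasks.put((x, y))
    for _ in procs:
        tasks.put(None)  # one sentinel per worker

    pixels = [0] * (width * height)
    # Drain the result queue before joining, so no worker is left
    # blocked with unsent results when join() is called.
    for _ in range(width * height):
        x, y, value = results.get()
        pixels[y * width + x] = value
    for p in procs:
        p.join()
    return pixels


if __name__ == "__main__":
    # The guard matters: some start methods re-import this module in each child.
    print(render(8, 8)[:8])
```

Each result carries its own coordinates, so the order in which workers finish does not matter. Sending one message per pixel is probably slow for 250,000 pixels; handing out whole rows per task would likely reduce the queue overhead.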

﹎☆浅夏丿初晴 2024-11-08 03:07:13

I think you should use two queues.

One for the jobs/tasks, one for the outputs.

Once a task is completed, put the result on the output queue.
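
A rough sketch of the two-queue layout with threads, assuming Python 3 (where the module is named `queue`). Here `compute_pixel` is a hypothetical placeholder for the asker's `do_work`, and `worker` and `render` are illustrative names:

```python
import queue
import threading


def compute_pixel(x, y):
    """Hypothetical stand-in for the asker's do_work()."""
    return (x * 7 + y * 13) % 256


def worker(tasks, results):
    """Loop forever: take a task, put its result on the output queue."""
    while True:
        x, y = tasks.get()
        results.put((x, y, compute_pixel(x, y)))
        tasks.task_done()  # called after put(), so join() implies results exist


def render(width, height, workers=5):
    """Return a flat, row-major list of pixel values."""
    tasks = queue.Queue()    # jobs going to the threads
    results = queue.Queue()  # outputs coming back
    for _ in range(workers):
        threading.Thread(target=worker, args=(tasks, results),
                         daemon=True).start()
    for y in range(height):
        for x in range(width):
            tasks.put((x, y))
    tasks.join()  # returns once every task has been marked done

    pixels = [0] * (width * height)
    for _ in range(width * height):
        x, y, value = results.get()
        pixels[y * width + x] = value
    return pixels
```

As in the question, the daemon threads simply stay blocked on `tasks.get()` until the program exits. The resulting `pixels` list can then be handed to PIL the same way the question does.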

习惯成性 2024-11-08 03:07:13

I would use a global list that every thread can access. I had a similar situation and did it that way without problems.
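
One way this could look, assuming Python 3 and assuming each thread writes only to its own slots of a preallocated list, so that no two threads ever touch the same index. The names `compute_pixel`, `fill_rows`, and `render` and the 16x16 size are illustrative, not from the question:

```python
import threading

WIDTH, HEIGHT = 16, 16           # small illustrative size
pixels = [0] * (WIDTH * HEIGHT)  # global list shared by all threads


def compute_pixel(x, y):
    """Hypothetical stand-in for the real per-pixel calculation."""
    return (x * 7 + y * 13) % 256


def fill_rows(rows):
    """Compute every pixel in the given rows and store it in the shared list."""
    for y in rows:
        for x in range(WIDTH):
            pixels[y * WIDTH + x] = compute_pixel(x, y)


def render(workers=4):
    """Split the rows across threads, wait for all of them, return the list."""
    threads = [
        # Thread i handles rows i, i + workers, i + 2 * workers, ...
        threading.Thread(target=fill_rows, args=(range(i, HEIGHT, workers),))
        for i in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pixels
```

Nothing is copied between threads here, since they all share the same list object. This only addresses collecting the results; it does not by itself make the work use more than one core.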
