Python: multiprocessing quirks (or: how do I coordinate these threads?)

Posted on 2025-01-02 13:21:58

I have been challenged.
I am unsure how to use multiprocessing without Jython or Cython (or some other IronPython whatsahoosie), and have opted to use threads for my multicore CentOS program.
It reads a set of text files and outputs to a dictionary (defined by hfreq={} on the outside of the defined functions). If I have it sleep, it runs (terribly slowly, seemingly on one core) and works fine.

Additionally, I do not know how to have it wait until both threads are done before actually writing the output file (other than the time.sleep call, which completely defeats the purpose of speed).
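
For reference, the usual way to wait for threads without sleeping is `Thread.join()`, which blocks the caller until that thread has finished. The following is only a minimal sketch under stated assumptions: `count_words`, `words0`, `words1`, and `hfreq_lock` are hypothetical names, and the in-memory word lists stand in for the real text files, which are not shown in the question.

```python
from threading import Thread, Lock

hfreq = {}
hfreq_lock = Lock()  # guards hfreq, since both threads update it

def count_words(words):
    # Hypothetical stand-in for the real counting code.
    for word in words:
        with hfreq_lock:
            hfreq[word] = hfreq.get(word, 0) + 1

words0 = ["a", "b", "a"]  # illustrative data in place of filenames0
words1 = ["b", "c"]       # illustrative data in place of filenames1

t1 = Thread(target=count_words, args=(words0,))
t2 = Thread(target=count_words, args=(words1,))
t1.start()
t2.start()
t1.join()  # blocks until t1 has finished
t2.join()  # blocks until t2 has finished

# Both threads are done here, so hfreq is complete.
items = sorted(hfreq.items())
```

With `join()` in place, the output step runs as soon as both threads are done, however long they take, so no fixed delay is needed.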

EXAMPLE:

import time
from threading import Thread

hfreq = {}
# [INSERT TEXT FILE ARRAYS HERE, RESPECTIVELY filenames0[] and filenames1[]]

def count():
    pass  # some code here that writes frequency to hfreq

def count1():
    pass  # some code here that writes frequency to hfreq as well, but using filenames1

t1 = Thread(target=count, args=())
t2 = Thread(target=count1, args=())
t1.start()
t2.start()
time.sleep(15)  # No other known way to prevent the following from running immediately
items = sorted(hfreq.items())  # avoid shadowing the built-in name "list"
output = open('Freq.txt', 'w')
# [for statement that writes to file]
output.close()

And that is where it ends. If I run the program with no threading classes (on its own), it gives about 10-14 seconds of runtime. If I try the threading approach (halving the non-threading array between the two threads), I get BOTH THREADS running for 14 seconds (instead of the expected multi-core usage).
Thank you for reading this wall of text. Please tell me if I can clarify.
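
A note on the single-core behaviour described above: in CPython, the global interpreter lock lets only one thread execute Python bytecode at a time, which is the usual reason CPU-bound threads do not spread across cores. The standard library's `multiprocessing` module runs workers in separate processes and needs neither Jython nor Cython. The following is a sketch under assumptions, not the program from the question: `count_chunk`, `merge_counts`, `parallel_count`, and the sample data are all hypothetical, and each worker returns its own counts instead of writing to a shared `hfreq`, because separate processes do not share a module-level dictionary.

```python
from collections import Counter
from multiprocessing import Pool

def count_chunk(lines):
    """Count word frequencies in one chunk of lines (runs in a worker process)."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts

def merge_counts(partials):
    """Combine the per-worker Counters into one plain dict."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)

def parallel_count(chunks, workers=2):
    """Count each chunk in its own process, then merge the results."""
    with Pool(processes=workers) as pool:
        # map() returns only after every chunk has been processed,
        # so no sleep or explicit join is needed before merging.
        partials = pool.map(count_chunk, chunks)
    return merge_counts(partials)

if __name__ == "__main__":
    # Illustrative data standing in for the contents of filenames0 and filenames1.
    chunks = [["a b a"], ["b c"]]
    hfreq = parallel_count(chunks)
    for word, freq in sorted(hfreq.items()):
        print(word, freq)
```

Whether this is actually faster depends on the workload: if most of the 10-14 seconds is spent reading files from disk instead of counting, extra processes may help little.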
