Python: Multiprocessing Quirks (Or: How Do I Coordinate These Threads?)

Posted on 2025-01-02 13:21:58

I have been challenged.
I am unsure how to use multiprocessing without Jython or Cython (or some other IronPython whatsahoosie), and have opted to use threads for my multicore CentOS program.
It reads a set of text files and writes frequencies to a dictionary (hfreq={}, defined outside the functions). If I have it sleep, it runs (terribly slowly, seemingly on one core) and works fine.

Additionally, I do not know how to make it wait until both threads are done before actually writing to the file (other than the time.sleep call, which completely defeats the purpose of speed).

EXAMPLE:

from threading import Thread
import time

hfreq = {}
# [INSERT TEXT FILE ARRAYS HERE, RESPECTIVELY filenames0[] and filenames1[]]
def count():
    ...  # some code here that writes frequency to hfreq
def count1():
    ...  # some code here that writes frequency to hfreq as well, but using filenames1
t1 = Thread(target=count, args=())
t2 = Thread(target=count1, args=())
t1.start()
t2.start()
time.sleep(15)  # No other known way to prevent the following from running immediately
items = sorted(hfreq.items())
output = open('Freq.txt', 'w')
# [for statement that writes to file]
output.close()

And that is where it ends. If I run the program without threading (on its own), it takes about 10-14 seconds. If I try the threading approach (splitting the original array in half between the two threads), BOTH THREADS run for 14 seconds (instead of the expected multi-core usage).
Thank you for reading this wall of text. Please tell me if I can clarify.

Comments (1)

知你几分 2025-01-09 13:21:58

If you want to take advantage of multiple cores with CPython, you should use the multiprocessing module: it has many caveats, but this is the sort of problem it fits relatively well.
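
As a rough illustration of that suggestion, here is a minimal sketch assuming Python 3 and assuming the elided counting code amounts to counting whitespace-separated words per file. The names count_file, merge_counts and count_files are hypothetical and not from the question. Worker processes do not share memory, so each one returns its own counts and the parent merges them, rather than every worker writing into a global hfreq.

```python
import os
import tempfile
from collections import Counter
from multiprocessing import Pool


def count_file(path):
    """Return word frequencies for one text file (runs in a worker process)."""
    with open(path, encoding="utf-8") as handle:
        return Counter(handle.read().split())


def merge_counts(partial_counts):
    """Combine the per-file counters into a single frequency table."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total


def count_files(paths, workers=2):
    """Count each file in a separate process, then merge in the parent."""
    with Pool(processes=workers) as pool:
        # map() blocks until every worker has returned, so no sleep is needed.
        partial_counts = pool.map(count_file, paths)
    return merge_counts(partial_counts)


if __name__ == "__main__":
    # Demo on two throwaway files; real code would pass its own file names.
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for name, text in [("a.txt", "spam eggs spam"), ("b.txt", "eggs ham")]:
            path = os.path.join(tmp, name)
            with open(path, "w", encoding="utf-8") as handle:
                handle.write(text)
            paths.append(path)

        hfreq = count_files(paths)
        with open(os.path.join(tmp, "Freq.txt"), "w", encoding="utf-8") as output:
            for word, n in sorted(hfreq.items()):
                output.write(f"{word} {n}\n")
        print(sorted(hfreq.items()))  # [('eggs', 2), ('ham', 1), ('spam', 2)]
```

The `if __name__ == "__main__":` guard matters here: on platforms that start workers by re-importing the main module, unguarded top-level code would run again in every worker.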

To wait until a thread is done, use t.join().
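
To make the join() advice concrete, here is a minimal sketch assuming Python 3, with in-memory word lists standing in for the question's file-reading code. The helper count_in_threads and the lock are illustrative additions, not part of the original program. Note that join() only fixes the waiting: in a standard CPython build the global interpreter lock still keeps CPU-bound threads from running Python code in parallel, which matches the 14-second timings reported in the question.

```python
from threading import Lock, Thread


def count(words, hfreq, lock):
    """Add the frequency of each word to the shared dict."""
    for word in words:
        with lock:  # the read-modify-write below is not atomic across threads
            hfreq[word] = hfreq.get(word, 0) + 1


def count_in_threads(chunks):
    """Start one thread per chunk and return only after all have finished."""
    hfreq = {}
    lock = Lock()
    threads = [Thread(target=count, args=(chunk, hfreq, lock)) for chunk in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # blocks until this thread is done; replaces time.sleep(15)
    return hfreq


if __name__ == "__main__":
    result = count_in_threads([["spam", "eggs"], ["spam", "ham"]])
    print(sorted(result.items()))  # [('eggs', 1), ('ham', 1), ('spam', 2)]
```

Because every join() has returned before hfreq is read, the sorting and file-writing code can follow immediately without guessing at a sleep duration.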
