How to track the memory of a Python script

Posted 2024-12-04 00:32:25


We have a system with only one Python interpreter. Many user scripts run through this interpreter. We want to put a cap on each script's memory usage. There is only one process, and that process invokes a tasklet for each script. Since we only have one interpreter and one process, we don't know of a way to cap each script's memory usage. What is the best way to do this?


Comments (2)

氛圍 2024-12-11 00:32:25


I don't think that it's possible at all. Your question implies that the memory used by your tasklets is completely separated, which is probably not the case. Python optimizes small objects like integers. As far as I know, for example, each 3 in your code uses the same object, which is not a problem because it is immutable. So if two of your tasklets use the same (small?) integer, they are already sharing memory. ;-)
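A minimal sketch illustrating the point about shared small integers. This is CPython-specific behavior (small ints in roughly the -5 to 256 range are cached), not a language guarantee:

```python
# Illustration only: CPython interns small integers, so two "separate"
# tasklets that refer to the value 3 actually share one object.
a = 3
b = 3
print(a is b)          # True in CPython (small-int cache)

big_a = 10 ** 6
big_b = 10 ** 6
print(big_a is big_b)  # May be False: larger ints are not guaranteed to be cached
```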

染墨丶若流云 2024-12-11 00:32:25


Memory is separated at the OS process level. There's no easy way to tell which tasklet, or even which thread, a particular object belongs to.

Also, there's no easy way to add a custom bookkeeping allocator that would analyze which tasklet or thread is allocating a piece of memory and prevent it from allocating too much. It would also need to plug into the garbage-collection code to discount objects that are freed.

Unless you're keen to write a custom Python interpreter, using a process per task is your best bet.
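A minimal sketch of the process-per-task idea, assuming a POSIX system where `resource.setrlimit` is available; the `user_script.py` path and the 256 MB cap are placeholders:

```python
import multiprocessing
import resource

MEM_LIMIT_BYTES = 256 * 1024 * 1024  # example per-script cap (assumption)

def run_user_script(script_path):
    # Cap this child process's address space; allocations beyond the limit
    # fail, typically surfacing as MemoryError inside the script.
    resource.setrlimit(resource.RLIMIT_AS, (MEM_LIMIT_BYTES, MEM_LIMIT_BYTES))
    with open(script_path) as f:
        code = f.read()
    exec(code, {"__name__": "__main__"})

if __name__ == "__main__":
    p = multiprocessing.Process(target=run_user_script, args=("user_script.py",))
    p.start()
    p.join()
    print("script exited with code", p.exitcode)
```

Because the limit is set inside the child, only that script's process is constrained; the parent and other scripts are unaffected.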

You don't even need to kill and respawn the interpreters every time you need to run another script. Pool several interpreters and only kill the ones that grow past a certain memory threshold after running a script. Limit the interpreters' memory consumption by means provided by the OS if you need to.
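A Linux-specific sketch of the recycling check, assuming `worker` is a `multiprocessing.Process`; the helper names and the 512 MB threshold are illustrative:

```python
import os

RSS_THRESHOLD_BYTES = 512 * 1024 * 1024  # example recycling threshold (assumption)

def rss_bytes(pid):
    """Read a child's resident set size from /proc (Linux-specific)."""
    with open(f"/proc/{pid}/statm") as f:
        resident_pages = int(f.read().split()[1])  # second field: resident pages
    return resident_pages * os.sysconf("SC_PAGE_SIZE")

def recycle_if_bloated(worker):
    """Terminate a pooled worker that has grown past the threshold."""
    if rss_bytes(worker.pid) > RSS_THRESHOLD_BYTES:
        worker.terminate()
        worker.join()
        return True   # caller should spawn a fresh interpreter for the pool
    return False
```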

If you need to share large amounts of common data between the tasks, use shared memory; for smaller interactions, use sockets (with a messaging layer on top of them as needed).
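A minimal sketch of sharing a large buffer between task processes using `multiprocessing.shared_memory` (available since Python 3.8); the block name and size are illustrative, and both halves would normally live in different processes:

```python
from multiprocessing import shared_memory

# Producer: create a named shared block and fill it once.
shm = shared_memory.SharedMemory(create=True, size=1024 * 1024, name="common_data")
shm.buf[:5] = b"hello"

# Consumer (typically in another process): attach by name and read without copying.
view = shared_memory.SharedMemory(name="common_data")
print(bytes(view.buf[:5]))   # b'hello'
view.close()

# Producer cleans up once all tasks are done with the block.
shm.close()
shm.unlink()
```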

Yes, this might be slower than your current setup. But from your use of Python I suppose that in these scripts you don't do any time-critical computing anyway.
