Handling a central data buffer for many processes in C++

Posted 2024-08-08 10:50:08

I ran into the following problem and cannot decide how to proceed:

I have a class, Reader, getting a chunk of data every 1/T seconds (the data actually comes from video frames, at 30 frames per second). The chunks are to be passed to several objects, Detectors, that process the chunks and output a decision. However, the number of chunks each detector needs to read before making a decision varies; e.g. some may need only one chunk, some 51.

I am thinking of having a data buffer where Reader places the read data chunks, implementing publish/subscribe to register each Detector, and sending it a signal when there are enough frames in the data buffer for it to process. Is this a good approach? Also, what's the best way to manage the buffer and have Detectors read data from it without making their own copies? Shared pointers?
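For concreteness, a minimal single-threaded sketch of that registration idea might look like the following (the `Detector` interface, its `window` field, and the `publish`/`subscribe` names are made up for illustration; a real system would also track which frames each detector has already seen):

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <vector>

// A chunk of frame data; shared_ptr lets every Detector read the
// same bytes without making its own copy.
using Chunk = std::shared_ptr<const std::vector<unsigned char>>;

// Hypothetical detector: window is how many chunks it needs before it
// can decide; on_window is invoked once enough have accumulated.
struct Detector {
    std::size_t window;   // e.g. 1 or 51 chunks
    std::function<void(const std::vector<Chunk>&)> on_window;
};

class Reader {
public:
    void subscribe(Detector d) { detectors_.push_back(std::move(d)); }

    // Called once per frame (every 1/T seconds in the real system).
    void publish(Chunk c) {
        history_.push_back(std::move(c));
        for (const auto& d : detectors_)
            if (history_.size() >= d.window)
                d.on_window(history_);   // shares pointers, copies nothing
    }

private:
    std::vector<Detector> detectors_;
    std::vector<Chunk> history_;   // unbounded here, for simplicity
};
```

The unbounded `history_` vector is the part a ring buffer would replace.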

Thanks a lot!

Comments (3)

烟花易冷人易散 2024-08-15 10:50:08

I'd look into a ring buffer/circular queue. This will allow you to do what you want with only a one-time memory allocation (provided you make the initial buffer size large enough to hold the maximum necessary number of frames).

As for managing access to the buffer, signaling when data is ready and sharing pointers with the reader(s) will work, but if you're using multiple threads some type of synchronization will be necessary; cf. the producer-consumer problem.
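A fixed-capacity ring buffer along these lines can be sketched as follows (the overwrite-oldest policy when full is one design choice, which suits a live video feed where stale frames can be dropped; a real implementation might block or signal instead):

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Fixed-capacity ring buffer: one allocation up front, then indices
// wrap around. Capacity should be at least the largest window any
// detector needs (e.g. 51 frames in the question).
template <typename T>
class RingBuffer {
public:
    explicit RingBuffer(std::size_t capacity) : buf_(capacity) {}

    bool full() const  { return size_ == buf_.size(); }
    bool empty() const { return size_ == 0; }
    std::size_t size() const { return size_; }

    // Overwrites the oldest element when full.
    void push(T value) {
        buf_[(head_ + size_) % buf_.size()] = std::move(value);
        if (full()) head_ = (head_ + 1) % buf_.size();
        else        ++size_;
    }

    // Removes and returns the oldest element.
    T pop() {
        if (empty()) throw std::out_of_range("pop from empty buffer");
        T value = std::move(buf_[head_]);
        head_ = (head_ + 1) % buf_.size();
        --size_;
        return value;
    }

private:
    std::vector<T> buf_;            // the one-time allocation
    std::size_t head_ = 0, size_ = 0;
};
```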

陌伤浅笑 2024-08-15 10:50:08

I think (also based on your comment to Maciek) you have to start by understanding the difference between threads and processes and how they can communicate.

Regarding the design problem:
Try to start with a simple design. For instance, use only threads and pass each subscriber a shared_ptr to the job via its own synchronized queue*. Since access to the data is read-only and, AFAICR, boost::shared_ptr is multi-threading safe for such a use, there are no synchronization problems and the data is cleaned up automatically. Don't worry about memory reallocations (yet); just make sure you are using a finite amount of memory (O(1)) per subscriber/thread (as you said, about 51 shared_ptrs at most).

Once you have this working skeleton, you will be able to start optimizing based on the problems you encounter. If reallocations are the problem, you can move to a ring buffer (as suggested by bcat), or you can replace your allocator (/new operator) with a pool allocator. If you have many subscribers, it might be effective to merge the queues into a single one used by all the threads. Doing that requires more information (what if one thread is very slow due to a very long computation? Do you have some way to signal it to stop processing? Or should the queue grow? If so, a cyclic buffer may not work so well...) and may have its complications, but remember we are only trying to save the room occupied by the shared_ptrs (not by the jobs).

Bottom line: try to avoid premature optimization. Instead, write it with reasonable optimization and extensibility in the design, and go on from there based on what you learn.

Good luck

* synchronized queue - a queue between threads: push(j) adds a job, and pop() waits until the queue is not empty and returns the front job (unlike std::queue; this matters when the queue is read by more than one thread). I usually implement it by wrapping a std::queue and protecting it with a boost::mutex.
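That synchronized queue can be sketched like so, using std::mutex and std::condition_variable rather than boost::mutex (the standard-library equivalents since C++11):

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// Wraps std::queue; push() wakes one waiting consumer, pop() blocks
// until a job is available. Safe with multiple producer and consumer
// threads, unlike a bare std::queue.
template <typename Job>
class SyncQueue {
public:
    void push(Job j) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(std::move(j));
        }
        cv_.notify_one();
    }

    // Blocks until the queue is non-empty, then removes and returns
    // the front job.
    Job pop() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty(); });
        Job j = std::move(q_.front());
        q_.pop();
        return j;
    }

private:
    std::queue<Job> q_;
    std::mutex m_;
    std::condition_variable cv_;
};
```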

我的影子我的梦 2024-08-15 10:50:08

I've recently implemented something similar to what you're describing.

I highly recommend the boost::interprocess library (see boost.org for more information).

What you're looking for is boost::interprocess::managed_shared_memory. It's going to look a bit weird at first, but once you get the hang of it, you'll love it.

What you want to do is: create a managed shared memory segment; allocate an object that is going to handle interprocess communication, using a void_allocator (look up allocators); implement synchronization mechanisms (boost::interprocess::interprocess_semaphore and boost::interprocess::interprocess_mutex, for instance); and implement communication between the separate processes via the managed shared memory.
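Underneath, managed_shared_memory builds on the operating system's named shared memory; as a rough, Boost-free illustration of the first step only (creating and mapping a named segment), here is a stripped-down POSIX sketch. The segment name is made up and error handling is minimal; Boost's managed segment adds an allocator and named-object lookup on top of this.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <fcntl.h>      // shm_open, O_* flags
#include <sys/mman.h>   // mmap, shm_unlink
#include <unistd.h>     // ftruncate, close

// Create (or open) a named shared memory segment and map it into the
// calling process; another process opening the same name sees the
// same bytes. Returns nullptr on failure.
void* map_segment(const char* name, std::size_t size) {
    int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
    if (fd == -1) return nullptr;
    if (ftruncate(fd, static_cast<off_t>(size)) == -1) {
        close(fd);
        return nullptr;
    }
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);   // the mapping stays valid after the fd is closed
    return p == MAP_FAILED ? nullptr : p;
}
```

In the real design, the synchronization objects (mutex, semaphore) and the buffer itself would live inside this segment so that Reader and the Detector processes all see them.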
