OpenCV: looking for a less CPU-intensive way to capture a frame, resize it, and copy it into a buffer. How can I optimize my code?

Posted on 2024-10-02 13:54:58

I wrote the following function in C++:

void CaptureFrame(char* buffer, int w, int h, int bytespan)
{
 /* get a frame */
 if(!cvGrabFrame(capture)){              // capture a frame 
  printf("Could not grab a frame\n\7");
  //exit(0);
 }
 CVframe =cvRetrieveFrame(capture);           // retrieve the captured frame

 /* always check */
 if (!CVframe)
 {
  printf("No CV frame captured!\n");
  cin.get();
 }

 /* resize buffer for current frame */
 IplImage* destination = cvCreateImage(cvSize(w, h), CVframe->depth, CVframe->nChannels);

 //use cvResize to resize source to a destination image
 cvResize(CVframe, destination);

 IplImage* redchannel = cvCreateImage(cvGetSize(destination), 8, 1);
 IplImage* greenchannel = cvCreateImage(cvGetSize(destination), 8, 1);
 IplImage* bluechannel = cvCreateImage(cvGetSize(destination), 8, 1);

 cvSplit(destination, bluechannel, greenchannel, redchannel, NULL);
 for(int y = 0; y < destination->height; y++)
 {
  char* line = buffer + y * bytespan;
  for(int x = 0; x < destination->width; x++)
  {
   line[0] = cvGetReal2D(redchannel, y, x);
   line[1] = cvGetReal2D(greenchannel, y, x);
   line[2] = cvGetReal2D(bluechannel, y, x);
   line += 3;
  }
 }
 cvReleaseImage(&redchannel);
 cvReleaseImage(&greenchannel);
 cvReleaseImage(&bluechannel);
 cvReleaseImage(&destination);
}

In short, it captures a frame from the device, creates an image to resize into, and copies the result into a buffer (I need the output as RGB or YUV420P).

What am I doing wrong? My function is far too CPU-intensive, so what can I do to fix it?

Update:

My function runs in a thread:

void ThreadCaptureFrame()
{
    while(1){
        t.restart();
        CaptureFrame((char *)frame->data[0], videoWidth, videoHeight, frame->linesize[0]);
        AVFrame* swap = frame;
        frame = readyFrame;
        readyFrame = swap;
        spendedTime = t.elapsed();
        if(spendedTime < desiredTime){
            Sleep(desiredTime - spendedTime);
        }
    }
}

The thread is started at the beginning of main (after some initialization):

boost::thread workerThread(ThreadCaptureFrame);

When it can keep up, it runs 24 times per second and uses 28% of a quad-core CPU. The camera resolution I capture at is about 320x240. So how can I optimize it?

初熏 2024-10-09 13:54:58

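Things you can do:

Don't capture images from the camera at its default resolution; select the resolution you want instead.
I think you can simply set buffer = destination->imageData.

These articles may help: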

悲喜皆因你 2024-10-09 13:54:58

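First of all, don't allocate and release images on every frame!
That is probably what takes the most time. Pre-allocate all the IplImages and release them only when the application is done.
You can use boost::shared_ptr with a custom deleter so that you don't have to remember to release the images.
I don't understand why you split the channels, or why you copy that way.
If you must copy, just copy the whole of destination->imageData into buffer.
If padding is the problem, you can do it in a loop as you did before, but read directly from destination->imageData. You don't need to separate the color channels.
Use cvResize with CV_INTER_NN. This lowers the image quality, but it is faster.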
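
The direct copy suggested above can be sketched without any OpenCV calls, as a plain row-by-row loop over interleaved pixels. This is a minimal sketch under assumptions: the function name CopyBgrToRgb is made up for illustration, the source is assumed to be an 8-bit, 3-channel image stored in BGR order (which is what the question's cvSplit call implies), and each source row may be padded, so both row strides are passed explicitly.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of OpenCV): copies an interleaved 8-bit BGR
// image into an interleaved RGB buffer without splitting channels.
// srcStep and dstStep are the row strides in bytes; either may include padding.
void CopyBgrToRgb(const std::uint8_t* src, std::size_t srcStep,
                  std::uint8_t* dst, std::size_t dstStep,
                  int width, int height)
{
    for (int y = 0; y < height; ++y) {
        const std::uint8_t* in = src + static_cast<std::size_t>(y) * srcStep;
        std::uint8_t* out = dst + static_cast<std::size_t>(y) * dstStep;
        for (int x = 0; x < width; ++x) {
            out[0] = in[2];  // red
            out[1] = in[1];  // green
            out[2] = in[0];  // blue
            in += 3;
            out += 3;
        }
    }
}
```

In the question's function, the source pointer and stride would presumably come from destination->imageData and destination->widthStep, and the destination pointer and stride from buffer and bytespan. If the consumer accepts BGR, the inner loop could be replaced by one memcpy of width * 3 bytes per row.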

药祭#氼 2024-10-09 13:54:58

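I'm not familiar with OpenCV, but if I'm reading your code correctly, you:

1. Read from the camera's buffer into memory (1 copy)
2. Resize the image (1 copy)
3. Split the image into RGB channels (3 copies)
4. Re-merge the channels into the buffer (1 copy)

I think that is a lot of unnecessary copying. For every frame you make 6 copies of the image: if your image is 320x240 with 24-bit color at 24 fps, you are moving at least 32 MB/sec, and for 1000x1000 frames you are talking about half a gigabyte per second. Note that this is a very rough estimate; depending on the resize algorithm there may be additional copying, reading and writing unaligned memory locations may add some overhead, and so on.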
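
The estimate above is simple arithmetic and can be checked with a few lines of C++. This is only a sketch of that calculation: the function name CopyBandwidth is made up for illustration, and it takes the answer's figure of 6 full-image copies per frame at face value.

```cpp
#include <cstdint>

// Hypothetical helper: bytes moved per second when every frame of
// width x height pixels is copied `copies` times at `fps` frames per second.
constexpr std::uint64_t CopyBandwidth(std::uint64_t width, std::uint64_t height,
                                      std::uint64_t bytesPerPixel,
                                      std::uint64_t copies, std::uint64_t fps)
{
    return width * height * bytesPerPixel * copies * fps;
}
```

CopyBandwidth(320, 240, 3, 6, 24) gives 33,177,600 bytes per second (roughly 32 MiB/s), and CopyBandwidth(1000, 1000, 3, 6, 24) gives 432,000,000 bytes per second, which matches the "half a gigabyte" figure in order of magnitude.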

You can probably skip step #3 and/or #4, although I'm not familiar enough with OpenCV to suggest how.
