How do I transpose a 4D tensor in C++?

Posted on 2025-02-05 20:43:09

I need to pre-process the input of an ML model into the correct shape.
In order to do that, I need to transpose a tensor from ncnn in C++.
The API does not offer a transpose, so I am trying to implement my own transpose function.

The input tensor has the shape (1, 640, 640, 3) (for batch, x, y and color) and I need to reshape it to the shape (1, 3, 640, 640).

How do I properly and efficiently transpose the tensor?

ncnn::Mat preprocess(const cv::Mat& rgba) {
    int width = rgba.cols;
    int height = rgba.rows;

    // Build a tensor from the image input
    ncnn::Mat in = ncnn::Mat::from_pixels(rgba.data, ncnn::Mat::PIXEL_RGBA2RGB, width, height);

    // Set the current shape of the tensor
    in = in.reshape(1, 640, 640, 3);

    // Normalize
    const float norm_vals[3] = {1 / 255.f, 1 / 255.f, 1 / 255.f};
    in.substract_mean_normalize(0, norm_vals);

    // Prepare the transposed matrix
    ncnn::Mat transposed(in.w, in.c, in.h, in.d, sizeof(float));
    ncnn::Mat shape = transposed.shape();

    // Transpose
    for (int i = 0; i < in.w; i++) {
        for (int j = 0; j < in.h; j++) {
            for (int k = 0; k < in.d; k++) {
                for (int l = 0; l < in.c; l++) {
                    int fromIndex = ???;
                    int toIndex = ???;
                    transposed[toIndex] = in[fromIndex];
                }
            }
        }
    }

    return transposed;
}

Comments (1)

最单纯的乌龟 2025-02-12 20:43:09

I'm only talking about index calculations, not the ncnn API, which I'm not familiar with.

You set

fromIndex = i*A + j*B + k*C + l*D;
  toIndex = i*E + j*F + k*G + l*H; 

where you compute A B C D E F G H based on the source and target layout. How?

Let's look at a simple 2D transposition first. Transpose a hw layout matrix to a wh layout matrix (slowest changing dimension first):

  for (int i = 0; i < h; ++i) {
      for (int j = 0; j < w; ++j) {
          int fromIndex = i * w + j * 1;
          //              ^       ^
          //              |       |
          //             i<h     j<w        <---- hw layout

          int   toIndex = j * h + i * 1;
          //              ^       ^
          //              |       |
          //             j<w     i<h        <---- wh layout
      }      
  }      
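
The loop above only computes the two indices. A minimal sketch that also moves the data, assuming the matrix is stored in a flat row-major std::vector<float> (the function name transpose2d is made up for illustration and is not part of ncnn), could look like this:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: transpose a row-major h x w matrix (hw layout)
// into a row-major w x h matrix (wh layout).
std::vector<float> transpose2d(const std::vector<float>& src,
                               std::size_t h, std::size_t w) {
    assert(src.size() == h * w);  // the flat buffer must hold exactly h*w values
    std::vector<float> dst(src.size());
    for (std::size_t i = 0; i < h; ++i) {
        for (std::size_t j = 0; j < w; ++j) {
            std::size_t fromIndex = i * w + j * 1;  // hw layout
            std::size_t toIndex = j * h + i * 1;    // wh layout
            dst[toIndex] = src[fromIndex];
        }
    }
    return dst;
}
```

Calling transpose2d on a 2 x 3 matrix stored as {1, 2, 3, 4, 5, 6} gives the 3 x 2 matrix stored as {1, 4, 2, 5, 3, 6}.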

So when computing fromIndex, you start with the source layout (hw). Remove the first letter (h), and what remains (w) is the coefficient that goes with i. Remove the next letter (w), and what remains (1) is the coefficient that goes with j. It is not hard to see that the same pattern works in any number of dimensions. For example, if your source layout is dchw, then you have

fromIndex = i * (c*h*w) + j * (h*w) + k * (w) + l * (1);
//          ^             ^           ^         ^
//          |             |           |         |
//         i<d           j<c         k<h       l<w   <---- dchw
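
The "remove the first letter, multiply what remains" rule can also be written as a small function. This is only a sketch under the assumption of a dense, slowest-dimension-first layout; the name rowMajorStrides is hypothetical:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: for a shape listed slowest-changing dimension first,
// the coefficient (stride) of each dimension is the product of all extents
// that come after it. The last dimension always gets coefficient 1.
std::vector<std::size_t> rowMajorStrides(const std::vector<std::size_t>& shape) {
    std::vector<std::size_t> strides(shape.size(), 1);
    for (std::size_t n = shape.size(); n > 1; --n) {
        strides[n - 2] = strides[n - 1] * shape[n - 1];
    }
    return strides;
}
```

For a shape {d, c, h, w} this returns {c*h*w, h*w, w, 1}, which are the four coefficients in the fromIndex formula above.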

What about toIndex? Same thing, but rearrange the letters from the slowest-changing to the fastest-changing in the target layout. For example, if your target layout is hwcd, then the order will be k l j i (because i is the index that ranges over [0..d) in both source and target layouts, and so on). So

  toIndex = k * (w*c*d) + l * (c*d) + j * (d) + i * (1);
  //        ^             ^           ^         ^
  //        |             |           |         |
  //       k<h           l<w         j<c       i<d   <---- hwcd
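
Putting the two formulas together, a complete dchw to hwcd transpose might look like the sketch below. It assumes both tensors are dense, flat std::vector<float> buffers rather than ncnn::Mat objects, and the function name dchwToHwcd is made up for this example:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copy a dense dchw tensor into a dense hwcd tensor.
std::vector<float> dchwToHwcd(const std::vector<float>& src,
                              std::size_t d, std::size_t c,
                              std::size_t h, std::size_t w) {
    assert(src.size() == d * c * h * w);  // buffer must match the shape
    std::vector<float> dst(src.size());
    for (std::size_t i = 0; i < d; ++i) {
        for (std::size_t j = 0; j < c; ++j) {
            for (std::size_t k = 0; k < h; ++k) {
                for (std::size_t l = 0; l < w; ++l) {
                    // Source layout dchw: i<d, j<c, k<h, l<w
                    std::size_t fromIndex = i * (c * h * w) + j * (h * w) + k * w + l;
                    // Target layout hwcd: k<h, l<w, j<c, i<d
                    std::size_t toIndex = k * (w * c * d) + l * (c * d) + j * d + i;
                    dst[toIndex] = src[fromIndex];
                }
            }
        }
    }
    return dst;
}
```

The innermost loop reads the source contiguously and writes the target with a stride of c*d. Which side should be contiguous for best speed depends on the hardware and the sizes involved, so treat the loop order here as one reasonable choice rather than a tuned one.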

I did not use your layouts on purpose. Do your own calculations a couple of times. You want to develop some intuition about this thing.
