How to transpose a 4D tensor in C++?
I need to pre-process the input of an ML model into the correct shape. To do that, I need to transpose a tensor from ncnn in C++. The API does not offer a transpose, so I am trying to implement my own transpose function.
The input tensor has the shape (1, 640, 640, 3) (for batch, x, y and color) and I need to reshape it to the shape (1, 3, 640, 640).
How do I properly and efficiently transpose the tensor?
    ncnn::Mat preprocess(const cv::Mat& rgba) {
        int width = rgba.cols;
        int height = rgba.rows;
        // Build a tensor from the image input
        ncnn::Mat in = ncnn::Mat::from_pixels(rgba.data, ncnn::Mat::PIXEL_RGBA2RGB, width, height);
        // Set the current shape of the tensor
        in = in.reshape(1, 640, 640, 3);
        // Normalize
        const float norm_vals[3] = {1 / 255.f, 1 / 255.f, 1 / 255.f};
        in.substract_mean_normalize(0, norm_vals);
        // Prepare the transposed matrix
        ncnn::Mat transposed(in.w, in.c, in.h, in.d, sizeof(float));
        ncnn::Mat shape = transposed.shape();
        // Transpose (the ??? index computations are what this question asks about)
        for (int i = 0; i < in.w; i++) {
            for (int j = 0; j < in.h; j++) {
                for (int k = 0; k < in.d; k++) {
                    for (int l = 0; l < in.c; l++) {
                        int fromIndex = ???;
                        int toIndex = ???;
                        transposed[toIndex] = in[fromIndex];
                    }
                }
            }
        }
        return transposed;
    }
Comments (1)
I'm only talking about index calculations, not the ncnn API which I'm not familiar with.
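You set transposed[toIndex] = in[fromIndex], where each index is a linear combination of the four loop variables, for example fromIndex = i*A + j*B + k*C + l*D and toIndex = i*E + j*F + k*G + l*H, and you compute A B C D E F G H based on the source and target layout. How?
Let's look at a simple 2D transposition first. Transpose a hw layout matrix to a wh layout matrix (slowest-changing dimension first). With i ranging over [0..h) and j ranging over [0..w), this gives fromIndex = i * w + j * 1 and toIndex = j * h + i * 1.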
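The 2D case can be sketched in plain C++ as follows. This is only an illustration of the index arithmetic: a `std::vector<float>` stands in for the tensor, and the function name `transpose_hw_to_wh` is hypothetical, not part of ncnn.

```cpp
#include <cstddef>
#include <vector>

// Transpose a dense h-by-w matrix stored in "hw" order (h is the
// slowest-changing dimension) into a w-by-h matrix stored in "wh" order.
// Assumption: src holds exactly h * w elements.
std::vector<float> transpose_hw_to_wh(const std::vector<float>& src,
                                      std::size_t h, std::size_t w) {
    std::vector<float> dst(src.size());
    for (std::size_t i = 0; i < h; ++i) {      // i ranges over [0..h)
        for (std::size_t j = 0; j < w; ++j) {  // j ranges over [0..w)
            const std::size_t fromIndex = i * w + j * 1;  // source layout hw
            const std::size_t toIndex   = j * h + i * 1;  // target layout wh
            dst[toIndex] = src[fromIndex];
        }
    }
    return dst;
}
```

For a 2-by-3 input holding 1..6 row by row, the result holds the columns of the input one after another.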
So when computing fromIndex, you start with the source layout (hw). You remove the first letter (h), and what remains (w) is the coefficient that goes with i; you remove the next letter (w), and what remains (1) is the coefficient that goes with j. It is not hard to see that the same pattern works in any number of dimensions. For example, if your source layout is dchw, then you have fromIndex = i * (c*h*w) + j * (h*w) + k * w + l * 1.
What about toIndex? Same thing, but rearrange the letters from the slowest-changing to the fastest-changing in the target layout. For example, if your target layout is hwcd, then the order will be k l j i (because i is the index that ranges over [0..d) in both source and target layouts, etc.). So toIndex = k * (w*c*d) + l * (c*d) + j * d + i * 1.
I did not use your layouts on purpose. Do your own calculations a couple of times. You want to develop some intuition about this thing.
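The dchw-to-hwcd example from this answer can be written out as a short sketch. It deliberately keeps the answer's layouts rather than the question's, and it works on a flat `std::vector<float>` instead of an `ncnn::Mat`. The function name `permute_dchw_to_hwcd` is an assumption made for illustration, and the sketch presumes a densely packed buffer with no padding between channels.

```cpp
#include <cstddef>
#include <vector>

// Rearrange a dense tensor from layout "dchw" to layout "hwcd".
// i ranges over d, j over c, k over h, l over w (the source order).
// Assumption: src holds exactly d * c * h * w contiguous elements.
std::vector<float> permute_dchw_to_hwcd(const std::vector<float>& src,
                                        std::size_t d, std::size_t c,
                                        std::size_t h, std::size_t w) {
    std::vector<float> dst(src.size());
    for (std::size_t i = 0; i < d; ++i) {
        for (std::size_t j = 0; j < c; ++j) {
            for (std::size_t k = 0; k < h; ++k) {
                for (std::size_t l = 0; l < w; ++l) {
                    // Source "dchw": drop letters left to right to get each coefficient.
                    const std::size_t fromIndex =
                        i * (c * h * w) + j * (h * w) + k * w + l * 1;
                    // Target "hwcd": same rule, with indices in the order k l j i.
                    const std::size_t toIndex =
                        k * (w * c * d) + l * (c * d) + j * d + i * 1;
                    dst[toIndex] = src[fromIndex];
                }
            }
        }
    }
    return dst;
}
```

Running it on a tiny tensor and comparing against a hand calculation is a quick way to check coefficients before adapting the same rule to another pair of layouts.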