使用非零值初始化双精度数组 (BLAS)
我分配了一个大的双向量,假设有 100000 个元素。在我的代码中的某个时刻,我想将所有元素设置为常量、非零值。如何在不对所有元素使用 for 循环的情况下做到这一点? 我也在使用 BLAS 包,如果有帮助的话。
I have allocated a big double vector, lets say with 100000 element. At some point in my code, I want to set all elements to a constant, nonzero value. How can I do this without using a for loop over all elements?
I am also using the BLAS package, if it helps.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
您可以使用
std::fill
(#include
):当然,这本质上也只是一个循环。
You could use
std::fill
(#include <algorithm>
):This is essentially also only a loop of course..
你所说的“填充”是正确的。
请注意,也可以构造一个充满指定值的向量:
因此,如果“在某个点”意味着“构造后立即”,请改为这样做。另外,这意味着您可以这样做:
如果“在某个时刻”意味着“更改大小后立即”,并且您期望/希望重新分配向量(“期望”,如果您使其大于其先前的容量,如果您将其做得更小并希望对其进行修剪以节省内存,则“想要”)。
'fill' is right from what you've said.
Be aware that it's also possible to construct a vector full of a specified value:
So if "at some point" means "immediately after construction", do this instead. Also, it means you can do this:
which might be useful if "at some point" means "immediately after changing the size", and you expect/want the vector to be reallocated ("expect" if you're making it bigger than its prior capacity, "want" if you're making it much smaller and want it trimmed to save memory).
不幸的是,其他答案没有遵循提示,操作员想要将数组的元素设置为零。
使用 BLAS 而不是更惯用的函数(例如
memset
或fill
)可能有多种原因。例如,BLAS 操作可以是线程化的。
此外,memset 和 fill 不提供开箱即用的跨步操作。
乍一看,BLAS 库似乎没有提供此类功能,但是,有两种选择:
xSCAL
(x
可以对于不同的数字类型,为s
、d
、c
、z
)。SCAL
执行缩放操作V <- a*V
。对于
a = 0
,它将元素设置为零(大部分)。完整代码如下。
两种方法都有自己的问题,第一种策略依赖于任何浮点数 x 的任何
x*0.0 == 0.0
。从技术上讲,这对于 x == NAN 或 x == infinity 来说是不正确的(两种情况都有说明)。
也许 BLAS 可以以非 IEEE 兼容的方式编译,这实际上给出了这一点。
无论如何,如果由于某种原因您知道原始值是常规数字,那么您可以使用它。
另一个问题是,您可以将带符号的零(
-0.0
)作为元素,这是可以的,只是它们最终可能不都是相同的统一(例如正零,0.0
)代码>)。第二个更强大,但依赖于 BLAS 接受零步幅值。
(BLAS 是在 70 年代发明并用 Fortran 编写的,当时还没有发明整数零。)
我知道的大多数实现都允许零增量,至少对于 xCOPY 来说是这样。
它还需要从某个地方取出“第一个”零;在这个例子中刚刚在堆栈中创建。
(如果您要推广到 GPU BLAS (cuBLAS),那么您将需要在 GPU 中分配这个零。)
因此,换句话说,您必须了解您的平台和可用的 BLAS。
像这样使用。它将打印零。
另外,还有这种方式,但它依赖于 BLAS 的顺序,并且它不能并行化或矢量化任何操作来使其工作,所以它可能是最糟糕的选择。
It is unfortunate the other answers didn't follow the prompt, the OP wanted to set the elements of an array to zero.
There could be several reason to use BLAS instead of more idiomatic functions (such as
memset
orfill
).For example BLAS operations could be threaded.
Also
memset
andfill
do not offer out-of-the-box strided operations.At first glance, it seems that the BLAS library doesn't offer such functionality, however, there are two options:
xSCAL
for that (x
can bes
,d
,c
,z
for different numeric types).SCAL
does the scale operationV <- a*V
.For
a = 0
it sets the elements to zero (mostly).xCOPY
, and copy a single zero from stack memory.Full code below.
Both approaches have their own problems, the first strategy relies on any
x*0.0 == 0.0
for any floating point number x.Which is technically not true for
x == NAN
orx == infinity
(both cases illustrated).Maybe BLAS could be compiled in a non-IEEE compliant manner that in effect gives this.
In any case, if for some reason you know that the original values are regular numbers then you can use this.
Another issue is that you could get signed zeros (
-0.0
) as elements, which are ok, except that they may end up not all being the same uniform (e.g. positive zero,0.0
).The second is more robust but relies on BLAS accepting zero-stride values.
(BLAS was invented and coded in Fortran in the 70's, integer zero was not invented at the time.)
Most implementations I know would allow zero increments, at least, for
xCOPY
.It also needs to pull out the "first" zero from somewhere; in this example is just created in the stack.
(If you are generalizing to GPU BLAS (cuBLAS) then you will need to allocate this zero in the GPU.)
So, in other words, you have to know your platform and the BLAS you have available.
use like this. It will print zeros.
Bonus, there is also this way, but it relies on BLAS being sequential, and it can't parallelize or vectorized any operation for this to work, so it is probably the worst option.
如果您不想循环,则始终使用
memset()
。也就是说,memset(myarr, 5, arrsize); 以便用全 5 填充它。请注意隐式转换为 unsigned char。
如果向量很大,并且您需要它快速运行并且您正在使用 gcc,那么:
You always use
memset()
if you don't want to loop.That is,
memset(myarr, 5, arrsize);
in order to fill it with all 5's. Beware of implicit conversion to unsigned char.And if the vector is large, and you need it to go fast and you are using gcc, then :