C++ valarray vs. vector
I like vectors a lot. They're nifty and fast. But I know this thing called a valarray exists. Why would I use a valarray instead of a vector? I know valarrays have some syntactic sugar, but other than that, when are they useful?
The C++11 standard says:
See C++11 26.6.1-2.
With `std::valarray` you can use standard mathematical notation like `v1 = a*v2 + v3` out of the box. This is not possible with vectors unless you define your own operators.
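To make the notation point concrete, here is a minimal sketch. The function names `axpy_valarray` and `axpy_vector` are made up for this illustration; only the `std::valarray` operators themselves come from the standard library.

```cpp
#include <cstddef>
#include <valarray>
#include <vector>

// valarray: the element-wise operators ship with the library.
std::valarray<double> axpy_valarray(double a,
                                    const std::valarray<double>& v2,
                                    const std::valarray<double>& v3) {
    return a * v2 + v3;  // evaluated element by element
}

// vector: there are no arithmetic operators, so the loop is written by hand.
// Assumes v2 and v3 have the same size.
std::vector<double> axpy_vector(double a,
                                const std::vector<double>& v2,
                                const std::vector<double>& v3) {
    std::vector<double> v1(v2.size());
    for (std::size_t i = 0; i < v2.size(); ++i) {
        v1[i] = a * v2[i] + v3[i];
    }
    return v1;
}
```

Both functions compute the same values; the difference is only in how much of the loop you have to spell out yourself.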
std::valarray is intended for heavy numeric tasks, such as Computational Fluid Dynamics or Computational Structural Dynamics, in which you have arrays with millions, sometimes tens of millions, of items, and you iterate over them in a loop that itself runs for millions of timesteps. Maybe today std::vector has comparable performance but, some 15 years ago, valarray was almost mandatory if you wanted to write an efficient numeric solver.
Basically, `std::valarray`s are vectors in the more mathematical sense, allowing well-defined mathematical operations like `c = a + b`, while `std::vector`s are dynamic arrays. The names are just misleading.
In a vector (in this mathematical sense) the dimension is fixed and all its elements are restricted to types that allow arithmetic operations. If you create one with apples, the compiler will immediately complain that apples do not support arithmetic operations.
Also, mathematical functions like `sqrt()`, `log()`, `sin()`, etc. accept vectors as input and apply the operation to each element, which is great for Single Instruction, Multiple Data architectures with a very large number of cores. For example, GPUs are very good at working with vectors.
This is why a vector can have a much more complex memory structure. Its elements can be distributed across several nodes, while `std::vector` is basically a container with a pointer to contiguous memory.
And I still see people trying to use `std::vector` with CUDA. No, sorry: despite the misleading name, you can't use `std::vector`, because it is not a vector. I know, they recognize it was a historical error, but they haven't even tried to correct that mistake.
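As a small illustration of the element-wise math functions mentioned above, the sketch below applies `std::sqrt` to a whole `std::valarray` at once. The helper name `hypot_all` is hypothetical, and this runs on the CPU; it says nothing about how a GPU library would lay out its data.

```cpp
#include <valarray>

// Hypothetical helper: Euclidean length of each (x[i], y[i]) pair.
// Assumes x and y have the same size.
std::valarray<double> hypot_all(const std::valarray<double>& x,
                                const std::valarray<double>& y) {
    // Element-wise products and sum, stored in a temporary array.
    const std::valarray<double> sum_sq = x * x + y * y;
    // The <valarray> overload of std::sqrt is applied to every element.
    return std::sqrt(sum_sq);
}
```

The same pattern works for the other overloads declared in `<valarray>`, such as `std::log`, `std::sin` and `std::exp`.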
`valarray` is kind of an orphan that was born in the wrong place at the wrong time. It's an attempt at optimization, fairly specifically for the machines that were used for heavy-duty math when it was written: specifically, vector processors like the Crays.
For a vector processor, what you generally wanted to do was apply a single operation to an entire array, then apply the next operation to the entire array, and so on until you'd done everything you needed to do.
Unless you're dealing with fairly small arrays, however, that tends to work poorly with caching. On most modern machines, what you'd generally prefer (to the extent possible) would be to load part of the array, do all the operations on it you're going to, then move on to the next part of the array.
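The two styles can be sketched roughly as follows, computing `(a*x + b)^2` for every element. The function names are invented for this example, and the "three passes" description assumes each statement is evaluated on its own; some library implementations use expression templates and may fuse parts of such code, so treat this as an illustration of the access pattern rather than a benchmark.

```cpp
#include <cstddef>
#include <valarray>

// Whole-array style: each statement walks the entire array once.
std::valarray<double> scaled_square_passes(const std::valarray<double>& x,
                                           double a, double b) {
    std::valarray<double> t = a * x;  // pass 1: scale
    t += b;                           // pass 2: shift
    t = t * t;                        // pass 3: square
    return t;
}

// Fused style: do everything for one element before moving to the next,
// so each value is loaded once and the intermediate stays in a register.
std::valarray<double> scaled_square_fused(const std::valarray<double>& x,
                                          double a, double b) {
    std::valarray<double> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double t = a * x[i] + b;
        out[i] = t * t;
    }
    return out;
}
```

For arrays that fit in cache the difference is usually small; for arrays much larger than the cache, the first version may have to pull the data through the memory hierarchy once per pass.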
`valarray` is also supposed to eliminate any possibility of aliasing, which (at least theoretically) lets the compiler improve speed because it's more free to store values in registers. In reality, however, I'm not at all sure that any real implementation takes advantage of this to any significant degree. I suspect it's rather a chicken-and-egg sort of problem: without compiler support it didn't become popular, and as long as it's not popular, nobody's going to go to the trouble of working on their compiler to support it.
There's also a bewildering (literally) array of ancillary classes to use with `valarray`. You get `slice`, `slice_array`, `gslice` and `gslice_array` to play with pieces of a `valarray`, and make it act like a multi-dimensional array. You also get `mask_array` to "mask" an operation (e.g. add items in x to y, but only at the positions where z is non-zero). To make more than trivial use of `valarray`, you have to learn a lot about these ancillary classes, some of which are pretty complex and none of which seems (at least to me) very well documented.
Bottom line: while it has moments of brilliance, and can do some things pretty neatly, there are also some very good reasons that it is (and will almost certainly remain) obscure.
Edit (eight years later, in 2017): Some of the preceding has become obsolete to at least some degree. For one example, Intel has implemented an optimized version of `valarray` for their compiler. It uses the Intel Integrated Performance Primitives (Intel IPP) to improve performance. Although the exact performance improvement undoubtedly varies, a quick test with simple code shows around a 2:1 improvement in speed, compared to identical code compiled with the "standard" implementation of `valarray`.
So, while I'm not entirely convinced that C++ programmers will be starting to use `valarray` in huge numbers, there are at least some circumstances in which it can provide a speed improvement.
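For readers who have not met the ancillary classes, here is a short sketch of `std::slice` and of masked assignment. The helper names and the row-major matrix layout are assumptions made for this example, not something taken from the answer.

```cpp
#include <cstddef>
#include <valarray>

// Row r of a matrix with `cols` columns, stored row-major in a flat valarray.
std::valarray<double> row_of(const std::valarray<double>& m,
                             std::size_t r, std::size_t cols) {
    return m[std::slice(r * cols, cols, 1)];  // (start, length, stride)
}

// Column c of the same matrix: every cols-th element, starting at index c.
std::valarray<double> column_of(const std::valarray<double>& m,
                                std::size_t c,
                                std::size_t rows, std::size_t cols) {
    return m[std::slice(c, rows, cols)];
}

// Add items of x to y, but only at the positions where z is non-zero.
// Assumes x, y and z all have the same size.
void add_where_nonzero(std::valarray<double>& y,
                       const std::valarray<double>& x,
                       const std::valarray<double>& z) {
    const std::valarray<bool> mask = (z != 0.0);
    // y[mask] is a std::mask_array; the right-hand side holds exactly
    // the selected elements of x, in order.
    y[mask] += std::valarray<double>(x[mask]);
}
```

`std::gslice` generalizes the same idea to several strides at once, which is what makes views with more than two dimensions possible.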
Valarrays (value arrays) are intended to bring some of the speed of Fortran to C++. You wouldn't make a valarray of pointers so the compiler can make assumptions about the code and optimise it better. (The main reason that Fortran is so fast is that there is no pointer type so there can be no pointer aliasing.)
Valarrays also have classes which allow you to slice them up in a reasonably easy way, although that part of the standard could use a bit more work. Resizing them is destructive. They originally lacked iterators, but they have had them since C++11.
So, if it's numbers you are working with and convenience isn't all that important, use valarrays. Otherwise, vectors are just a lot more convenient.
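A brief sketch of the two points about resizing and iterators. The helper names are hypothetical; `resize` and the `std::begin`/`std::end` overloads for `std::valarray` are standard.

```cpp
#include <cstddef>
#include <iterator>
#include <numeric>
#include <valarray>

// Since C++11, <valarray> provides std::begin/std::end overloads, so
// standard algorithms (and range-based for) work on a valarray.
double sum_with_iterators(const std::valarray<double>& v) {
    return std::accumulate(std::begin(v), std::end(v), 0.0);
}

// resize() is destructive: the old contents are discarded and every
// element of the resized array is set to `fill`.
std::valarray<double> resized_copy(std::valarray<double> v,
                                   std::size_t n, double fill) {
    v.resize(n, fill);
    return v;
}
```

Contrast this with `std::vector::resize`, which keeps the existing elements and only initializes the newly added ones.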
During the standardization of C++98, valarray was designed to allow some sort of fast mathematical computations. However, around that time Todd Veldhuizen invented expression templates and created blitz++, and similar template-meta techniques were invented, which made valarrays pretty much obsolete before the standard was even released. IIRC, the original proposer(s) of valarray abandoned it halfway into the standardization, which (if true) didn't help it either.
ISTR that the main reason it wasn't removed from the standard is that nobody took the time to evaluate the issue thoroughly and write a proposal to remove it.
Please keep in mind, however, that all this is vaguely remembered hearsay. Take this with a grain of salt and hope someone corrects or confirms this.
I have to say that I don't think `std::valarray`s have much in the way of syntactic sugar. The syntax is different, but I wouldn't call the difference "sugar." The API is weird. The section on `std::valarray`s in The C++ Programming Language mentions this unusual API and the fact that, since `std::valarray`s are expected to be highly optimized, any error messages you get while using them will probably be non-intuitive.
Out of curiosity, about a year ago I pitted `std::valarray` against `std::vector`. I no longer have the code or the precise results (although it shouldn't be hard to write your own). Using GCC I did get a little performance benefit when using `std::valarray` for simple math, but not for my implementations to calculate standard deviation (and, of course, standard deviation isn't that complex, as far as math goes). I suspect that operations on each item in a large `std::vector` play better with caches than operations on `std::valarray`s. (NOTE: following advice from musiphil, I've managed to get almost identical performance from `vector` and `valarray`.)
In the end, I decided to use `std::vector` while paying close attention to things like memory allocation and temporary object creation.
Both `std::vector` and `std::valarray` store the data in a contiguous block. However, they access that data using different patterns, and more importantly, the API for `std::valarray` encourages different access patterns than the API for `std::vector`.
For the standard deviation example, at a particular step I needed to find the collection's mean and the difference between each element's value and the mean.
For the `std::valarray`, I did something like:
I may have been more clever with `std::slice` or `std::gslice`. It's been over five years now.
For `std::vector`, I did something along the lines of:
Today I would certainly write that differently. If nothing else, I would take advantage of C++11 lambdas.
It's obvious that these two snippets of code do different things. For one, the `std::vector` example doesn't make an intermediate collection like the `std::valarray` example does. However, I think it's fair to compare them because the differences are tied to the differences between `std::vector` and `std::valarray`.
When I wrote this answer, I suspected that subtracting the value of elements from two `std::valarray`s (last line in the `std::valarray` example) would be less cache-friendly than the corresponding line in the `std::vector` example (which happens to also be the last line).
It turns out, however, that
does the same thing as the `std::vector` example, and has almost identical performance. In the end, the question is which API you prefer.
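The snippets referred to in the answer above are not reproduced here. As a rough sketch of what the "mean and differences from the mean" step could look like in each style, consider the following. All function and variable names are my own, the scalar-subtraction variant is an assumption about the kind of simplification meant at the end, and the `std::vector` version uses a C++11 lambda rather than whatever the original used. Each function assumes a non-empty input.

```cpp
#include <algorithm>
#include <iterator>
#include <numeric>
#include <valarray>
#include <vector>

// valarray, with an explicit temporary array filled with the mean.
std::valarray<double> diffs_valarray_temp(const std::valarray<double>& values) {
    const double mean = values.sum() / values.size();
    const std::valarray<double> temp(mean, values.size());  // (value, count)
    return values - temp;  // array minus array
}

// valarray, subtracting the scalar directly (no temporary array of means).
std::valarray<double> diffs_valarray_scalar(const std::valarray<double>& values) {
    const double mean = values.sum() / values.size();
    return values - mean;  // array minus scalar
}

// vector, using std::accumulate and std::transform.
std::vector<double> diffs_vector(const std::vector<double>& values) {
    const double mean =
        std::accumulate(values.begin(), values.end(), 0.0) / values.size();
    std::vector<double> diffs;
    diffs.reserve(values.size());  // avoid reallocation while appending
    std::transform(values.begin(), values.end(), std::back_inserter(diffs),
                   [mean](double v) { return v - mean; });
    return diffs;
}
```

All three produce the same numbers; what differs is how many intermediate arrays are created along the way, which is the kind of difference the answer is weighing.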
valarray was supposed to let some FORTRAN vector-processing goodness rub off on C++. Somehow the necessary compiler support never really happened.
The Josuttis book contains some interesting (somewhat disparaging) commentary on valarray (here and here).
However, Intel now seem to be revisiting valarray in their recent compiler releases (e.g. see slide 9); this is an interesting development given that their 4-way SIMD SSE instruction set is about to be joined by 8-way AVX and 16-way Larrabee instructions, and in the interests of portability it'll likely be much better to code with an abstraction like valarray than (say) intrinsics.
I found one good use for valarray: using it just like NumPy arrays.
We can implement the above with valarray.
Also, we need a Python script.
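The NumPy code this answer refers to is not shown here, so the following is only a guess at the flavor of it: a hypothetical `linspace` helper (modeled loosely on `numpy.linspace`, which is not part of the C++ standard library) plus a whole-array expression written the way one would write it in NumPy.

```cpp
#include <cstddef>
#include <valarray>

// Hypothetical helper: n evenly spaced samples from start to stop inclusive.
std::valarray<double> linspace(double start, double stop, std::size_t n) {
    std::valarray<double> x(n);
    if (n == 0) return x;
    if (n == 1) { x[0] = start; return x; }
    const double step = (stop - start) / static_cast<double>(n - 1);
    for (std::size_t i = 0; i < n; ++i) {
        x[i] = start + step * static_cast<double>(i);
    }
    return x;
}

// First three terms of a square-wave Fourier series, evaluated on every
// sample at once, much like a NumPy expression on an ndarray.
std::valarray<double> square_wave_partial(const std::valarray<double>& x) {
    return std::sin(x) + std::sin(3.0 * x) / 3.0 + std::sin(5.0 * x) / 5.0;
}
```

The resulting arrays could then be written to a file and plotted by a separate script, which may be what the remark about a Python script is about.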