I've been looking at ways to check the arguments of functions. I noticed that MatrixQ takes two arguments, the second being a test to apply to each element, but ListQ takes only one. (Also, for some reason, ?ListQ does not have a help page the way ?MatrixQ does.)
So, for example, to check that an argument to a function is a matrix of numbers, I write:
ClearAll[foo]
foo[a_?(MatrixQ[#, NumberQ] &)] := Module[{}, a + 1]
What would be a good way to do the same for a List? The following only checks that the input is a List:
ClearAll[foo]
foo[a_?(ListQ[#] &)] := Module[{}, a + 1]
I could do something like this:
ClearAll[foo]
foo[a_?(ListQ[#] && (And @@ Map[NumberQ[#] &, # ]) &)] := Module[{}, a + 1]
so that foo[{1, 2, 3}] will work, but foo[{1, 2, x}] will not (assuming x is a symbol). But this seems to me a somewhat complicated way to do it.
Question: Do you know a better way to check that an argument is a list and also that its elements are numbers (or have any other Head known to Mathematica)?
And a related question: Are there any major run-time performance issues with adding such checks to each argument? If so, do you recommend removing these checks after testing and development are complete, so that the final program runs faster? (For example, keep one version of the code with all the checks for development and testing, and one version without them for production.)
Comments (3)
You might use VectorQ in a way completely analogous to MatrixQ: VectorQ[expr, test] applies test to every element, so VectorQ[#, NumberQ] & can take the place of MatrixQ[#, NumberQ] & in the pattern test.
Also note two differences between VectorQ and ListQ:
1. A plain VectorQ (with no second argument) gives True only if none of the elements of the list are lists themselves (i.e. only for 1D structures).
2. VectorQ will handle SparseArrays, while ListQ will not.
I am not sure about the performance impact of using these in practice; I am very curious about that myself.
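As a minimal sketch of the VectorQ approach described above (foo is the function from the question; the sample inputs are chosen here purely for illustration):

```mathematica
ClearAll[foo]
(* VectorQ[expr, test] is True only if expr is a list (or a one-dimensional
   SparseArray) and test gives True for every element *)
foo[a_?(VectorQ[#, NumberQ] &)] := Module[{}, a + 1]

foo[{1, 2, 3}]         (* {2, 3, 4} *)
foo[{1, 2, x}]         (* x is a symbol, so this stays unevaluated *)
foo[{{1, 2}, {3, 4}}]  (* elements are lists, not numbers: unevaluated *)

(* the two differences from ListQ *)
VectorQ[{{1, 2}, {3, 4}}]        (* False: the elements are themselves lists *)
ListQ[{{1, 2}, {3, 4}}]          (* True: the head is List *)
VectorQ[SparseArray[{1, 0, 2}]]  (* True *)
ListQ[SparseArray[{1, 0, 2}]]    (* False: the head is SparseArray *)
```

Swapping NumberQ for another predicate (NumericQ, IntegerQ, StringQ, ...) changes the element test without changing the structure of the definition.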
Here's a naive benchmark. I am comparing two functions: one that only checks the arguments but does nothing, and one that adds two vectors (a very fast built-in operation, so anything faster than this can be considered negligible). I am using NumericQ, which is a more complex (and therefore potentially slower) check than NumberQ.
Packed array: it can be verified that the check is constant time (not shown here).
Homogeneous non-packed array: the check is linear time, but very fast.
Non-homogeneous non-packed array: the check takes the same time as in the previous example.
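A benchmark along these lines could be set up roughly as follows. This is only a sketch: the names check, add, packed, unpacked and mixed, and the vector length n, are assumptions made here, and no timings are quoted because they depend on the machine and the Mathematica version.

```mathematica
ClearAll[check, add]
(* check only validates its arguments and does no work *)
check[a_?(VectorQ[#, NumericQ] &), b_?(VectorQ[#, NumericQ] &)] := Null
(* add does the actual (fast, built-in) work without any validation *)
add[a_, b_] := a + b

n = 10^6;  (* hypothetical vector length *)

(* 1. packed array of machine reals *)
packed = RandomReal[1, n];
Developer`PackedArrayQ[packed]  (* True *)

(* 2. homogeneous but unpacked array *)
unpacked = Developer`FromPackedArray[packed];

(* 3. non-homogeneous unpacked array: reals plus an integer and an exact constant *)
mixed = Join[Developer`FromPackedArray[RandomReal[1, n - 2]], {1, Pi}];

(* each row is {time spent checking, time spent adding} for one kind of input *)
timings = Table[
  {First@AbsoluteTiming[check[v, v];], First@AbsoluteTiming[add[v, v];]},
  {v, {packed, unpacked, mixed}}]
```

Repeating the measurement for several values of n is one way to see how the cost of the check scales with the length of the input in each of the three cases.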
Conclusion based on this very simple example:
1. VectorQ is highly optimized, at least when using common second arguments. It is much faster than, e.g., adding two vectors, which is itself a well-optimized operation.
2. For packed arrays, VectorQ is constant time.
@Leonid's answer is very relevant too; please see it.
Regarding the performance hit (since your first question has already been answered): by all means do the checks, but do them in your top-level functions, the ones that receive data directly from the user of your functionality. (The user can also be another independent module, written by you or someone else.) Don't put these checks in all your intermediate functions, since there they would be duplicated and indeed unjustified.
EDIT
To address the problem of errors in intermediate functions, raised by @Nasser in the comments: there is a very simple technique that lets you switch pattern checks on and off in "one click". You can store your patterns in variables inside your package, defined before your function definitions.
Here is an example, where f is a top-level function, while g and h are "inner functions". We define two patterns, one for the main function and one for the inner ones, and then define our functions in terms of them.
Note that the patterns are substituted into the definitions at definition time, not at run time, so this is exactly equivalent to coding those patterns by hand. Once you have tested, you just have to change one line, the definition of the pattern used by the inner functions, and reload your package.
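A minimal sketch of this technique might look as follows. The variable names mainPatt and innerPatt, the NumericQ-vector check, and the bodies of f, g and h are assumptions chosen here for illustration.

```mathematica
ClearAll[mainPatt, innerPatt, f, g, h]

(* hypothetical pattern for the top-level function: a vector of numeric quantities *)
mainPatt = _?(VectorQ[#, NumericQ] &);

(* the one line to change: inner functions reuse the full check during development *)
innerPatt = mainPatt;
(* innerPatt = _;   use this instead to switch the inner checks off for production *)

(* SetDelayed evaluates the arguments on its left-hand side, so the pattern
   variables are replaced by their values when the definitions are made *)
f[v : mainPatt] := g[v] + h[v]
g[v : innerPatt] := v^2
h[v : innerPatt] := v^3

f[{1, 2, 3}]  (* {2, 12, 36} *)
f[{1, 2, x}]  (* stays unevaluated: x is not numeric *)
```

Inspecting DownValues[g] after loading shows the substituted pattern rather than the symbol innerPatt, which is why changing the variable only takes effect once the definitions are re-evaluated.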
A final question is: how do you quickly find errors? I answered that here, in the sections "Instead of returning $Failed, one can throw an exception, using Throw." and "Meta-programming and automation".
END EDIT
I included a brief discussion of this issue in my book here. In that example, the performance hit was about a 10% increase in running time, which IMO is borderline acceptable. In the case at hand, the check is simpler and the performance penalty is much smaller. Generally, for any function that is at all computationally intensive, correctly written type checks cost only a small fraction of the total run time.
A few tricks which are good to know:
The pattern-matcher can be very fast when used syntactically (no Condition or PatternTest present in the pattern). For example:
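One way to set up such a comparison, assuming a hypothetical test list strs of short random lowercase strings (timings are left out since they vary by machine):

```mathematica
(* hypothetical test data: 10^5 random five-letter lowercase strings *)
strs = Table[FromCharacterCode[RandomInteger[{97, 122}, 5]], {10^5}];

(* purely syntactic pattern: the matcher only has to inspect heads *)
First@AbsoluteTiming[r1 = MatchQ[strs, {__String}]]

(* PatternTest: StringQ is evaluated once for every element *)
First@AbsoluteTiming[r2 = MatchQ[strs, {__?StringQ}]]

{r1, r2}  (* {True, True} *)
```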
Just because PatternTest was used in the latter case, the check is much slower: the evaluator is invoked by the pattern-matcher for every element, while in the first case everything is purely syntactic and is done entirely inside the pattern-matcher.
MatchQ and other pattern-testing functions don't unpack packed arrays for certain special patterns; moreover, for some of them the check is instantaneous. Here is an example:
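A sketch of such a test, assuming a hypothetical packed array ints of machine integers; the syntactic pattern {__Integer} is the kind of special pattern meant here, with the PatternTest version included for contrast:

```mathematica
ints = RandomInteger[10^5, 10^6];  (* hypothetical packed array of machine integers *)
Developer`PackedArrayQ[ints]       (* True *)

(* syntactic pattern on a packed array *)
First@AbsoluteTiming[s1 = MatchQ[ints, {__Integer}]]

(* PatternTest version of the same check *)
First@AbsoluteTiming[s2 = MatchQ[ints, {__?IntegerQ}]]

{s1, s2}  (* {True, True} *)
```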
The same apparently holds for functions like VectorQ, MatrixQ and ArrayQ with certain predicates (NumericQ): these tests are extremely efficient.
For example, suppose we want to test that we have a real numeric matrix:
Here is the most straightforward, and slowest, way:
This is better, since we make better use of the pattern-matcher:
However, we did not exploit the packed nature of the matrix. This is better still:
However, this is not the end. The following one is near instantaneous:
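The four variants below are one plausible sequence matching these descriptions, in the same order. They are a sketch, not necessarily the exact code meant here; the test matrix rm and its size are assumptions.

```mathematica
rm = RandomInteger[10^4, {500, 500}];  (* hypothetical packed integer test matrix *)

(* 1. most direct: a custom predicate evaluated on every element *)
t1 = MatrixQ[rm, NumericQ[#] && Im[#] == 0 &];

(* 2. built-in predicate plus a structural scan for the head Complex *)
t2 = MatrixQ[rm, NumericQ] && FreeQ[rm, Complex];

(* 3. vectorized arithmetic on the whole matrix *)
t3 = MatrixQ[rm, NumericQ] && Total[Abs[Flatten[Im[rm]]]] == 0;

(* 4. a single whole-array comparison *)
t4 = MatrixQ[rm, NumericQ] && Re[rm] == rm;

{t1, t2, t3, t4}  (* {True, True, True, True} *)
```

Wrapping each line in AbsoluteTiming shows the relative cost on a given machine. The variants agree on machine-number matrices like this one, but they may differ on exact symbolic entries, so the choice of test should follow the kind of data the function is expected to receive.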
Since ListQ just checks that the head is List, the following is a simple solution:
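One pattern that fits this description puts the list structure directly into the argument pattern. This is a sketch, assuming the elements should satisfy NumberQ as in the question; bar is a hypothetical second function added here to show the purely syntactic form.

```mathematica
ClearAll[foo]
(* {___?NumberQ} matches a List whose elements (possibly none) all satisfy NumberQ *)
foo[a : {___?NumberQ}] := Module[{}, a + 1]

foo[{1, 2, 3}]  (* {2, 3, 4} *)
foo[{1, 2, x}]  (* stays unevaluated *)

(* when all elements share one head, a purely syntactic pattern avoids PatternTest *)
ClearAll[bar]
bar[a : {___Integer}] := a + 1
bar[{1, 2, 3}]  (* {2, 3, 4} *)
```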