What is the recommended way to check that a function argument is a list of numbers?

Posted on 2024-12-25 13:13:33

I've been looking at ways to check the arguments of functions. I noticed that MatrixQ takes two arguments, the second being a test to apply to each element.

But ListQ takes only one argument. (Also, for some reason, ?ListQ does not have a help page like ?MatrixQ does.)

So, for example, to check that an argument to a function is a matrix of numbers, I write:

ClearAll[foo]
foo[a_?(MatrixQ[#, NumberQ] &)] := Module[{}, a + 1]

What would be a good way to do the same for a List? The definition below only checks that the input is a List:

ClearAll[foo]
foo[a_?(ListQ[#] &)] := Module[{}, a + 1]

I could do something like this:

ClearAll[foo]
foo[a_?(ListQ[#] && (And @@ Map[NumberQ[#] &, # ]) &)] := Module[{}, a + 1]

so that foo[{1, 2, 3}] will work, but foo[{1, 2, x}] will not (assuming x is a symbol). But this seems to me a somewhat complicated way to do it.

Question: Do you know a better way to check that an argument is a list, and also that the list's contents are numbers (or have any other Head known to Mathematica)?

And a related question: Are there any major run-time performance issues with adding such checks to each argument? If so, do you recommend removing these checks after testing and development are completed, so that the final program runs faster? (For example, keep one version of the code with all the checks in for development/testing, and one without them for production.)

Comments (3)

我三岁 2025-01-01 13:13:33

You might use VectorQ in a way completely analogous to MatrixQ. For example,

f[vector_ /; VectorQ[vector, NumericQ]] := ...
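
Applied to the foo from the question, a minimal sketch might look like the following. It uses the PatternTest form from the question rather than Condition, keeps NumberQ as the element test, and assumes x is a symbol with no value:

```mathematica
ClearAll[foo, x]

(* Accept only a one-dimensional list whose elements all satisfy NumberQ *)
foo[a_?(VectorQ[#, NumberQ] &)] := a + 1

foo[{1, 2, 3}]   (* {2, 3, 4} *)
foo[{1, 2, x}]   (* no rule matches, so this stays as foo[{1, 2, x}] *)
```

Whether NumberQ or NumericQ is the right element test depends on whether exact constants such as Pi should be accepted: NumericQ[Pi] is True, while NumberQ[Pi] is False.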

Also note two differences between VectorQ and ListQ:

  1. A plain VectorQ (with no second argument) only gives True if no elements of the list are lists themselves (i.e. only for 1D structures)

  2. VectorQ will handle SparseArrays while ListQ will not
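
Both differences can be seen on small inputs. This is only an illustrative sketch; the sample lists are made up for the purpose:

```mathematica
VectorQ[{1, 2, 3}]                (* True *)
ListQ[{1, 2, 3}]                  (* True *)

VectorQ[{1, {2, 3}}]              (* False: one element is itself a list *)
ListQ[{1, {2, 3}}]                (* True: only the head is inspected *)

VectorQ[SparseArray[{1, 0, 2}]]   (* True *)
ListQ[SparseArray[{1, 0, 2}]]     (* False: the head is SparseArray, not List *)
```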


I am not sure about the performance impact of using these in practice; I am very curious about that myself.

Here's a naive benchmark. I am comparing two functions: one that only checks the arguments but does nothing, and one that adds two vectors (this is a very fast built-in operation, so anything faster than this can be considered negligible). I am using NumericQ, which is a more complex (and therefore potentially slower) check than NumberQ.

In[2]:= add[a_ /; VectorQ[a, NumericQ], b_ /; VectorQ[b, NumericQ]] :=
  a + b

In[3]:= nothing[a_ /; VectorQ[a, NumericQ], 
  b_ /; VectorQ[b, NumericQ]] := Null

Packed array. It can be verified that the check is constant time (not shown here).

In[4]:= rr = RandomReal[1, 10000000];

In[5]:= Do[add[rr, rr], {10}]; // Timing

Out[5]= {1.906, Null}

In[6]:= Do[nothing[rr, rr], {10}]; // Timing

Out[6]= {0., Null}

Homogeneous non-packed array. The check is linear time, but very fast.

In[7]:= rr2 = Developer`FromPackedArray@RandomInteger[10000, 1000000];

In[8]:= Do[add[rr2, rr2], {10}]; // Timing

Out[8]= {1.75, Null}

In[9]:= Do[nothing[rr2, rr2], {10}]; // Timing

Out[9]= {0.204, Null}

Non-homogeneous non-packed array. The check takes the same time as in the previous example.

In[10]:= rr3 = Join[rr2, {Pi, 1.0}];

In[11]:= Do[add[rr3, rr3], {10}]; // Timing

Out[11]= {5.625, Null}

In[12]:= Do[nothing[rr3, rr3], {10}]; // Timing

Out[12]= {0.282, Null}

Conclusion based on this very simple example:

  1. VectorQ is highly optimized, at least when using common second arguments. It's much faster than, e.g., adding two vectors, which is itself a well-optimized operation.
  2. For packed arrays VectorQ is constant time.
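
The constant-time claim for packed arrays was not shown above. One way to check it yourself is to time the same test on packed vectors of growing length. This is only a sketch: the sizes and the repetition count are arbitrary choices, and the timings will differ between machines and versions, so no output is shown.

```mathematica
(* For each length n: build a packed real vector, confirm it is packed,
   and time 1000 repetitions of the VectorQ check *)
sizes = {10^4, 10^5, 10^6, 10^7};

Table[
 With[{v = RandomReal[1, n]},
  {n, Developer`PackedArrayQ[v],
   First@AbsoluteTiming[Do[VectorQ[v, NumericQ], {1000}]]}],
 {n, sizes}]
```

If the check is constant time, the third entry of each row should stay roughly flat as n grows. Wrapping the vector in Developer`FromPackedArray should instead show the timing growing with n.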

@Leonid's answer is very relevant too; please see it.

林空鹿饮溪 2025-01-01 13:13:33

Regarding the performance hit (since your first question has been answered already): by all means do the checks, but do them in your top-level functions, i.e. those which receive data directly from the user of your functionality. (The user can also be another independent module, written by you or someone else.) Don't put these checks in all your intermediate functions, since such checks would be duplicated and indeed unjustified.

EDIT

To address the problem of errors in intermediate functions, raised by @Nasser in the comments: there is a very simple technique which allows one to switch pattern-checks on and off in "one click". You can store your patterns in variables inside your package, defined prior to your function definitions.

Here is an example, where f is a top-level function, while g and h are "inner functions". We define two patterns, one for the main function and one for the inner ones, like so:

Clear[nlPatt, innerNLPatt];
(* matches a vector whose elements are all numeric *)
nlPatt = _?(VectorQ[#, NumericQ] &);
innerNLPatt = nlPatt;

Now, we define our functions:

ClearAll[f,g,h];
f[vector:nlPatt]:=g[vector]+h[vector];
g[nv:innerNLPatt ]:=nv^2;
h[nv:innerNLPatt ]:=nv^3;

Note that the patterns are substituted inside the definitions at definition time, not at run time, so this is exactly equivalent to coding those patterns by hand. Once you have finished testing, you just have to change one line: from

innerNLPatt = nlPatt 

to

innerNLPatt = _

and reload your package.
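
The definition-time substitution can be seen in a small experiment. The names patt and sq below are hypothetical and used only for this sketch; they are not part of the example above:

```mathematica
ClearAll[patt, sq]

patt = _?(VectorQ[#, NumericQ] &);   (* pattern stored in a variable *)
sq[v : patt] := v^2;                 (* patt is evaluated here, at definition time *)

patt = _;                            (* changing the variable afterwards... *)

sq[{1, 2, 3}]    (* {1, 4, 9} *)
sq["abc"]        (* stays unevaluated: the original check is baked into the rule *)

DownValues[sq]   (* shows the stored rule with the pattern already substituted *)
```

This is why the definitions have to be re-evaluated (for instance by reloading the package) after switching the inner pattern to _.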

A final question is: how do you quickly find errors? I answered that here, in the sections "Instead of returning $Failed, one can throw an exception, using Throw." and "Meta-programming and automation".
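
As a rough illustration of the Throw idea only (this is not the code from the linked answer; the catch-all rule and the badArgs tag are assumptions made up for this sketch), a function can raise a tagged exception whenever none of its checked definitions match:

```mathematica
ClearAll[g, badArgs, x]

g[nv_?(VectorQ[#, NumericQ] &)] := nv^2

(* Hypothetical catch-all: any other argument list throws a tagged exception *)
g[args___] := Throw[$Failed, badArgs[g, {args}]]

Catch[g[{1, 2, 3}], _badArgs]   (* {1, 4, 9} *)
Catch[g[{1, x}], _badArgs]      (* $Failed *)
```

Because the tag carries the function name and the offending arguments, a three-argument Catch can inspect them to report where the bad input came from.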

END EDIT

I included a brief discussion of this issue in my book here. In that example, the performance hit was about a 10% increase in running time, which IMO is borderline acceptable. In the case at hand, the check is simpler and the performance penalty is much smaller. Generally, for a function which is at all computationally intensive, correctly written type checks cost only a small fraction of the total run time.

A few tricks which are good to know:

  • The pattern-matcher can be very fast when used syntactically (no Condition or PatternTest present in the pattern).

For example:

randomString[]:=FromCharacterCode@RandomInteger[{97,122},5];
rstest = Table[randomString[],{1000000}];

In[102]:= MatchQ[rstest,{__String}]//Timing
Out[102]= {0.047,True}

In[103]:= MatchQ[rstest,{__?StringQ}]//Timing
Out[103]= {0.234,True}

The check is much slower in the latter case simply because PatternTest is used: the pattern-matcher has to invoke the evaluator for every element, while in the first case everything is purely syntactic and is done entirely inside the pattern-matcher.
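
For reference, here is how such purely syntactic patterns behave on small inputs. The sample lists are made up for illustration:

```mathematica
MatchQ[{"a", "b"}, {__String}]   (* True *)
MatchQ[{"a", 1}, {__String}]     (* False *)

MatchQ[{}, {__String}]           (* False: __ needs at least one element *)
MatchQ[{}, {___String}]          (* True: ___ also allows zero elements *)

(* Several heads can be allowed without leaving the pattern-matcher *)
MatchQ[{1, 2.5, 3}, {(_Integer | _Real) ..}]   (* True *)
```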


  • The same is true for unpacked numerical lists (the timing difference is similar). For packed numerical lists, however, MatchQ and other pattern-testing functions don't unpack for certain special patterns; moreover, for some of these patterns the check is instantaneous.

Here is an example:

In[113]:= 
test = RandomInteger[100000,1000000];

In[114]:= MatchQ[test,{__?IntegerQ}]//Timing
Out[114]= {0.203,True}

In[115]:= MatchQ[test,{__Integer}]//Timing
Out[115]= {0.,True}

In[116]:= Do[MatchQ[test,{__Integer}],{1000}]//Timing
Out[116]= {0.,Null}

The same appears to be true for functions like VectorQ, MatrixQ and ArrayQ with certain predicates (such as NumericQ): these tests are extremely efficient.


  • A lot depends on how you write your test, i.e. to what degree you reuse the efficient Mathematica structures.

For example, suppose we want to test that we have a real numeric matrix:

In[143]:= rm = RandomInteger[10000,{1500,1500}];

Here is the most straightforward, and slowest, way:

In[144]:= MatrixQ[rm,NumericQ[#]&&Im[#]==0&]//Timing
Out[144]= {4.125,True}

This is better, since we make better use of the pattern-matcher:

In[145]:= MatrixQ[rm,NumericQ]&&FreeQ[rm,Complex]//Timing
Out[145]= {0.204,True}

However, we did not utilize the packed nature of the matrix. This is better still:

In[146]:= MatrixQ[rm,NumericQ]&&Total[Abs[Flatten[Im[rm]]]]==0//Timing
Out[146]= {0.047,True}

However, this is not the end. The following one is nearly instantaneous:

In[147]:= MatrixQ[rm,NumericQ]&&Re[rm]==rm//Timing
Out[147]= {0.,True}
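
Tying this back to argument checking, the fastest test can be wrapped in a predicate and used in a definition. The names realMatrixQ and scale are hypothetical, chosen only for this sketch:

```mathematica
ClearAll[realMatrixQ, scale]

(* Numeric matrix with no imaginary parts, using the Re[m] == m comparison from above *)
realMatrixQ[m_] := MatrixQ[m, NumericQ] && Re[m] == m

scale[m_?realMatrixQ, c_?NumericQ] := c m

scale[{{1, 2}, {3, 4}}, 2]     (* {{2, 4}, {6, 8}} *)
scale[{{1, 2 I}, {3, 4}}, 2]   (* stays unevaluated: the matrix is not real *)
```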

土豪我们做朋友吧 2025-01-01 13:13:33

Since ListQ just checks that the head is List, the following is a simple solution:

foo[a:{___?NumberQ}] := Module[{}, a + 1]
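
A few sample calls show what this pattern accepts, assuming x is a symbol with no value:

```mathematica
ClearAll[foo, x]
foo[a : {___?NumberQ}] := Module[{}, a + 1]

foo[{1, 2, 3}]     (* {2, 3, 4} *)
foo[{1, 2, x}]     (* stays unevaluated *)
foo[{1, {2, 3}}]   (* stays unevaluated: NumberQ[{2, 3}] is False *)
foo[{}]            (* {} : the triple blank ___ also matches an empty list *)
```

If the empty list should be rejected, {__?NumberQ} with a double blank requires at least one element.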