Aggregating Tally counters

Posted on 2024-10-19 17:20:27

Many times I find myself counting occurrences with Tally[ ] and then, once I have discarded the original list, having to merge the counts from another list into that tally list.

This typically happens when I am counting configurations, occurrences, doing some discrete statistics, etc.

So I defined a very simple but handy function for Tally aggregation:

aggTally[listUnTallied__List:{}, 
         listUnTallied1_List,
         listTallied_List] := 
 Join[Tally@Join[listUnTallied, listUnTallied1], listTallied] //. 
     {a___, {x_, p_}, b___, {x_, q_}, c___} -> {a, {x, p + q}, b, c};

Such that

l = {x, y, z}; lt = Tally@l;
n = {x};
m = {x, y, t};

aggTally[n, {}]
  {{x, 1}}

aggTally[m, n, {}]
  {{x, 2}, {y, 1}, {t, 1}}

aggTally[m, n, lt]
  {{x, 3}, {y, 2}, {t, 1}, {z, 1}}

This function has two problems:

1) Performance

Timing[Fold[aggTally[Range@#2, #1] &, {}, Range[100]];]
  {23.656, Null}
(* roughly equivalent loop; note that j < 100 stops at 99, one step short of the Fold above *)
Timing[s = {}; j = 1; While[j < 100, s = aggTally[Range@j, s]; j++]]
  {23.047, Null}

2) It does not validate that the last argument is a genuine tally list or an empty list (less important for me, though)

Is there a simple, elegant, faster and more effective solution? (I understand that this is a lot to ask, but wishing is free)

Comments (4)

放我走吧 2024-10-26 17:20:27

Perhaps, this will suit your needs?

aggTallyAlt[listUnTallied__List : {}, listUnTallied1_List, listTallied : {{_, _Integer} ...}] :=
{#[[1, 1]], Total@#[[All, 2]]} & /@ 
       GatherBy[Join[Tally@Join[listUnTallied, listUnTallied1], listTallied], First]

The timings are much better, and there is a pattern-based check on the last arg.
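As a rough sketch of what the definition above relies on, here are its two ingredients applied to the sample data from the question. The names `joined` and `groups` are only illustrative and are not part of the answer's code; the symbols x, y, z, t are assumed to have no values.

```mathematica
(* What Join[Tally@Join[m, n], lt] produces for the question's m, n and lt *)
joined = {{x, 2}, {y, 1}, {t, 1}, {x, 1}, {y, 1}, {z, 1}};

(* GatherBy collects the pairs sharing a first element, in order of first appearance *)
groups = GatherBy[joined, First]
(* {{{x, 2}, {x, 1}}, {{y, 1}, {y, 1}}, {{t, 1}}, {{z, 1}}} *)

(* Each group collapses to {key, sum of its counts} *)
{#[[1, 1]], Total@#[[All, 2]]} & /@ groups
(* {{x, 3}, {y, 2}, {t, 1}, {z, 1}} *)

(* The pattern on the last argument; "..." (RepeatedNull) also accepts an empty list *)
MatchQ[{{x, 1}, {y, 2}}, {{_, _Integer} ...}]  (* True *)
MatchQ[{}, {{_, _Integer} ...}]                (* True *)
MatchQ[{{x, 1.5}}, {{_, _Integer} ...}]        (* False *)
MatchQ[{x, y}, {{_, _Integer} ...}]            (* False *)
```

When the last argument fails the pattern, the definition simply does not match, so the call is returned unevaluated rather than producing a wrong result.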

EDIT:

Here is a faster version:

aggTallyAlt1[listUnTallied__List : {}, listUnTallied1_List, listTallied : {{_, _Integer} ...}] :=
Transpose[{#[[All, 1, 1]], Total[#[[All, All, 2]], {2}]}] &@
   GatherBy[Join[Tally@Join[listUnTallied, listUnTallied1], listTallied], First]

The timings for it:

In[39]:= Timing[Fold[aggTallyAlt1[Range@#2, #1] &, {}, Range[100]];]
Timing[s = {}; j = 1; While[j < 100, s = aggTallyAlt1[Range@j, s]; j++]]

Out[39]= {0.015, Null}

Out[40]= {0.016, Null}

紫罗兰の梦幻 2024-10-26 17:20:27

The following solution is just a small modification of your original function. It applies Sort before using ReplaceRepeated and can thus use a less general replacement pattern which makes it much faster:

aggTally[listUnTallied__List : {}, listUnTallied1_List, 
   listTallied : {{_, _Integer} ...}] := 
  Sort[Join[Tally@Join[listUnTallied, listUnTallied1], 
     listTallied]] //. {a___, {x_, p_}, {x_, q_}, c___} -> {a, {x, p + q}, c};
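A small illustration of why sorting helps, using the joined list from the question's example. The names `pairs` and `sorted` are illustrative, and this sketch writes the rule with `k` and `:>` instead of `x` and `->` only to avoid any clash with the data symbol x or with global values. After Sort, equal keys sit next to each other, so the rule only has to inspect adjacent pairs instead of searching across the whole list.

```mathematica
pairs = {{x, 2}, {y, 1}, {t, 1}, {x, 1}, {y, 1}, {z, 1}};

(* Canonical ordering places pairs with the same key side by side *)
sorted = Sort[pairs]
(* {{t, 1}, {x, 1}, {x, 2}, {y, 1}, {y, 1}, {z, 1}} *)

(* Only neighbouring pairs need to be merged now *)
sorted //. {a___, {k_, p_}, {k_, q_}, c___} :> {a, {k, p + q}, c}
(* {{t, 1}, {x, 3}, {y, 2}, {z, 1}} *)
```

One visible difference from the original function: the result comes back in canonical sorted order rather than in order of first appearance.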

一个人练习一个人 2024-10-26 17:20:27

Here's the fastest thing I've come up with yet, (ab)using the tagging available with Sow and Reap:

aggTally5[untallied___List, tallied_List: {}] :=
  Last[Reap[
    Scan[((Sow[#2, #] &) @@@ Tally[#]) &, {untallied}];
    Sow[#2, #] & @@@ tallied;
    , _, {#, Total[#2]} &]]

Not going to win any beauty contests, but it's all about speed, right? =)
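For readers unfamiliar with the tagged form, a minimal sketch of the mechanism the function above depends on (x and y are assumed to be symbols without values):

```mathematica
(* Sow[value, tag] files each count under its key; the third argument of Reap
   is applied to every tag together with the list of values sown under it *)
Reap[(Sow[2, x]; Sow[1, y]; Sow[1, x];), _, {#1, Total[#2]} &]
(* {Null, {{x, 3}, {y, 1}}} *)

(* Last picks out the aggregated list; with nothing sown it is simply {} *)
Last[Reap[Null, _, {#1, Total[#2]} &]]
(* {} *)
```

The summing is thus done by Reap's own bookkeeping per tag, with no pattern matching over the accumulated list.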

提赋 2024-10-26 17:20:27

If you stay purely symbolic, you may try something along the lines of

(Plus @@ Times @@@ Join[#1, #2] /. Plus -> List /. Times -> List) &

for joining tally lists. This is stupid fast but returns something that isn't a tally list, so it needs some work (after which it may not be so fast anymore ;) ).
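To make the "isn't a tally list" remark concrete, here is a short sketch of what the one-liner returns on tiny inputs. `joinSym` is an illustrative name, and x and y are assumed to be symbols without values.

```mathematica
(* joinSym is just a name for the one-liner above *)
joinSym = (Plus @@ Times @@@ Join[#1, #2] /. Plus -> List /. Times -> List) &;

joinSym[{{x, 1}, {y, 2}}, {{x, 1}, {y, 3}}]
(* {{2, x}, {5, y}} : counts are summed, but each pair comes out as {count, key} *)

joinSym[{{x, 1}}, {{y, 1}}]
(* {x, y} : a count of 1 vanishes, because 1*x is just x *)
```

The trick lets the built-in arithmetic do the merging (x + x becomes 2 x), which presumably is why it only works for symbolic keys: numeric keys would be multiplied into their counts and lost.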

EDIT: So I've got a working version:

aggT = Replace[(Plus @@ Times @@@ Join[#1, #2] 
                  /. Plus -> List 
                  /. Times[a_, b_] :> List[b, a]), 
                k_Symbol -> List[k, 1], {1}] &;

Using a couple of random symbolic tables I get

a := Tally@b;
b := Table[f[RandomInteger@99 + 1], {i, 100}];

Timing[Fold[aggT[#1, #2] &, a, Table[a, {i, 100}]];]
--> {0.104954, Null}

This version only adds tally lists, doesn't check anything, still returns some integers, and comparing to Leonid's function:

Timing[Fold[aggTallyAlt1[#2, #1] &, a, Table[b, {i, 100}]];]
--> {0.087039, Null}

it's already slightly slower (about 0.105 s versus 0.087 s) :-(.

Oh well, nice try.
