Showing duplicates in Mathematica

Posted on 2024-08-08 15:06:15

In Mathematica I have a list:

x = {1,2,3,3,4,5,5,6}

How can I make a list of the duplicated elements? For example:

{3,5}

I have been looking at Lists as Sets to see whether there is something like Except[] for lists, so that I could do:

unique = Union[x]
duplicates = MyExcept[x,unique]

(Of course, if x had elements that occur more than twice, say {1,2,2,2,3,4,4}, the output would be {2,2,4}, but an additional Union[] would solve this.)

But I couldn't find anything like that (if I understood all the functions there correctly).

So, how can I do that?

Comments (8)

哆啦不做梦 2024-08-15 15:06:15

There are lots of ways to do list extraction like this; here's the first thing that came to mind:

Part[Select[Tally@x, Part[#, 2] > 1 &], All, 1]

Or, more readably in pieces:

Tally@x
Select[%, Part[#, 2] > 1 &]
Part[%, All, 1]

which gives, respectively,

{{1, 1}, {2, 1}, {3, 2}, {4, 1}, {5, 2}, {6, 1}}
{{3, 2}, {5, 2}}
{3, 5}

Perhaps you can think of a more efficient (in time or code space) way :)

By the way, Tally does not need sorted input: if the list is unsorted, the duplicates come back in order of first appearance, so run Sort on the result if you want them in sorted order.
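
As a quick check on an unsorted list, here is a minimal sketch of the same Tally idea written with Cases. The list y and the pattern names elem and count are made up for illustration. Tally counts each distinct element wherever it occurs and reports the elements in order of first appearance:

```mathematica
y = {5, 3, 1, 5, 2, 3, 6, 4};

(* Tally pairs each distinct element with its number of occurrences *)
tallied = Tally[y]
(* {{5, 2}, {3, 2}, {1, 1}, {2, 1}, {6, 1}, {4, 1}} *)

(* keep the element from every pair whose count exceeds 1 *)
dups = Cases[tallied, {elem_, count_ /; count > 1} :> elem]
(* {5, 3} *)
```

Wrapping the result in Sort gives {3, 5}.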

同展鸳鸯锦 2024-08-15 15:06:15

Here's a way to do it in a single pass through the list:

collectDups[l_] := Block[{i}, i[n_]:= (i[n] = n; Unevaluated@Sequence[]); i /@ l]

For example:

collectDups[{1, 1, 6, 1, 3, 4, 4, 5, 4, 4, 2, 2}] --> {1, 1, 4, 4, 4, 2}

If you want the list of unique duplicates -- {1, 4, 2} -- then wrap the above in DeleteDuplicates, which is another single pass through the list (Union is less efficient as it also sorts the result).

collectDups[l_] := 
  DeleteDuplicates@Block[{i}, i[n_]:= (i[n] = n; Unevaluated@Sequence[]); i /@ l]

Will Robertson's solution is probably better simply because it's more straightforward, but I think if you wanted to eke out more speed, this should win. But if you cared about that, you wouldn't be programming in Mathematica! :)
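
For readers unfamiliar with the self-redefining trick, here is a sketch of the same idea using Nothing instead of Unevaluated@Sequence[]. This assumes Mathematica 10.2 or later, where Nothing is available; collectDupsSketch and seen are illustrative names, not part of the answer above.

```mathematica
collectDupsSketch[l_List] := Module[{seen},
  (* first call for a value: store a specific definition and contribute nothing *)
  seen[n_] := (seen[n] = n; Nothing);
  (* later calls for the same value hit the stored definition and return n *)
  seen /@ l]

collectDupsSketch[{1, 1, 6, 1, 3, 4, 4, 5, 4, 4, 2, 2}]
(* {1, 1, 4, 4, 4, 2} *)
```

The specific definitions such as seen[1] = 1 take precedence over the general seen[n_] rule, which is why only the first occurrence of each value is dropped.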

水溶 2024-08-15 15:06:15

Here are several faster variations of the Tally method.

f4 uses "tricks" given by Carl Woll and Oliver Ruebenkoenig on MathGroup.

f2 = Tally@# /. {{_, 1} :> Sequence[], {a_, _} :> a} &;

f3 = Pick[#, Unitize[#2 - 1], 1] & @@ Transpose@Tally@# &;

f4 = # ~Extract~ SparseArray[Unitize[#2 - 1]]["NonzeroPositions"] & @@ Transpose@Tally@# &;

Speed comparison (f1 included for reference)

a = RandomInteger[100000, 25000];

f1 = Part[Select[Tally@#, Part[#, 2] > 1 &], All, 1] &;

First@Timing@Do[#@a, {50}] & /@ {f1, f2, f3, f4, Tally}

SameQ @@ (#@a &) /@ {f1, f2, f3, f4}

Out[]= {3.188, 1.296, 0.719, 0.375, 0.36}

Out[]= True

It is amazing to me that f4 has almost no overhead relative to a pure Tally!
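
On newer versions there is also an Association-based spelling of the same idea. This is only a sketch, assuming Mathematica 10 or later (where Counts is available), and it makes no claim about how it compares in speed with f1 to f4:

```mathematica
x = {1, 2, 3, 3, 4, 5, 5, 6};

(* Counts returns an Association from each distinct element to its count *)
counts = Counts[x];

(* Select filters the Association by value; Keys extracts the elements *)
dups = Keys[Select[counts, # > 1 &]]
(* {3, 5} *)
```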

撩人痒 2024-08-15 15:06:15

Using a solution like dreeves's, but returning only a single instance of each duplicated element, is a bit tricky. One way of doing it is as follows:

collectDups1[l_] :=
  Module[{i, j},
    i[n_] := (i[n] := j[n]; Unevaluated@Sequence[]);
    j[n_] := (j[n] = Unevaluated@Sequence[]; n);
    i /@ l];

This doesn't precisely match the output of Will Robertson's (IMO superior) solution, because elements appear in the returned list in the order in which they can be determined to be duplicates, that is, at their second occurrence. I'm not sure it can really be done in a single pass: all the ways I can think of involve, in effect, at least two passes, although one of them might only be over the duplicated elements.
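
To make the ordering difference concrete, here is a small sketch that emits each element at its second occurrence, shown next to a Tally-based version. The names dupsInDetectionOrder, dupsByFirstAppearance and count are made up for illustration, and Nothing assumes Mathematica 10.2 or later.

```mathematica
(* emit an element exactly when its running count reaches 2 *)
dupsInDetectionOrder[l_List] := Module[{count},
  count[_] = 0;  (* every element starts with a count of zero *)
  Map[If[++count[#] == 2, #, Nothing] &, l]]

(* Tally-based version: duplicates in order of first appearance *)
dupsByFirstAppearance[l_List] :=
  Cases[Tally[l], {elem_, c_ /; c > 1} :> elem]

dupsInDetectionOrder[{2, 1, 1, 2}]
(* {1, 2} *)

dupsByFirstAppearance[{2, 1, 1, 2}]
(* {2, 1} *)
```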

孤单情人 2024-08-15 15:06:15

Here is a version of Robertson's answer that uses 100% "postfix notation" for function calls.

identifyDuplicates[list_List, test_:SameQ] :=
 list //
    Tally[#, test] & //
   Select[#, #[[2]] > 1 &] & //
  Map[#[[1]] &, #] &

Mathematica's // is similar to the dot for method calls in other languages. For instance, if this were written in C# / LINQ style, it would resemble

list.Tally(test).Where(x => x[2] > 1).Select(x => x[1])

Note that C#'s Where is like MMA's Select, and C#'s Select is like MMA's Map.

EDIT: added optional test function argument, defaulting to SameQ.

EDIT: here is a version that addresses my comment below and reports all the equivalent elements in each group. It takes a projector function, and two elements of the list are considered equivalent if the projector produces equal values for them. This essentially finds equivalence classes of at least a given size:

reportDuplicateClusters[list_List, projector_: (# &), 
  minimumClusterSize_: 2] :=
 GatherBy[list, projector] //
  Select[#, Length@# >= minimumClusterSize &] &

Here is a sample that groups pairs of integers by their first elements, considering two pairs equivalent if their first elements are equal:

reportDuplicateClusters[RandomInteger[10, {10, 2}], #[[1]] &]
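
Since the random input above gives a different result on every run, here is the same GatherBy and Select pipeline on a fixed input, as a sketch. The list pairs is made up for illustration, and First stands in for the #[[1]] & projector.

```mathematica
pairs = {{1, "a"}, {2, "b"}, {1, "c"}, {3, "d"}, {2, "e"}};

(* group the pairs by their first element, keeping order of first appearance *)
groups = GatherBy[pairs, First];

(* keep only the groups that contain at least two pairs *)
clusters = Select[groups, Length[#] >= 2 &]
(* {{{1, "a"}, {1, "c"}}, {{2, "b"}, {2, "e"}}} *)
```
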
淡淡の花香 2024-08-15 15:06:15

This thread seems old, but I've had to solve this myself.

This is kind of crude (it compares neighbouring elements, so it assumes the list tt is sorted and contains integers), but does this do it?

Union[Select[Table[If[tt[[n]] == tt[[n + 1]], tt[[n]], ""], {n, Length[tt] - 1}], IntegerQ]]
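
The neighbour comparison can also be sketched with Split, which groups runs of identical adjacent elements. Sorting first means the input does not have to be sorted already; the list tt below is just sample data.

```mathematica
tt = {5, 3, 1, 5, 2, 3, 6, 4};

(* after sorting, equal elements are adjacent, so Split puts them in one run *)
runs = Split[Sort[tt]];

(* a run longer than 1 means the element is duplicated *)
dups = First /@ Select[runs, Length[#] > 1 &]
(* {3, 5} *)
```
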
生生漫 2024-08-15 15:06:15

Another short possibility is

Last /@ Select[Gather[x], Length[#] > 1 &]
通知家属抬走 2024-08-15 15:06:15

Given a list A,
get the non-duplicate values in B
B = DeleteDuplicates[A]
get the duplicate values in C
C = Complement[A,B]
get the non-duplicate values from the duplicate list in D
D = DeleteDuplicates[C]

So for your example:
A = {1, 2, 2, 2, 3, 4, 4}
B = {1, 2, 3, 4}
C = {2, 2, 4}
D = {2, 4}

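So your answer would be DeleteDuplicates[Complement[x, DeleteDuplicates[x]]], where x is your list. I don't know Mathematica, so the syntax may not be exact; I'm just going by the docs on the page you linked to. Note that the C step has to remove only one copy of each value: Mathematica's built-in Complement treats its arguments as sets, so Complement[x, DeleteDuplicates[x]] itself returns {}.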
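
The "remove one copy of each distinct value" step can be sketched with Tally in place of Complement, which in Mathematica works on sets. The name dropOneOfEach is made up for illustration, and lowercase variable names are used because C and D are built-in symbols.

```mathematica
(* keep count - 1 copies of every distinct element *)
dropOneOfEach[l_List] := Join @@ (Table[#1, {#2 - 1}] & @@@ Tally[l])

a = {1, 2, 2, 2, 3, 4, 4};

c = dropOneOfEach[a]
(* {2, 2, 4} *)

d = DeleteDuplicates[c]
(* {2, 4} *)
```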
