What is the fastest or most elegant way to compute a set difference using Javascript arrays?

Posted on 2024-08-11 04:12:39 · 466 characters · 7 views · 0 comments

Let A and B be two sets. I'm looking for really fast or elegant ways to compute the set difference (A - B or A \B, depending on your preference) between them. The two sets are stored and manipulated as Javascript arrays, as the title says.

Notes:

  • Gecko-specific tricks are okay
  • I'd prefer sticking to native functions (but I am open to a lightweight library if it's way faster)
  • I've seen, but not tested, JS.Set (see previous point)

Edit: I noticed a comment about sets containing duplicate elements. When I say "set" I'm referring to the mathematical definition, which means (among other things) that they do not contain duplicate elements.

Comments (15)

南城旧梦 2024-08-18 04:12:39

I don't know if this is most effective, but perhaps the shortest:

var A = [1, 2, 3, 4];
var B = [1, 3, 4, 7];

var diff = A.filter(function(x) {
  return B.indexOf(x) < 0;
});

console.log(diff); // [2]

Updated to ES6:

const A = [1, 2, 3, 4];
const B = [1, 3, 4, 7];

const diff = A.filter(x => !B.includes(x));

console.log(diff); // [2]
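
For larger inputs, the same filter can be kept while swapping the linear `includes` scan for a `Set` lookup. This is a minimal sketch, assuming the elements are primitives (or objects compared by reference); the helper name `arrayDifference` is made up for illustration.

```javascript
// Hypothetical helper: elements of A that are not in B, in A's order.
function arrayDifference(A, B) {
  const inB = new Set(B);              // built once, O(|B|)
  return A.filter(x => !inB.has(x));   // each lookup is O(1) on average
}

console.log(arrayDifference([1, 2, 3, 4], [1, 3, 4, 7])); // [ 2 ]
```

Like `includes`, `Set.prototype.has` compares with SameValueZero, so the result should match the ES6 one-liner for the same inputs.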

无所的.畏惧 2024-08-18 04:12:39

Well, 7 years later, with ES6's Set object it's quite easy (but still not as compact as python's A - B), and reportedly faster than indexOf for large arrays:

console.clear();

let a = new Set([1, 2, 3, 4]);
let b = new Set([5, 4, 3, 2]);

let a_minus_b = new Set([...a].filter(x => !b.has(x)));
let b_minus_a = new Set([...b].filter(x => !a.has(x)));
let a_intersect_b = new Set([...a].filter(x => b.has(x))); 
let a_union_b = new Set([...a, ...b]); 

console.log(...a_minus_b);     // {1}
console.log(...b_minus_a);     // {5}
console.log(...a_intersect_b); // {2,3,4}
console.log(...a_union_b);     // {1,2,3,4,5}

巨坚强 2024-08-18 04:12:39

Looking at a lot of these solutions, they do fine for small cases. But when you blow them up to a million items, the time complexity starts getting silly.

 A.filter(v => !B.includes(v))

That starts looking like an O(N^2) solution. Since there is an O(N) solution, let's use it; you can easily modify it to not be a generator if your JS runtime is not up to date.

    function *setMinus(A, B) {
      const setA = new Set(A);
      const setB = new Set(B);

      for (const v of setB.values()) {
        if (!setA.delete(v)) {
            yield v;
        }
      }

      for (const v of setA.values()) {
        yield v;
      }
    }

    const a = [1,2,3];
    const b = [2,3,4];

    console.log(Array.from(setMinus(a, b)));

While this is a bit more complex than many of the other solutions, when you have large lists this will be far faster.

Let's take a quick look at the performance difference, running it on a set of 1,000,000 random integers between 0...10,000 we see the following performance results.

setMinus time =  181 ms
    diff time =  19099 ms

function buildList(count, range) {
  // Build an array of `count` random integers in [0, range)
  const result = [];
  for (let i = 0; i < count; i++) {
    result.push(Math.floor(Math.random() * range));
  }
  return result;
}

function *setMinus(A, B) {
  const setA = new Set(A);
  const setB = new Set(B);

  for (const v of setB.values()) {
    if (!setA.delete(v)) {
        yield v;
    }
  }

  for (const v of setA.values()) {
    yield v;
  }
}

function doDiff(A, B) {
  return A.filter(function(x) { return B.indexOf(x) < 0 })
}

const listA = buildList(100_000, 100_000_000); 
const listB = buildList(100_000, 100_000_000); 

let t0 = process.hrtime.bigint()

const _x = Array.from(setMinus(listA, listB))

let t1 = process.hrtime.bigint()

const _y = doDiff(listA, listB)

let t2 = process.hrtime.bigint()

console.log("setMinus time = ", (t1 - t0) / 1_000_000n, "ms");
console.log("diff time = ", (t2 - t1) / 1_000_000n, "ms");
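
Note that, as written, `setMinus` yields the values of B that are missing from A, followed by whatever is left of A. For the small example this gives [4, 1], which is the symmetric difference rather than A - B. Below is a sketch of a generator that yields only A - B while keeping the same linear behaviour; the name `setMinusOnly` is hypothetical and not part of the original answer.

```javascript
// Hypothetical variant: yields only the elements of A that are not in B.
function* setMinusOnly(A, B) {
  const setB = new Set(B);        // O(|B|) to build
  for (const v of A) {
    if (!setB.has(v)) {           // O(1) average lookup
      yield v;
    }
  }
}

console.log(Array.from(setMinusOnly([1, 2, 3], [2, 3, 4]))); // [ 1 ]
```

This version does not copy A into a set, so it assumes A already contains no duplicates, as stated in the question.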

少年亿悲伤 2024-08-18 04:12:39

If you're using Sets, it can be quite simple and performant:

function setDifference(a, b) {
  return new Set(Array.from(a).filter(item => !b.has(item)));
}

Since Sets are typically implemented with hash tables under the hood, the has function is much faster than indexOf (this matters if you have, say, more than 100 items).

仲春光 2024-08-18 04:12:39

You can use an object as a map to avoid linearly scanning B for each element of A as in user187291's answer:

function setMinus(A, B) {
    var map = {}, C = [];

    for(var i = B.length; i--; )
        map[B[i].toSource()] = null; // any other value would do

    for(var i = A.length; i--; ) {
        if(!map.hasOwnProperty(A[i].toSource()))
            C.push(A[i]);
    }

    return C;
}

The non-standard toSource() method is used to get unique property names; if all elements already have unique string representations (as is the case with numbers), you can speed up the code by dropping the toSource() invocations.
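
Since `toSource()` is non-standard and was Gecko-only, the same idea can be sketched with a standard `Set` plus a caller-supplied key function. The helper name `setMinusBy` and the use of `JSON.stringify` as a key are assumptions for illustration; `JSON.stringify` is sensitive to property order and cannot serialize every value.

```javascript
// Hypothetical helper: A minus B, comparing elements by keyOf(element).
function setMinusBy(A, B, keyOf = x => x) {
  const seen = new Set(B.map(x => keyOf(x)));
  return A.filter(x => !seen.has(keyOf(x)));
}

// Primitives need no key function.
console.log(setMinusBy([1, 2, 3, 4], [1, 3, 4, 7]));                    // [ 2 ]
// Plain objects compared by their JSON text (an assumption, see above).
console.log(setMinusBy([{ id: 1 }, { id: 2 }], [{ id: 2 }], JSON.stringify)); // [ { id: 1 } ]
```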

浅浅淡淡 2024-08-18 04:12:39

The shortest, using jQuery, is:

var A = [1, 2, 3, 4];
var B = [1, 3, 4, 7];

var diff = $(A).not(B);

console.log(diff.toArray());
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>

只是在用心讲痛 2024-08-18 04:12:39

Some simple functions, borrowing from @milan's answer:

const setDifference = (a, b) => new Set([...a].filter(x => !b.has(x)));
const setIntersection = (a, b) => new Set([...a].filter(x => b.has(x)));
const setUnion = (a, b) => new Set([...a, ...b]);

Usage:

const a = new Set([1, 2]);
const b = new Set([2, 3]);

setDifference(a, b); // Set { 1 }
setIntersection(a, b); // Set { 2 }
setUnion(a, b); // Set { 1, 2, 3 }

时光病人 2024-08-18 04:12:39

I would hash the array B, then keep values from the array A not present in B:

function getHash(array){
  // Hash an array into a set of properties
  //
  // params:
  //   array - (array) (!nil) the array to hash
  //
  // return: (object)
  //   hash object with one property set to true for each value in the array

  var hash = {};
  for (var i=0; i<array.length; i++){
    hash[ array[i] ] = true;
  }
  return hash;
}

function getDifference(a, b){
  // compute the difference a\b
  //
  // params:
  //   a - (array) (!nil) first array as a set of values (no duplicates)
  //   b - (array) (!nil) second array as a set of values (no duplicates)
  //
  // return: (array)
  //   the set of values (no duplicates) in array a and not in b, 
  //   listed in the same order as in array a.

  var hash = getHash(b);
  var diff = [];
  for (var i=0; i<a.length; i++){
    var value = a[i];
    if ( !hash[value]){
      diff.push(value);
    }
  }
  return diff;
}
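
One caveat with a plain object as the hash: a lookup such as `hash[value]` also sees inherited properties, so a value like "constructor" looks present even when b does not contain it, and a number and its string form share one key. Here is a sketch of the same approach on a `Set`, which avoids both issues; the name `getDifferenceSet` is hypothetical.

```javascript
// Hypothetical variant of getDifference that uses a Set instead of an object.
function getDifferenceSet(a, b) {
  const inB = new Set(b);
  const diff = [];
  for (const value of a) {
    if (!inB.has(value)) {
      diff.push(value);   // keeps the order of array a
    }
  }
  return diff;
}

console.log(getDifferenceSet(["constructor", "x"], ["x"])); // [ 'constructor' ]
console.log(getDifferenceSet([1, "1"], ["1"]));             // [ 1 ]
```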

春夜浅 2024-08-18 04:12:39

Using Underscore.js (Library for functional JS)

>>> var foo = [1,2,3]
>>> var bar = [1,2,4]
>>> _.difference(foo, bar);
[3]

傲娇萝莉攻 2024-08-18 04:12:39

Updated response

As of June 2024, the ECMAScript TC39 proposal for set methods is in Stage 3 (Candidate).

Update: as of July 6th, 2024, the proposal is in Stage 4 (Finished).

const
  a = new Set([1, 2, 3, 4]),
  b = new Set([5, 4, 3, 2]);

console.log(...[...a.union(b)]);                // [1, 2, 3, 4, 5]
console.log(...[...a.intersection(b)]);         // [2, 3, 4]
console.log(...[...a.difference(b)]);           // [1]
console.log(...[...b.difference(a)]);           // [5]
console.log(...[...a.symmetricDifference(b)]);  // [1, 5]

const
  c = new Set(['A', 'B', 'C', 'D', 'E']),
  d = new Set(['B', 'D']);
  
console.log(d.isSubsetOf(c));                   // true
console.log(c.isSupersetOf(d));                 // true

const
  e = new Set(['A', 'B', 'C']),
  f = new Set(['X', 'Y', 'Z']);
  
console.log(e.isDisjointFrom(f));               // true

As of June 11, 2024, all major browsers support these methods:

Type     Name                  Version  Date
Desktop  Chrome                122      2024-02-20
Desktop  Edge                  122      2024-02-23
Desktop  Firefox               127      2024-06-11
Desktop  Opera                 108      2024-03-05
Desktop  Safari                17       2023-09-18
Mobile   Chrome Android        122      2024-02-20
Mobile   Firefox for Android   127      2024-06-11
Mobile   Opera Android         81       2024-03-14
Mobile   Safari on iOS         17       2023-09-18
Mobile   Samsung Internet      -        -
Mobile   WebView Android       122      2024-02-20
Server   Deno                  1.42     2024-03-28
Server   Node.js               22.0.0   2024-04-25
Other    TypeScript            5.5      2024-06-20

Original response

The functions below are ports of the methods found in Python's set class and follow the TC39 Set methods proposal.

const
  union = (a, b) => new Set([...a, ...b]),
  intersection = (a, b) => new Set([...a].filter(x => b.has(x))),
  difference = (a, b) => new Set([...a].filter(x => !b.has(x))),
  symmetricDifference = (a, b) => union(difference(a, b), difference(b, a)),
  isSubsetOf = (a, b) => [...a].every(x => b.has(x)),    // every element of a is in b
  isSupersetOf = (a, b) => [...b].every(x => a.has(x)),  // every element of b is in a
  isDisjointFrom = (a, b) => !intersection(a, b).size;

const
  a = new Set([1, 2, 3, 4]),
  b = new Set([5, 4, 3, 2]);

console.log(...union(a, b));                // [1, 2, 3, 4, 5]
console.log(...intersection(a, b));         // [2, 3, 4]
console.log(...difference(a, b));           // [1]
console.log(...difference(b, a));           // [5]
console.log(...symmetricDifference(a, b));  // [1, 5]

const
  c = new Set(['A', 'B', 'C', 'D', 'E']),
  d = new Set(['B', 'D']);
  
console.log(isSubsetOf(d, c));              // true
console.log(isSupersetOf(c, d));            // true

const
  e = new Set(['A', 'B', 'C']),
  f = new Set(['X', 'Y', 'Z']);
  
console.log(isDisjointFrom(e, f));          // true

妖妓 2024-08-18 04:12:39

Incorporating the idea from Christoph and assuming a couple of non-standard iteration methods on arrays and objects/hashes (each and friends), we can get set difference, union and intersection in linear time in about 20 lines total:

var setOPs = {
  minusAB : function (a, b) {
    var h = {};
    b.each(function (v) { h[v] = true; });
    return a.filter(function (v) { return !h.hasOwnProperty(v); });
  },
  unionAB : function (a, b) {
    var h = {}, f = function (v) { h[v] = true; };
    a.each(f);
    b.each(f);
    return myUtils.keys(h);
  },
  intersectAB : function (a, b) {
    var h = {};
    a.each(function (v) { h[v] = 1; });
    b.each(function (v) { h[v] = (h[v] || 0) + 1; });
    var fnSel = function (v, count) { return count > 1; };
    var fnVal = function (v, c) { return v; };
    return myUtils.select(h, fnSel, fnVal);
  }
};

This assumes that each and filter are defined for arrays, and that we have two utility methods:

  • myUtils.keys(hash): returns an array with the keys of the hash

  • myUtils.select(hash, fnSelector, fnEvaluator): returns an array with the results of calling fnEvaluator on the key/value pairs for which fnSelector returns true.

The select() is loosely inspired by Common Lisp, and is merely filter() and map() rolled into one. (It would be better to have them defined on Object.prototype, but doing so wreaks havoc with jQuery, so I settled for static utility methods.)

Performance: Testing with

var a = [], b = [];
for (var i = 100000; i--; ) {
  if (i % 2 !== 0) a.push(i);
  if (i % 3 !== 0) b.push(i);
}

gives two sets with 50,000 and 66,666 elements. With these values A-B takes about 75ms, while union and intersection are about 150ms each. (Mac Safari 4.0, using Javascript Date for timing.)

I think that's decent payoff for 20 lines of code.
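
The two helpers are not shown in the answer. Here is a rough sketch of what they might look like, purely as an assumption about the intended behaviour: `fnSelector` and `fnEvaluator` are assumed to receive `(key, value)`, and the non-standard `each` is assumed to behave like the standard `forEach`.

```javascript
// Hypothetical implementations; the original answer does not include them.
const myUtils = {
  // Own enumerable keys of the hash, as strings.
  keys(hash) {
    return Object.keys(hash);
  },
  // filter() and map() in one pass over the key/value pairs.
  select(hash, fnSelector, fnEvaluator) {
    const out = [];
    for (const key of Object.keys(hash)) {
      if (fnSelector(key, hash[key])) {
        out.push(fnEvaluator(key, hash[key]));
      }
    }
    return out;
  }
};

console.log(myUtils.select({ a: 1, b: 2 }, (k, count) => count > 1, k => k)); // [ 'b' ]
```

Because object keys are strings, union and intersection built on these helpers would return numeric elements as strings.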

我家小可爱 2024-08-18 04:12:39

As for the fastest way, this isn't so elegant, but I've run some tests to be sure. Loading one array as an object makes large inputs far faster to process:

var t, a, b, c, objA;

    // Fill some arrays to compare
a = Array(30000).fill(0).map(function(v,i) {
    return i.toFixed();
});
b = Array(20000).fill(0).map(function(v,i) {
    return (i*2).toFixed();
});

    // Simple indexOf inside filter
t = Date.now();
c = b.filter(function(v) { return a.indexOf(v) < 0; });
console.log('completed indexOf in %j ms with result %j length', Date.now() - t, c.length);

    // Load `a` as Object `A` first to avoid indexOf in filter
t = Date.now();
objA = {};
a.forEach(function(v) { objA[v] = true; });
c = b.filter(function(v) { return !objA[v]; });
console.log('completed Object in %j ms with result %j length', Date.now() - t, c.length);

Results:

completed indexOf in 1219 ms with result 5000 length
completed Object in 8 ms with result 5000 length

However, this works with strings only. If you plan to compare numbered sets you'll want to map results with parseFloat.
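
The string-only limitation comes from object keys always being strings. A `Set` keeps the original value types, so the same idea works for numbers without mapping back through `parseFloat`. A small sketch follows (variable names are illustrative, and no timings are claimed here):

```javascript
// Same shape of data as the benchmark above, but with numbers instead of strings.
const a = Array.from({ length: 30000 }, (_, i) => i);
const b = Array.from({ length: 20000 }, (_, i) => i * 2);

const setA = new Set(a);                 // values keep their number type
const c = b.filter(v => !setA.has(v));   // elements of b that are not in a

console.log(c.length, typeof c[0]); // 5000 'number'
```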

oО清风挽发oО 2024-08-18 04:12:39

This works, but I think another one is much shorter, and more elegant too

A = [1, 'a', 'b', 12];
B = ['a', 3, 4, 'b'];

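diff_set = {
    ar : [],      // the set to subtract
    diff : [],    // collected result
    // store the array on the object itself instead of an implicit global
    remove_set : function(a) { this.ar = a; return this; },
    remove: function (el) {
        if(this.ar.indexOf(el)<0) this.diff.push(el);
    }
}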

A.forEach(diff_set.remove_set(B).remove,diff_set);
C = diff_set.diff;

你げ笑在眉眼 2024-08-18 04:12:39

Use Set.prototype.difference():

A.difference(B)

In theory, the time complexity should be Θ(n), where n is the number of elements in B.

You can also polyfill this into older environments with core-js:

import "core-js"
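
Since `difference` is a method on `Set`, arrays have to be wrapped first, and older runtimes may not have it. Here is a sketch with a feature check and a filter-based fallback; the wrapper name `differenceOfArrays` is hypothetical.

```javascript
// Hypothetical wrapper: array in, array out, native method where available.
function differenceOfArrays(A, B) {
  const setA = new Set(A);
  const setB = new Set(B);
  if (typeof setA.difference === "function") {
    return [...setA.difference(setB)];          // native path
  }
  return [...setA].filter(x => !setB.has(x));   // fallback for older runtimes
}

console.log(differenceOfArrays([1, 2, 3, 4], [1, 3, 4, 7])); // [ 2 ]
```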

醉南桥 2024-08-18 04:12:39

The answer provided by @koblas is good, but it returns the differing items from both arrays mixed together. For my use case I want the difference in each direction as separate output arrays: the new items in arr_j, and the items in arr_i that are not in arr_j. With a slight modification (in ES6), these are the 3 main ways provided to do this:

var arr_i = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"];
var arr_j = ["a", "c", "d", "f", "g", "h", "j", "k", "l", "n"];

The result should be the new items in arr_j, ['k', 'l', 'n'], as well as the items in arr_i that are not in arr_j, ['b', 'e', 'i'].

// Convert to Set
var set_i = new Set(arr_i);
var set_j = new Set(arr_j);

const changes = (arr1, arr2) => {
  // Using Array method
  let turn_on = arr2.filter((x) => !arr1.includes(x));
  let turn_off = arr1.filter((x) => !arr2.includes(x));
  return { turn_on, turn_off };
};

const setChanges = (set1, set2) => {
  // Using Set method
  let turn_on = new Set([...set2].filter((x) => !set1.has(x)));
  let turn_off = new Set([...set1].filter((x) => !set2.has(x)));
  return { turn_on, turn_off };
};

function* setMinus(setA, setB) {
  // Using Set method with generator by @koblas
  for (const v of setB.values()) {
    // .delete returns true if value was already in Set; otherwise false.
    if (!setA.delete(v)) {
      yield v;
    }
  }
}

const changesGenerator = (set1, set2) => {
  let turn_off = Array.from(setMinus(set2, set1));
  let turn_on = Array.from(setMinus(set1, set2));
  return { turn_on, turn_off };
};

All three methods return the same elements (setChanges returns them as Sets rather than arrays):

{ turn_on: [ 'k', 'l', 'n' ], turn_off: [ 'b', 'e', 'i' ] }

Timing these on random arrays of 5000 numbers drawn from the range [0, 10000):

let arr_i = Array.from({ length: 5000 }, () =>
  Math.floor(Math.random() * 10000)
);
let arr_j = Array.from({ length: 5000 }, () =>
  Math.floor(Math.random() * 10000)
);

var set_i = new Set(arr_i);
var set_j = new Set(arr_j);

console.time("Array method");
changes(arr_i, arr_j);
console.timeEnd("Array method");

console.time("Set method");
setChanges(set_i, set_j);
console.timeEnd("Set method");

console.time("Generator method");
changesGenerator(set_i, set_j);
console.timeEnd("Generator method");

Returns:

Array method: 36.894ms
Set method: 1.14ms
Generator method: 2.155ms

So yeah, just use:

let set1_minus_set2 = new Set([...set1].filter((x) => !set2.has(x)));
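
One thing to keep in mind: because the generator variant relies on `delete`, it removes the shared elements from the set passed as its first argument. If the inputs need to stay intact, both directions can be computed without mutation. Here is a sketch; the function name `diffBothWays` is made up, while `turn_on` and `turn_off` follow the naming used above.

```javascript
// Hypothetical helper: both directions of the difference, inputs left untouched.
function diffBothWays(oldItems, newItems) {
  const oldSet = new Set(oldItems);
  const newSet = new Set(newItems);
  return {
    turn_on: [...newSet].filter(x => !oldSet.has(x)),   // only in newItems
    turn_off: [...oldSet].filter(x => !newSet.has(x)),  // only in oldItems
  };
}

const arr_i = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"];
const arr_j = ["a", "c", "d", "f", "g", "h", "j", "k", "l", "n"];
console.log(diffBothWays(arr_i, arr_j));
// { turn_on: [ 'k', 'l', 'n' ], turn_off: [ 'b', 'e', 'i' ] }
```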