Java:两个列表之间的差异

发布于 2024-11-13 09:30:40 字数 1301 浏览 5 评论 0原文

我公司的猫群应用程序跟踪一群猫。它需要定期比较 previousOrdercurrentOrder(每个都是 ArrayList),并通知 cat-wranglers 任何更改。

每只猫都是独一无二的,并且只能在每个列表中出现一次(或根本不出现)。大多数情况下,previousOrdercurrentOrder 列表具有相同的内容和相同的顺序,但可能会发生以下任何情况(从更频繁到不太频繁):

  1. 猫的顺序完全打乱
  2. 猫在列表中单独向上或向下移动
  3. 新猫在车队中的特定点加入
  4. 猫离开车队

这看起来像 编辑距离问题 对我来说。理想情况下,我正在寻找一种算法来确定使 previousOrder 匹配 currentOrder 所需的步骤:

  • Fluffy 移动到位置 12
  • 在位置 37 插入 Snuggles
  • 删除 Mr. Chubbs
  • 等。

该算法还应该识别场景#1,在这种情况下,新订单将被完整传达。

最好的方法是什么?

这个帖子该帖子提出了相似的观点问题,但它们都处理排序列表。我的是有序,但未排序。)

编辑

< strong>Levenshtein 算法是一个很好的建议,但我担心创建矩阵的时间/空间要求。我的主要目标是尽快确定并传达变更。这比查找添加内容并发送“这是新猫,这是当前订单”之类的消息更快。

My company's cat-herding application tracks a convoy of cats. Periodically, it needs to compare previousOrder to currentOrder (each is an ArrayList<Cat>) and notify the cat-wranglers of any changes.

Each cat is unique and can only appear once in each list (or not at all). Most of the time, the previousOrder and currentOrder lists have the same contents, in the same order, but any of the following can happen (from more frequent to less frequent):

  1. The order of cats is scrambled completely
  2. Cats individually move up or down in the list
  3. New cats join, at a specific point in the convoy
  4. Cats leave the convoy

This appears like an edit distance problem to me. Ideally, I am looking for an algorithm that determines the steps required to make previousOrder match currentOrder:

  • MOVE Fluffy to position 12
  • INSERT Snuggles at position 37
  • DELETE Mr. Chubbs
  • etc.

The algorithm should also recognize scenario #1, in which case the new order is communicated in its entirety.

What's the best approach for this?

(This post and that post pose similar questions, but they are both dealing with sorted lists. Mine are ordered, but unsorted.)

EDIT

The Levenshtein algorithm is a great suggestion, but I'm concerned about the time/space requirement of creating a matrix. My main goal is to determine and communicate the changes as quickly as possible. Something that is faster than finding the additions and sending message along the lines of "Here are the new cats, and here is the current order."

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

你怎么敢 2024-11-20 09:30:40

这是我用来合并两个列表(oldnew)的算法。它不是最优雅或最高效的,但对于我使用它的数据来说似乎可以正常工作。

new 是最新更新的数据列表,而 old 是需要转换为 new 的过时列表。该算法对旧列表执行操作 - 相应地删除、移动和插入项目。

for(item in old)
    if (new does not contain item)
        remove item from old

for(item in new)
    if (item exists in old)
        if (position(item, old) == position(item, new))
            continue // next loop iteration
        else
            move old item to position(item, new)
    else
        insert new item into old at position(item, new)

删除都是预先完成的,以使项目在第二个循环中的位置更可预测。

这背后的驱动力是将来自服务器的数据列表与浏览器 DOM 中的 行同步(使用 JavaScript)。之所以需要它,是因为我们不想在数据发生变化时重新绘制整个表;列表之间的差异可能很小,仅影响一两行。它可能不是您正在寻找的数据算法。如果没有,请告诉我,我会删除它。

可能可以为此进行一些优化。但对于我和我正在使用的数据来说,它的性能和可预测性足够。

Here's an algorithm I put together to merge two lists, old and new. It's not the most elegant or efficient, but it seems to work okay for the data I'm using it for.

new is the most updated list of data, and old is the out-of-date list that needs to get transformed into new. The algorithm performs its operations on the old list - removing, moving, and inserting items accordingly.

for(item in old)
    if (new does not contain item)
        remove item from old

for(item in new)
    if (item exists in old)
        if (position(item, old) == position(item, new))
            continue // next loop iteration
        else
            move old item to position(item, new)
    else
        insert new item into old at position(item, new)

The deletions are all done up front to make the positions of the items more predictable in the second loop.

The driving force behind this was to sync a list of data from the server with <table> rows in a browser DOM (using javascript). It was needed because we didn't want to redraw the entire table whenever the data changed; the differences between the lists were likely to be small and only affect one or two rows. It may not be the algorithm you're looking for for your data. If not, let me know and I'll delete this.

There are probably some optimizations that could be made for this. But it is performant and predictable enough for me and the data I'm working with.

美人如玉 2024-11-20 09:30:40

编辑距离度量。

http://www.levenshtein.net/

Levenshtein distance metric.

http://www.levenshtein.net/

你的他你的她 2024-11-20 09:30:40

解决这个问题的一个有效方法是使用动态规划。维基百科有一个密切相关问题的伪代码:计算 Levenshtein 距离

跟踪实际操作并纳入“扰乱”操作应该不会太困难。

An efficient way to solve this is by using dynamic programming. Wikipedia has pseudo-code for a closely related problem: Computing Levenshtein distance.

Keeping track of the actual operations and incorporating the "scramble" operation shouldn't be too difficult.

紫南 2024-11-20 09:30:40

我知道提问者正在寻求 Java 解决方案,但我在寻求用 C# 实现的算法时遇到了这个问题。

这是我的解决方案,它生成简单 IListDifference 值的枚举:ItemAddedDifference、ItemRemovedDifference 或 ItemMovedDifference。

它使用源列表的工作副本逐项建立将其转换为匹配目标列表所需的修改。

public class ListComparer<T>
    {
        public IEnumerable<IListDifference> Compare(IEnumerable<T> source, IEnumerable<T> target)
        {
            var copy = new List<T>(source);

            for (var i = 0; i < target.Count(); i++)
            {
                var currentItemsMatch = false;

                while (!currentItemsMatch)
                {
                    if (i < copy.Count && copy[i].Equals(target.ElementAt(i)))
                    {
                        currentItemsMatch = true;
                    }
                    else if (i == copy.Count())
                    {
                        // the target item's index is at the end of the source list
                        copy.Add(target.ElementAt(i));
                        yield return new ItemAddedDifference { Index = i };
                    }
                    else if (!target.Skip(i).Contains(copy[i]))
                    {
                        // the source item cannot be found in the remainder of the target, therefore
                        // the item in the source has been removed 
                        copy.RemoveAt(i);
                        yield return new ItemRemovedDifference { Index = i };
                    }
                    else if (!copy.Skip(i).Contains(target.ElementAt(i)))
                    {
                        // the target item cannot be found in the remainder of the source, therefore
                        // the item in the source has been displaced by a new item
                        copy.Insert(i, target.ElementAt(i));
                        yield return new ItemAddedDifference { Index = i };
                    }
                    else
                    {
                        // the item in the source has been displaced by an existing item
                        var sourceIndex = i + copy.Skip(i).IndexOf(target.ElementAt(i));
                        copy.Insert(i, copy.ElementAt(sourceIndex));
                        copy.RemoveAt(sourceIndex + 1);
                        yield return new ItemMovedDifference { FromIndex = sourceIndex, ToIndex = i };
                    }
                }
            }

            // Remove anything remaining in the source list
            for (var i = target.Count(); i < copy.Count; i++)
            {
                copy.RemoveAt(i);
                yield return new ItemRemovedDifference { Index = i };
            }
        }
    }

只是注意到这使用了 IEnumerable 上的自定义扩展方法 - 'IndexOf':

public static class EnumerableExtensions
{
    public static int IndexOf<T>(this IEnumerable<T> list, T item)
    {
        for (var i = 0; i < list.Count(); i++)
        {
            if (list.ElementAt(i).Equals(item))
            {
                return i;
            }
        }

        return -1;
    }
}

I know the questioner was seeking a Java solution, but I came across this question whilst seeking an algorithm to implement in C#.

Here's my solution, which generates an enumeration of simple IListDifference values: either ItemAddedDifference, ItemRemovedDifference or ItemMovedDifference.

It uses a working copy of the source list to establish, item by item, what modifications are necessary to transform it to match the target list.

public class ListComparer<T>
    {
        public IEnumerable<IListDifference> Compare(IEnumerable<T> source, IEnumerable<T> target)
        {
            var copy = new List<T>(source);

            for (var i = 0; i < target.Count(); i++)
            {
                var currentItemsMatch = false;

                while (!currentItemsMatch)
                {
                    if (i < copy.Count && copy[i].Equals(target.ElementAt(i)))
                    {
                        currentItemsMatch = true;
                    }
                    else if (i == copy.Count())
                    {
                        // the target item's index is at the end of the source list
                        copy.Add(target.ElementAt(i));
                        yield return new ItemAddedDifference { Index = i };
                    }
                    else if (!target.Skip(i).Contains(copy[i]))
                    {
                        // the source item cannot be found in the remainder of the target, therefore
                        // the item in the source has been removed 
                        copy.RemoveAt(i);
                        yield return new ItemRemovedDifference { Index = i };
                    }
                    else if (!copy.Skip(i).Contains(target.ElementAt(i)))
                    {
                        // the target item cannot be found in the remainder of the source, therefore
                        // the item in the source has been displaced by a new item
                        copy.Insert(i, target.ElementAt(i));
                        yield return new ItemAddedDifference { Index = i };
                    }
                    else
                    {
                        // the item in the source has been displaced by an existing item
                        var sourceIndex = i + copy.Skip(i).IndexOf(target.ElementAt(i));
                        copy.Insert(i, copy.ElementAt(sourceIndex));
                        copy.RemoveAt(sourceIndex + 1);
                        yield return new ItemMovedDifference { FromIndex = sourceIndex, ToIndex = i };
                    }
                }
            }

            // Remove anything remaining in the source list
            for (var i = target.Count(); i < copy.Count; i++)
            {
                copy.RemoveAt(i);
                yield return new ItemRemovedDifference { Index = i };
            }
        }
    }

Just noticed this makes use of a custom extension method on IEnumerable - 'IndexOf':

public static class EnumerableExtensions
{
    public static int IndexOf<T>(this IEnumerable<T> list, T item)
    {
        for (var i = 0; i < list.Count(); i++)
        {
            if (list.ElementAt(i).Equals(item))
            {
                return i;
            }
        }

        return -1;
    }
}
§对你不离不弃 2024-11-20 09:30:40

我最近不得不这样做,除非项目可能存在多次。这件事很复杂,但我能够使用前瞻计数器和其他一些疯狂的东西来做到这一点。它看起来很像 Rob 的解决方案,所以感谢他让我开始!

首先,假设我们想要返回将第一个列表转换为第二个列表的操作列表:

public interface Operation {
    /**
     * Apply the operation to the given list.
     */
    void apply(List<String> keys);
}

并且我们有一些辅助方法来构造操作。实际上,您不需要“移动”操作,甚至还可以进行“交换”(或相反),但这就是我所采用的:

Operation delete(int index) { ... }
Operation insert(int index, String key) { ... }
Operation move(int from, int to) { ... }

现在我们将定义一个特殊的类来保存我们的前瞻counts:

class Counter {
    private Map<String, Integer> counts;

    Counter(List<String> keys) {
        counts = new HashMap<>();

        for (String key : keys) {
            if (counts.containsKey(key)) {
                counts.put(key, counts.get(key) + 1);
            } else {
                counts.put(key, 1);
            }
        }
    }

    public int get(String key) {
        if (!counts.containsKey(key)) {
            return 0;
        }

        return counts.get(key);
    }

    public void dec(String key) {
        counts.put(key, counts.get(key) - 1);
    }
}

还有一个帮助方法来获取列表中下一个键的索引:

int next(List<String> list, int start, String key) {
    for (int i = start; i < list.size(); i++) {
        if (list.get(i).equals(key)) {
            return i;
        }
    }

    throw new RuntimeException("next index not found for " + key);
}

现在我们准备好进行转换:

List<Operation> transform(List<String> from, List<String> to) {
    List<Operation> operations = new ArrayList<>();

    // make our own copy of the first, that we can mutate
    from = new ArrayList<>(from);

    // maintain lookahead counts
    Counter fromCounts = new Counter(from);
    Counter toCounts = new Counter(to);

    // do all our deletes first
    for (int i = 0; i < from.size(); i++) {
        String current = from.get(i);

        if (fromCounts.get(current) > toCounts.get(current)) {
            Operation op = delete(i);
            operations.add(op);
            op.apply(from);
            fromCounts.dec(current);
            i--;
        }
    }

    // then one more iteration for the inserts and moves
    for (int i = 0; i < to.size(); i++) {
        String current = to.get(i);

        if (from.size() > i && from.get(i).equals(current)) {
            fromCounts.dec(current);
            continue;
        }

        if (fromCounts.get(current) > 0) {
            Operation op = move(next(from, i + 1, current), i);
            operations.add(op);
            op.apply(from);

            fromCounts.dec(current);
        } else {
            Operation op = insert(i, current);
            operations.add(op);
            op.apply(from);
        }
    }

    return operations;
}

让您的头脑变得有点棘手,但基本上您会进行删除,以便您知道每个您正在插入或移动的密钥。然后你再次浏览列表,如果有足够的,你从列表中你还没有看到的部分移动一个,否则插入。当你到达终点时,一切都已排好。

I recently had to do this, except items could exist multiple times. This complicated things, but I was able to do it using look-ahead counters and some other craziness. It looks a lot like Rob's solution, so thanks to him for getting me started!

First off, let's assume that we want to return the list of operations that will transform the first list into the second:

public interface Operation {
    /**
     * Apply the operation to the given list.
     */
    void apply(List<String> keys);
}

and we have some helper methods to construct operations. You actually don't need a "move" operation, and you could even have a "swap" as well (or instead), but this is what I went with:

Operation delete(int index) { ... }
Operation insert(int index, String key) { ... }
Operation move(int from, int to) { ... }

Now we'll define a special class to hold our look-ahead counts:

class Counter {
    private Map<String, Integer> counts;

    Counter(List<String> keys) {
        counts = new HashMap<>();

        for (String key : keys) {
            if (counts.containsKey(key)) {
                counts.put(key, counts.get(key) + 1);
            } else {
                counts.put(key, 1);
            }
        }
    }

    public int get(String key) {
        if (!counts.containsKey(key)) {
            return 0;
        }

        return counts.get(key);
    }

    public void dec(String key) {
        counts.put(key, counts.get(key) - 1);
    }
}

And a helper method to get the index of the next key in the list:

int next(List<String> list, int start, String key) {
    for (int i = start; i < list.size(); i++) {
        if (list.get(i).equals(key)) {
            return i;
        }
    }

    throw new RuntimeException("next index not found for " + key);
}

Now we're ready to do the transform:

List<Operation> transform(List<String> from, List<String> to) {
    List<Operation> operations = new ArrayList<>();

    // make our own copy of the first, that we can mutate
    from = new ArrayList<>(from);

    // maintain lookahead counts
    Counter fromCounts = new Counter(from);
    Counter toCounts = new Counter(to);

    // do all our deletes first
    for (int i = 0; i < from.size(); i++) {
        String current = from.get(i);

        if (fromCounts.get(current) > toCounts.get(current)) {
            Operation op = delete(i);
            operations.add(op);
            op.apply(from);
            fromCounts.dec(current);
            i--;
        }
    }

    // then one more iteration for the inserts and moves
    for (int i = 0; i < to.size(); i++) {
        String current = to.get(i);

        if (from.size() > i && from.get(i).equals(current)) {
            fromCounts.dec(current);
            continue;
        }

        if (fromCounts.get(current) > 0) {
            Operation op = move(next(from, i + 1, current), i);
            operations.add(op);
            op.apply(from);

            fromCounts.dec(current);
        } else {
            Operation op = insert(i, current);
            operations.add(op);
            op.apply(from);
        }
    }

    return operations;
}

It's a bit tricky to get your head around, but basically you do the deletes so that you know for every key you are either inserting or moving. Then you run through the list again and if there's enough, you move one from the part of the list your haven't seen yet, otherwise insert. By the time you get to the end, it all lines up.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文