Java: Equalator? (removing duplicates from a collection of objects)

Posted 2024-08-16 11:06:15

I have a bunch of objects of a class Puzzle. I have overridden equals() and hashCode(). When it comes time to present the solutions to the user, I'd like to filter out all the Puzzles that are "similar" (by the standard I have defined), so the user only sees one of each.

Similarity is transitive.

Example:

Result of computations:
A    (similar to D)
B    (similar to C)
C
D

In this case, only one of A or D and one of B or C would be presented to the user, never two similar Puzzles. Two similar puzzles are equally valid. It is only important that they are not both shown to the user.

To accomplish this, I wanted to use an ADT that prohibits duplicates. However, I don't want to change the equals() and hashCode() methods to return a value about similarity instead. Is there some Equalator, like Comparator, that I can use in this case? Or is there another way I should be doing this?

The class I'm working on is a Puzzle that maintains a grid of letters. (Like Scrabble.) If a Puzzle contains the same words, but is in a different orientation, it is considered to be similar. So the following puzzle:

    (2, 2): A
    (2, 1): C
    (2, 0): T

would be similar to:

    (1, 2): A
    (1, 1): C
    (1, 0): T

请持续率性 2024-08-23 11:06:15

Okay, you have a way of measuring similarity between objects. That means they form a metric space.

The question is: is your space also a Euclidean space, like normal three-dimensional space, or the integers, or something like that? If it is, then you could use a binary space partition in however many dimensions you've got.

(The question is, basically: is there a homomorphism between your objects and an n-dimensional real number vector? If so, then you can use techniques for measuring closeness of points in n-dimensional space.)

Now, if it's not a Euclidean space, then you've got a bigger problem. An example of a non-Euclidean space that programmers might be most familiar with would be the Levenshtein distance between two strings.

If your problem is similar to seeing how similar a string is to a list of already existing strings, then I don't know of any algorithm that would do that in less than O(n²) time. Maybe there are some out there.

But another important question is: how much time do you have? How many objects? If you have time, or if your data set is small enough that an O(n²) algorithm is practical, then you just have to iterate through your list of objects and check whether the new object's distance to any of them is below a certain threshold. If so, reject it.

Just extend AbstractCollection and override the add method. Use an ArrayList or whatever as the backing store. Your code would look kind of like this:

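import java.util.AbstractCollection;
import java.util.ArrayList;
import java.util.Iterator;

abstract class SimilarityRejector<T> extends AbstractCollection<T> {
    private final ArrayList<T> base = new ArrayList<T>();
    private final double threshold;

    public SimilarityRejector(double threshold) {
        this.threshold = threshold;
    }

    // Distance between two items; smaller means more similar.
    // Left abstract here (an assumption) because it depends on T.
    protected abstract double similarityComparison(T a, T b);

    @Override
    public boolean add(T t) {
        for (T compare : base) {
            // Reject t if it is too close to an item already stored.
            if (similarityComparison(t, compare) < threshold) return false;
        }
        return base.add(t);
    }

    @Override
    public Iterator<T> iterator() {
        return base.iterator();
    }

    @Override
    public int size() {
        return base.size();
    }
}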

etc. Obviously T would need to be a subclass of some class that you can perform a comparison on. If you have a Euclidean metric, then you can use a space partition rather than going through every other item.
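As one concrete instance of the threshold idea, here is a minimal sketch that assumes the items are strings and the measure is Levenshtein distance. The class name `ThresholdFilterDemo`, the method names, and the sample threshold are illustrative assumptions, not part of the answer above.

```java
import java.util.ArrayList;
import java.util.List;

class ThresholdFilterDemo {
    // Classic dynamic-programming Levenshtein distance, using two rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.length()];
    }

    // Keeps an item only if its distance to every kept item is >= threshold.
    // O(n²) comparisons in the worst case.
    static List<String> rejectSimilar(List<String> items, int threshold) {
        List<String> kept = new ArrayList<>();
        for (String candidate : items) {
            boolean tooClose = false;
            for (String existing : kept) {
                if (levenshtein(candidate, existing) < threshold) {
                    tooClose = true;
                    break;
                }
            }
            if (!tooClose) kept.add(candidate);
        }
        return kept;
    }
}
```

With a threshold of 3, "mitten" would be rejected after "kitten" because their distance is 1. A threshold-based measure like this is generally not transitive, so the result can depend on the order of the input.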


清风疏影 2024-08-23 11:06:15

I'd use a wrapper class that overrides equals and hashCode accordingly.

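private static class Wrapper {
    // The wrapped puzzle (one per Wrapper instance).
    public final Puzzle puzzle;

    public Wrapper(Puzzle puzzle) {
        this.puzzle = puzzle;
    }

    @Override
    public boolean equals(Object object) {
        // ... (should be true when the other Wrapper holds a similar puzzle)
    }

    @Override
    public int hashCode() {
        // ... (must be the same for all similar puzzles)
    }
}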

and then you wrap all your puzzles, put them in a map, and get them out again…

import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

public Collection<Collection<Puzzle>> method(Collection<Puzzle> puzzles) {
    // Each map value collects the puzzles that are similar to one another.
    Map<Wrapper, Collection<Puzzle>> map = new HashMap<Wrapper, Collection<Puzzle>>();
    for (Puzzle each : puzzles) {
        Wrapper wrapper = new Wrapper(each);
        Collection<Puzzle> coll = map.get(wrapper);
        if (coll == null) map.put(wrapper, coll = new ArrayList<Puzzle>());
        coll.add(each);
    }
    return map.values();
}


梦纸 2024-08-23 11:06:15

Create a TreeSet with a Comparator.
Add all the elements to the set.
All duplicates are removed.
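A minimal sketch of these three steps, under stated assumptions: strings stand in for puzzles, and two strings count as "similar" when they contain the same letters regardless of order and case. The class `TreeSetDedupDemo` and the helper `canonical` are hypothetical names. The comparator has to define a consistent total order and return 0 exactly for similar items, otherwise the TreeSet will not behave predictably.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class TreeSetDedupDemo {
    // Hypothetical similarity rule: same letters, ignoring order and case.
    static String canonical(String s) {
        char[] chars = s.toLowerCase().toCharArray();
        Arrays.sort(chars);
        return new String(chars);
    }

    // compare() returns 0 exactly when the canonical forms are equal,
    // so the TreeSet treats similar items as duplicates.
    static final Comparator<String> BY_SIMILARITY =
            Comparator.comparing(TreeSetDedupDemo::canonical);

    static List<String> distinct(List<String> items) {
        Set<String> set = new TreeSet<>(BY_SIMILARITY);
        set.addAll(items); // a later similar item is ignored; the first one stays
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        // prints [CAT, bird, dog]
        System.out.println(distinct(Arrays.asList("CAT", "act", "dog", "god", "bird")));
    }
}
```

The result comes back in comparator order rather than input order, and equals() and hashCode() on the elements are not consulted at all.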


难理解 2024-08-23 11:06:15

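In general, "similarity" is not a transitive relation. So the first step is to think of this in terms of equivalence rather than similarity. Equivalence is reflexive, symmetric, and transitive.

The simple approach here is to define a puzzle wrapper whose equals() and hashCode() methods are implemented according to the equivalence relation in question.

Once you have that, put the wrapped objects into a java.util.Set, which filters out the duplicates.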
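A minimal sketch of that approach, assuming that two grids are equivalent when they hold the same letters at the same relative positions (as in the question's example, where one puzzle is a shifted copy of the other). `Grid` is a hypothetical stand-in for the real Puzzle class, and `SimilarityKey` is an illustrative name for the wrapper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class EquivalenceWrapperDemo {
    // Hypothetical stand-in for Puzzle: letters keyed by [x, y] coordinates.
    static final class Grid {
        final Map<List<Integer>, Character> cells = new HashMap<>();
        Grid put(int x, int y, char c) { cells.put(Arrays.asList(x, y), c); return this; }
    }

    // Wrapper whose equals()/hashCode() follow the equivalence relation.
    static final class SimilarityKey {
        final Grid grid;
        private final Map<List<Integer>, Character> normalized = new HashMap<>();

        SimilarityKey(Grid grid) {
            this.grid = grid;
            int minX = Integer.MAX_VALUE, minY = Integer.MAX_VALUE;
            for (List<Integer> p : grid.cells.keySet()) {
                minX = Math.min(minX, p.get(0));
                minY = Math.min(minY, p.get(1));
            }
            // Shift every cell so the smallest x and y become 0.
            for (Map.Entry<List<Integer>, Character> e : grid.cells.entrySet()) {
                List<Integer> p = e.getKey();
                normalized.put(Arrays.asList(p.get(0) - minX, p.get(1) - minY), e.getValue());
            }
        }

        @Override public boolean equals(Object o) {
            return o instanceof SimilarityKey
                    && normalized.equals(((SimilarityKey) o).normalized);
        }

        @Override public int hashCode() { return normalized.hashCode(); }
    }

    // Keeps the first grid of each equivalence class, in input order.
    static List<Grid> distinct(List<Grid> grids) {
        Set<SimilarityKey> seen = new LinkedHashSet<>();
        for (Grid g : grids) seen.add(new SimilarityKey(g));
        List<Grid> result = new ArrayList<>();
        for (SimilarityKey k : seen) result.add(k.grid);
        return result;
    }
}
```

Reducing each object to a canonical form and comparing those forms is one way to guarantee that the relation is reflexive, symmetric, and transitive, and that hashCode() agrees with equals(). The real Puzzle class keeps its own equals() and hashCode() untouched.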


海风掠过北极光 2024-08-23 11:06:15

IMHO, the most elegant way was described by Gili (a TreeSet with a custom Comparator).

But if you'd like to do it yourself, this seems to be the easiest and clearest solution:

import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

/**
 * Returns the distinct values of the input list, dropping duplicates.
 * Takes O(n²) comparisons in the worst case.
 * @param items items to process
 * @param comparator comparator that returns 0 for items considered equal
 * @return new collection with unique values, in their original order
 */
public static <T> Collection<T> distinctItems(List<T> items, Comparator<? super T> comparator) {
    List<T> result = new ArrayList<>();

    for (T item : items) {
        boolean exists = false;
        for (T kept : result) {
            if (comparator.compare(kept, item) == 0) {
                exists = true;
                break;
            }
        }

        if (!exists) {
            result.add(item);
        }
    }

    return result;
}
