How do I efficiently hash a 2D array (to be stored in a HashSet)?

Posted on 2024-09-29 02:20:32

I've written a class called PuzzleBoard that represents an N x N board. I will be keeping several PuzzleBoard objects in a HashSet, so I have to override the 'int hashCode()' method.

Below are the fields of my class:

 private int N;
 private int[][] puzzle;
 private int blankCellX;
 private int blankCellY;
 private int cost;

What Eclipse automatically generated for me was:

 public int hashCode() {
     final int prime = 31;
     int result = 1;
     result = prime * result + N;
     result = prime * result + blankCellX;
     result = prime * result + blankCellY;
     result = prime * result + cost;
     result = prime * result + Arrays.hashCode(puzzle);
     return result;
 }

Thinking that this method doesn't take into account the contents of the 2-d array, I changed it into this:

 public int hashCode() {
     final int prime = 31;
     int result = 1;
     result = prime * result + N;
     result = prime * result + blankCellX;
     result = prime * result + blankCellY;
     result = prime * result + cost;
     // Fold in the contents of each row rather than the row's identity hash.
     for (int i = 0; i < N; ++i) {
         result = prime * result + Arrays.hashCode(puzzle[i]);
     }
     return result;
 }

However, the problem with this method is that it takes too long to complete: O(N^2).
Furthermore, the 'result' variable is very likely to overflow.

Now, my question is: how do I write an efficient hash method that doesn't take too long to complete? Moreover, inserting or searching for an object in the HashSet should be efficient (near constant time).

In the worst case, N will be 10 and the HashSet will contain ~1000 PuzzleBoards.

Why am I doing all this?
I'm implementing a solution for the N-Puzzle problem using the A* algorithm. At some phase of the algorithm, given the current node (a configuration of the board), I move the blank cell up, down, right or left to generate new child nodes. Because of this, puzzle configurations usually differ by only 1 or 2 cells. I'm storing all the explored nodes in a HashSet.

Thanks in advance =)

揪着可爱 2024-10-06 02:20:32

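A hash code doesn't need to be unique, although it is better if it is. Since you will have a relatively small number of items in the HashSet (~1000), you can pick a small amount of suitable data to hash together. For example, maybe you only need the first row of the 'puzzle' table, or perhaps the 'cost' variable differs enough between instances that you can use it as a good source of variation.

It doesn't matter whether the result overflows: all you want is for different objects to return different hash codes where possible. The actual value of the hash is unimportant.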
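A minimal sketch of the "hash only part of the data" idea, assuming a hypothetical `PartialHashBoard` class that models only the grid (it is not the asker's actual PuzzleBoard). `hashCode()` looks at the first row only, while `equals()` still compares every cell, so equal boards always share a hash code.

```java
import java.util.Arrays;

// Hypothetical stand-in for PuzzleBoard; assumes a non-empty grid.
final class PartialHashBoard {
    private final int[][] puzzle;

    PartialHashBoard(int[][] puzzle) {
        this.puzzle = puzzle;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof PartialHashBoard)) return false;
        // equals must still compare the whole grid.
        return Arrays.deepEquals(puzzle, ((PartialHashBoard) o).puzzle);
    }

    @Override
    public int hashCode() {
        // Cheap O(N) hash: only the first row contributes.
        // Boards that share a first row collide and are told apart by equals().
        return Arrays.hashCode(puzzle[0]);
    }
}
```

The trade-off is more collisions: in an N-Puzzle search, neighbouring states often share their first row, so this particular choice may cluster many boards into the same bucket.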

纵性 2024-10-06 02:20:32

this method doesn't take into account the contents of the 2-d array

You could also use java.util.Arrays#deepHashCode().
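As a small illustration of the difference, here is a sketch using a hypothetical `DeepHashDemo` class. `Arrays.hashCode` on an `int[][]` hashes each row through its identity-based `Object.hashCode()`, whereas `Arrays.deepHashCode` descends into the rows and hashes the ints themselves.

```java
import java.util.Arrays;

class DeepHashDemo {
    // Content-based hash of a 2D grid.
    static int contentHash(int[][] grid) {
        return Arrays.deepHashCode(grid);
    }

    public static void main(String[] args) {
        int[][] a = {{1, 2}, {3, 0}};
        int[][] b = {{1, 2}, {3, 0}}; // same contents, different array objects

        // Rows are hashed by identity here, so equal contents usually
        // produce different values.
        System.out.println(Arrays.hashCode(a) == Arrays.hashCode(b));

        // Rows are hashed by content here.
        System.out.println(contentHash(a) == contentHash(b)); // prints "true"
    }
}
```

For an `int[][]`, `deepHashCode` combines the per-row `Arrays.hashCode` values with the same 31-based step as the loop in the question, so it does the same O(N^2) work in a single call.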

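However, the problem with this method is that it takes too long to complete: O(N^2)

You can't go any faster if you want to hash all N^2 ints in it. And if N is at most 10, what's with the Big-O notation anyway? O(N^2) does not mean slow. I don't think your hashCode method is inefficient; the inefficiency, or some O(N^2) behaviour, is most likely somewhere else. Still, if this method is called often (and PuzzleBoard is immutable), you might want to cache the hashCode value.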
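One way to cache the value, sketched with a hypothetical immutable `CachedHashBoard` class (an assumption, since the asker's PuzzleBoard may be mutable): the hash is computed once in the constructor and returned in O(1) afterwards.

```java
import java.util.Arrays;

// Hypothetical immutable board that computes its hash exactly once.
final class CachedHashBoard {
    private final int[][] puzzle;
    private final int hash;

    CachedHashBoard(int[][] source) {
        // Defensive copy, so later changes to the caller's array cannot
        // invalidate the cached hash.
        puzzle = new int[source.length][];
        for (int i = 0; i < source.length; i++) {
            puzzle[i] = source[i].clone();
        }
        hash = Arrays.deepHashCode(puzzle);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CachedHashBoard)) return false;
        CachedHashBoard other = (CachedHashBoard) o;
        // Cheap rejection first; full comparison only when the hashes match.
        return hash == other.hash && Arrays.deepEquals(puzzle, other.puzzle);
    }

    @Override
    public int hashCode() {
        return hash; // no recomputation on repeated calls
    }
}
```

Caching is only safe while the grid never changes after construction; a board that is mutated in place would need to recompute or invalidate the stored value.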

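the 'result' variable is very likely to overflow.

No problem! Integer overflow is well defined in Java: int arithmetic simply wraps around.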
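A short sketch of that wrap-around behaviour, using a hypothetical `OverflowDemo` class whose `step` method mirrors one accumulation step of the hash in the question:

```java
class OverflowDemo {
    // One accumulation step of the hash from the question.
    static int step(int result, int value) {
        return 31 * result + value; // may wrap around; no exception is thrown
    }

    public static void main(String[] args) {
        // int arithmetic is two's-complement and wraps silently.
        System.out.println(Integer.MAX_VALUE + 1 == Integer.MIN_VALUE); // prints "true"

        // A wrapped result is still deterministic, which is all a hash needs.
        System.out.println(step(Integer.MAX_VALUE, 0) == step(Integer.MAX_VALUE, 0)); // prints "true"
    }
}
```

Because the wrapped value is the same every time for the same input, it remains a valid hash code.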

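Moreover; inserting or searching an object in the HashSet should be efficient (near constant time).

Inserting is most likely only amortized constant time. When the number of elements exceeds what the HashSet's backing table allows under its load factor, a larger table is allocated and all elements are redistributed into it. Try setting an initialCapacity for the HashSet?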
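A sketch of presizing, assuming the default load factor of 0.75 and roughly 1000 expected boards. `PresizedSetDemo`, `capacityFor` and `newPresizedSet` are hypothetical helper names, not part of the JDK.

```java
import java.util.HashSet;
import java.util.Set;

class PresizedSetDemo {
    // Picks an initial capacity large enough that `expected` elements stay
    // under HashSet's default load factor of 0.75, so no resize is needed.
    static int capacityFor(int expected) {
        return (int) (expected / 0.75f) + 1;
    }

    static <T> Set<T> newPresizedSet(int expected) {
        return new HashSet<>(capacityFor(expected));
    }

    public static void main(String[] args) {
        Set<Integer> explored = newPresizedSet(1000);
        for (int i = 0; i < 1000; i++) {
            explored.add(i); // no table growth expected along the way
        }
        System.out.println(explored.size()); // prints "1000"
    }
}
```

For a set of only about 1000 elements the saving is likely to be small, so this is worth measuring before relying on it.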

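result = prime * result + cost;

Are you sure you want the cost (I assume it's the depth) to be included in equals and hashCode? Two configurations are the same no matter how many steps it took to get there, right?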
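A sketch of what leaving the cost out could look like, using a hypothetical `ConfigBoard` class. Identity is defined by the grid alone, so the same configuration reached at a different cost is recognised as already explored.

```java
import java.util.Arrays;

// Hypothetical board whose identity is its configuration only.
final class ConfigBoard {
    private final int[][] puzzle;
    private final int cost; // A* bookkeeping, deliberately not part of identity

    ConfigBoard(int[][] puzzle, int cost) {
        this.puzzle = puzzle;
        this.cost = cost;
    }

    int cost() {
        return cost;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ConfigBoard)) return false;
        // Only the grid is compared; cost is ignored.
        return Arrays.deepEquals(puzzle, ((ConfigBoard) o).puzzle);
    }

    @Override
    public int hashCode() {
        // Must use the same fields as equals, so cost is ignored here too.
        return Arrays.deepHashCode(puzzle);
    }
}
```

Under the assumption that N and the blank cell's position can be derived from the grid itself, those fields would add no information to the hash either.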

~1000 PuzzleBoards

If I remember correctly, the last time I solved this puzzle I had far more than 1000 configurations.
