Java comparator for byte arrays (lexicographic order)

Posted on 2024-10-19 05:25:56

I have a HashMap with byte[] keys. I'd like to sort it using a TreeMap.

What is the most effective way to implement the comparator for lexicographic order?

Comments (3)

另类 2024-10-26 05:25:56

Using Guava, you can use one of its lexicographical comparators for byte arrays, such as `UnsignedBytes.lexicographicalComparator()`.

The UnsignedBytes comparator appears to have an optimized form based on Unsafe, which it uses when available. Comments in the code indicate that it may be at least twice as fast as a normal Java implementation.

在巴黎塔顶看东京樱花 2024-10-26 05:25:56

Found this nice piece of code in Apache HBase:

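    public int compare(byte[] left, byte[] right) {
        for (int i = 0, j = 0; i < left.length && j < right.length; i++, j++) {
            // Mask with 0xff to read each byte as an unsigned value (0 to 255).
            int a = (left[i] & 0xff);
            int b = (right[j] & 0xff);
            if (a != b) {
                return a - b;
            }
        }
        // All shared positions are equal, so the shorter array sorts first.
        return left.length - right.length;
    }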
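
To tie this back to the question, a comparator like the one above can be handed to the TreeMap constructor. The following is only a minimal sketch: the class name `SortedByteKeys`, the `sorted` helper, and the sample keys are made up for illustration, and the comparison logic is the same unsigned byte-by-byte loop shown above.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class SortedByteKeys {
    // Unsigned lexicographic comparator, same logic as the snippet above.
    public static final Comparator<byte[]> LEXICOGRAPHIC = (left, right) -> {
        int n = Math.min(left.length, right.length);
        for (int i = 0; i < n; i++) {
            int a = left[i] & 0xff;
            int b = right[i] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        return left.length - right.length;
    };

    // Copies a map with byte[] keys into a TreeMap ordered by LEXICOGRAPHIC.
    public static <V> TreeMap<byte[], V> sorted(Map<byte[], V> source) {
        TreeMap<byte[], V> result = new TreeMap<>(LEXICOGRAPHIC);
        result.putAll(source);
        return result;
    }

    public static void main(String[] args) {
        Map<byte[], String> unsorted = new HashMap<>();
        unsorted.put(new byte[] {(byte) 0xFF}, "high");
        unsorted.put(new byte[] {0x00, 0x01}, "low");
        unsorted.put(new byte[] {0x00}, "lowest");
        // Iteration follows the comparator's order: lowest, low, high.
        for (String value : sorted(unsorted).values()) {
            System.out.println(value);
        }
    }
}
```

Note that a HashMap keyed by byte[] uses identity-based hashCode/equals, whereas the TreeMap decides key equality through the comparator, so two arrays with the same contents collapse into one entry after the copy.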

万劫不复 2024-10-26 05:25:56

I'm assuming the problem is just with the "byte vs. byte" comparison. Dealing with the arrays is straightforward, so I won't cover it. With respect to byte vs. byte, my first thought is to do this:

import java.util.Comparator;

public class ByteComparator implements Comparator<Byte> {
  @Override
  public int compare(Byte b1, Byte b2) {
    // Byte.compareTo compares the signed values (-128 to 127).
    return b1.compareTo(b2);
  }
}

But that won't be lexicographic: 0xFF (the signed byte for -1) will be considered smaller than 0x00, when lexicographically it's bigger. I think this should do the trick:

import java.util.Comparator;

public class ByteComparator implements Comparator<Byte> {
  @Override
  public int compare(Byte b1, Byte b2) {
    // convert to unsigned values (0 to 255) before comparing them.
    int i1 = b1 < 0 ? 256 + b1 : b1;
    int i2 = b2 < 0 ? 256 + b2 : b2;
    // negative when b1 sorts first, positive when b2 sorts first.
    return i1 - i2;
  }
}

There is probably something in Apache's commons-lang or commons-math libraries that does this, but I don't know it offhand.
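
As an alternative that needs no third-party library, newer JDKs cover both levels of the problem: `Byte.compareUnsigned` for a single byte and `Arrays.compareUnsigned(byte[], byte[])` for whole arrays. The sketch below assumes Java 9 or later; the class name `UnsignedCompareDemo` and the sample values are made up for illustration and are not part of the answer above.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.TreeMap;

public class UnsignedCompareDemo {
    // Unsigned lexicographic comparison of byte arrays, provided by the JDK since Java 9.
    public static final Comparator<byte[]> JDK_LEXICOGRAPHIC = Arrays::compareUnsigned;

    public static void main(String[] args) {
        byte high = (byte) 0xFF; // -1 when read as a signed byte
        byte zero = 0x00;

        // Signed comparison puts 0xFF before 0x00.
        System.out.println(Byte.compare(high, zero) < 0);         // true
        // Unsigned comparison puts 0xFF after 0x00.
        System.out.println(Byte.compareUnsigned(high, zero) > 0); // true

        // The array comparator can be passed straight to a TreeMap.
        TreeMap<byte[], String> map = new TreeMap<>(JDK_LEXICOGRAPHIC);
        map.put(new byte[] {(byte) 0xFF}, "high");
        map.put(new byte[] {0x00}, "low");
        System.out.println(map.firstEntry().getValue());          // low
    }
}
```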
