Is it faster to compare Strings or byte arrays?

Posted 2024-12-01 08:15:52

This might sound like an odd question, but which is faster: comparing two Strings, or comparing two byte[] values with Arrays.equals()? I'm working with Hadoop/HBase. I get a byte[] as the value from HBase, and I have a value that is passed in. Would it be faster to convert the value I get to a String and compare, or to compare the two as byte arrays?

Comments (3)

时光暖心i 2024-12-08 08:15:52

I haven't actually tested this, but Arrays.equals() looks like your friend. To make a String, the String constructor has to decode the byte array, which involves creating a decoder (for the platform default charset if you don't name one) and converting the bytes into a newly allocated character array. Only then can you call equals, which iterates through the characters of both strings.

In big-O terms both approaches are linear, but the conversion alone already has to read every byte in the array, so converting to String just to call equals does strictly more work than comparing the arrays directly.
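The two approaches can be put side by side in a minimal sketch. The class and method names here are hypothetical, and UTF-8 is an assumption about how the stored values were encoded; the question does not say.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CompareDemo {
    // Direct comparison: a single pass over the two arrays, no allocation.
    static boolean equalAsBytes(byte[] a, byte[] b) {
        return Arrays.equals(a, b);
    }

    // Comparison via String: each array is decoded into a new String first
    // (a full pass plus an allocation per array), and only then compared.
    // UTF-8 is an assumed encoding for this sketch.
    static boolean equalAsStrings(byte[] a, byte[] b) {
        String sa = new String(a, StandardCharsets.UTF_8);
        String sb = new String(b, StandardCharsets.UTF_8);
        return sa.equals(sb);
    }

    public static void main(String[] args) {
        byte[] stored = "row-value".getBytes(StandardCharsets.UTF_8);
        byte[] incoming = "row-value".getBytes(StandardCharsets.UTF_8);
        System.out.println(equalAsBytes(stored, incoming));
        System.out.println(equalAsStrings(stored, incoming));
    }
}
```

Both methods agree on well-formed text. The String route does the same comparison after extra decoding work, which is the overhead described above.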

Update:
Given the comments added to the question, it sounds like you are given a String and are comparing it against multiple results in the MapReduce job. In that case you can convert the input String to bytes once and then do multiple byte array comparisons. That should be faster than keeping the input as a String and converting every byte array returned in the job.
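A rough sketch of that "encode once, compare many times" idea follows. The `countMatches` helper and the in-memory list standing in for the job's results are hypothetical, and UTF-8 is again an assumed encoding.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class EncodeOnceDemo {
    // Encodes the input String a single time, then compares the resulting
    // bytes against every stored value without creating any Strings.
    static int countMatches(String input, List<byte[]> values) {
        byte[] needle = input.getBytes(StandardCharsets.UTF_8); // assumed encoding
        int matches = 0;
        for (byte[] value : values) {
            if (Arrays.equals(needle, value)) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Stand-in for values returned by the job.
        List<byte[]> values = Arrays.asList(
                "alpha".getBytes(StandardCharsets.UTF_8),
                "beta".getBytes(StandardCharsets.UTF_8),
                "alpha".getBytes(StandardCharsets.UTF_8));
        System.out.println(countMatches("alpha", values));
    }
}
```

The conversion cost is paid once for the input rather than once per returned value.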

听闻余生 2024-12-08 08:15:52

First, you have to consider whether both strings use the same encoding.
If they do and you only need an equality check, go ahead with the byte comparison. But if you want the compareTo behavior of String, you would have to work out for yourself which value is greater or lesser, in which case I would prefer converting to String first and then comparing.

If they do not use the same encoding, it is better to create Strings and then compare them, because the String class does the decoding for you.
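To illustrate the encoding point, here is a small sketch with hypothetical helper names. It encodes the same text in UTF-8 and ISO-8859-1 (charsets chosen only for the example): the raw bytes differ, while the decoded Strings are equal.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodingDemo {
    // Decodes each array with its own charset, then checks equality.
    static boolean equalAfterDecoding(byte[] a, Charset ca, byte[] b, Charset cb) {
        return new String(a, ca).equals(new String(b, cb));
    }

    // Decodes both arrays with one charset and applies String.compareTo ordering.
    static int compareDecoded(byte[] a, byte[] b, Charset cs) {
        return new String(a, cs).compareTo(new String(b, cs));
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // the last character is outside ASCII
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);        // 5 bytes
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1); // 4 bytes

        // Same text, different encodings: the byte arrays are not equal.
        System.out.println(Arrays.equals(utf8, latin1));
        // Decoding each with the right charset recovers equal Strings.
        System.out.println(equalAfterDecoding(
                utf8, StandardCharsets.UTF_8, latin1, StandardCharsets.ISO_8859_1));
    }
}
```

When both sides are known to share one encoding, the byte comparison gives the same equality answer without the decoding step.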

Spring初心 2024-12-08 08:15:52

First, you should ask yourself whether it really matters. Given that you are dealing with HBase, and thus network communication, the cost of whatever comparison you choose may be completely swamped, time-wise. Like @Clint and @Suraj, I think you're probably better off with fewer method calls (i.e. using Arrays.equals()). Just think of what has to happen when you do a String equals, and then add in the overhead of converting the byte arrays to Strings.
