Is implementing RawComparator really that much faster?

Posted on 2024-12-01 13:38:04

Is implementing RawComparator that much faster than extending WritableComparator? Looking at Text, LongWritable, etc. and their built-in comparators, it seems that they basically just read the fields directly from the full byte array, instead of using a DataInput and filling the values into the key class.

In my case, I have a custom key class with multiple fields of mixed types, including some Strings. Trying to do this with a RawComparator sort of scares me, since it looks, at least on the surface, possibly difficult to implement correctly.

Comments (1)

秋千易 2024-12-08 13:38:04

Yes, you're right that raw comparators are definitely a good choice when you're 100% sure that a byte-by-byte comparison reflects the equivalence (and ordering) of the underlying data.

You could use a library such as Apache Thrift or Avro to handle the binary serialization for you. In that case, you won't have to worry about your raw data being encoded inconsistently in binary.

Binary comparisons are generally faster than deserializing objects first, because they skip the deserialization work. But "that much" faster? Well, that depends on how you define "that much" :)
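For a mixed-type key like the one in the question, a raw comparison might look roughly like the sketch below. It is only an illustration under stated assumptions: the class name, the `serialize` helper, and the byte layout (an 8-byte big-endian long, then a 4-byte length, then UTF-8 bytes) are all hypothetical, and it uses plain Java rather than Hadoop classes so that it runs on its own. Only the parameter shape of `compare` mirrors `RawComparator.compare`.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative only: not Hadoop's API, just the same compare(...) parameter shape.
final class RawKeyComparatorSketch {

    // Hypothetical layout: 8-byte big-endian long, 4-byte length, UTF-8 bytes.
    static byte[] serialize(long id, String name) {
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(8 + 4 + utf8.length)
                .putLong(id)
                .putInt(utf8.length)
                .put(utf8)
                .array();
    }

    // s1/s2 are start offsets into the buffers; l1/l2 (total key lengths) are
    // unused here because the string length is stored inside the key itself.
    static int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        // Decode the long: comparing its raw bytes would mis-order negative values.
        long id1 = ByteBuffer.wrap(b1, s1, 8).getLong();
        long id2 = ByteBuffer.wrap(b2, s2, 8).getLong();
        if (id1 != id2) {
            return Long.compare(id1, id2);
        }

        // Read the length prefixes, then compare the string bytes as unsigned values.
        int len1 = ByteBuffer.wrap(b1, s1 + 8, 4).getInt();
        int len2 = ByteBuffer.wrap(b2, s2 + 8, 4).getInt();
        int p1 = s1 + 12;
        int p2 = s2 + 12;
        int common = Math.min(len1, len2);
        for (int i = 0; i < common; i++) {
            int x = b1[p1 + i] & 0xff;
            int y = b2[p2 + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // One string is a prefix of the other: the shorter one sorts first.
        return Integer.compare(len1, len2);
    }

    public static void main(String[] args) {
        byte[] a = serialize(-5L, "zzz");
        byte[] b = serialize(1L, "apple");
        byte[] c = serialize(1L, "banana");
        System.out.println(Integer.signum(compare(a, 0, a.length, b, 0, b.length))); // prints -1
        System.out.println(Integer.signum(compare(b, 0, b.length, c, 0, c.length))); // prints -1
    }
}
```

In a real job this logic would have to match exactly what the key's `write` method emits (for example, a variable-length prefix instead of the fixed 4-byte length assumed here), and `WritableComparator` has static helpers such as `readLong`, `readInt` and `compareBytes` that could replace the `ByteBuffer` calls. Also note that comparing UTF-8 bytes does not always give the same order as `String.compareTo` (they differ for supplementary characters), which is the kind of mismatch the "100% sure" caveat is about.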
