Unpacking signed little-endian integers in Ruby

Posted on 2024-10-20 21:39:03

So I'm working on some MongoDB protocol stuff. All integers are signed little-endian. Using Ruby's standard Array#pack method, I can convert from an integer to the binary string I want just fine:

positive_one = Array(1).pack('V')   #=> '\x01\x00\x00\x00'
negative_one = Array(-1).pack('V')  #=> '\xFF\xFF\xFF\xFF'

However, going the other way, the String#unpack method has the 'V' format documented as specifically returning unsigned integers:

positive_one.unpack('V').first #=> 1
negative_one.unpack('V').first #=> 4294967295

There's no formatter for signed little-endian byte order. I'm sure I could play games with bit-shifting, or write my own byte-mangling method that doesn't use array packing, but I'm wondering if anyone else has run into this and found a simple solution. Thanks very much.

Comments (4)

请你别敷衍 2024-10-27 21:39:03

After unpacking with "V", you can apply the following conversion:

class Integer
  def to_signed_32bit
    if self & 0x8000_0000 == 0x8000_0000
      self - 0x1_0000_0000  
    else
      self
    end
  end
end

You'll need to change the magic constants 0x1_0000_0000 (which is 2**32) and 0x8000_0000 (2**31) if you're dealing with other sizes of integers.
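The same idea can be written once for any width instead of editing the constants by hand. The following is a minimal sketch: the helper name `to_signed` and its `bits` parameter are made up for illustration and are not part of Ruby's core library.

```ruby
# Hypothetical helper (not part of Ruby): reinterpret an unsigned integer
# as a two's-complement signed integer of the given bit width.
def to_signed(value, bits = 32)
  sign_bit = 1 << (bits - 1)   # 2**31 for 32-bit values
  modulus  = 1 << bits         # 2**32 for 32-bit values
  (value & sign_bit).zero? ? value : value - modulus
end

positive_one = [1].pack('V')
negative_one = [-1].pack('V')

to_signed(positive_one.unpack('V').first)    #=> 1
to_signed(negative_one.unpack('V').first)    #=> -1
to_signed("\xFF\xFF".unpack('v').first, 16)  #=> -1
```

Using a plain method rather than reopening `Integer` avoids adding a method to every integer in the program, which may matter in library code.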

伴我老 2024-10-27 21:39:03

This question has a method for converting signed to unsigned that might be helpful. It also has a pointer to the bindata gem which looks like it will do what you want.

BinData::Int16le.read("\000\f") # 3072

[edited to remove the not-quite-right s unpack directive]

—━☆沉默づ 2024-10-27 21:39:03

Edit: I misunderstood the direction you were originally converting (per the comment). After thinking about it some more, I believe the solution is still the same. Here is the updated method. It does exactly the same thing, but the comments should explain the result:

def convertLEToNative( num )
    # Convert a given 4 byte integer from little-endian to the running
    # machine's native endianness.  The pack('V') operation takes the
    # given number and converts it to little-endian (which means that
    # if the machine is little endian, no conversion occurs).  On a
    # big-endian machine, the pack('V') will swap the bytes because
    # that's what it has to do to convert from big to little endian.
    # Since the number is already little endian, the swap has the
    # opposite effect (converting from little-endian to big-endian),
    # which is what we want. In both cases, the unpack('l') just
    # produces a signed integer from those bytes, in the machine's
    # native endianness. Note that the result is a one-element array.
    Array(num).pack('V').unpack('l')
end

Probably not the cleanest, but this will convert the byte array.

def convertLEBytesToNative( bytes )
    if ( [1].pack('V').unpack('l').first == 1 )
        # machine is already little endian
        bytes.unpack('l')
    else
        # machine is big endian
        convertLEToNative( Array(bytes.unpack('l')))
    end
end
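
On Ruby 1.9.3 and later, the manual endianness check may be unnecessary: the `pack`/`unpack` directives for native-sized integers accept a `<` (little-endian) or `>` (big-endian) modifier, so `l<` reads a signed 32-bit little-endian integer regardless of the machine. A short sketch, assuming such a Ruby version (the variable names are only for illustration):

```ruby
# 'l<' = signed 32-bit, forced little-endian (Ruby 1.9.3+).
negative_one = [-1].pack('l<')          # same bytes as [-1].pack('V')
negative_one.unpack('l<').first         #=> -1
[1].pack('V').unpack('l<').first        #=> 1

# The smallest 32-bit value: only the top bit of the last byte is set.
"\x00\x00\x00\x80".unpack('l<').first   #=> -2147483648

# The same modifier works for other widths, e.g. 'q<' for signed 64-bit.
[-2].pack('q<').unpack('q<').first      #=> -2
```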

粉红×色少女 2024-10-27 21:39:03

For the sake of posterity, here's the method I eventually came up with before spotting Paul Rubel's link to the "classical method". It's kludgy and based on string manipulation, so I'll probably scrap it, but it does work, so someone might find it interesting for some other reason someday:

# Returns an integer from the given little-endian binary string.
# @param [String] str
# @return [Integer]
def self.bson_to_int(str)
  bits = str.reverse.unpack('B*').first   # Get the 0s and 1s
  if bits[0] == '0'   # We're a positive number; life is easy
    bits.to_i(2)
  else                # Get the twos complement
    comp, flip = "", false
    bits.reverse.each_char do |bit|
      comp << (flip ? bit.tr('10','01') : bit)
      flip = true if !flip && bit == '1'
    end
    ("-" + comp.reverse).to_i(2)
  end
end

UPDATE: Here's the simpler refactoring, using a generalized arbitrary-length form of Ken Bloom's answer:

# Returns an integer from the given arbitrary length little-endian binary string.
# @param [String] str
# @return [Integer]
def self.bson_to_int(str)
  arr, bits, num = str.unpack('V*'), 0, 0
  arr.each do |int|
    num += int << bits
    bits += 32
  end
  num >= 2**(bits-1) ? num - 2**bits : num  # Convert from unsigned to signed
end
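
A quick way to try the refactored version is sketched below. The original does not say which module or class the method lives in, so `BSONHelper` is a hypothetical container, and the sketch assumes the input length is a multiple of four bytes, since the string is read in 32-bit groups.

```ruby
module BSONHelper  # hypothetical name; the original does not name its container
  # Same logic as the refactored method above.
  def self.bson_to_int(str)
    arr, bits, num = str.unpack('V*'), 0, 0
    arr.each do |int|
      num += int << bits   # each 32-bit group is more significant than the last
      bits += 32
    end
    num >= 2**(bits - 1) ? num - 2**bits : num  # unsigned to signed
  end
end

BSONHelper.bson_to_int([1].pack('V'))    #=> 1
BSONHelper.bson_to_int([-1].pack('V'))   #=> -1

# Eight bytes: low word first, then high word.
BSONHelper.bson_to_int([0, 1].pack('V2'))                       #=> 4294967296
BSONHelper.bson_to_int([0xFFFF_FFFE, 0xFFFF_FFFF].pack('V2'))   #=> -2
```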