针对 Ruby 优化的位交错

发布于 2024-10-19 06:40:31 字数 1184 浏览 1 评论 0原文

诚然,在 Ruby 中优化位操作一开始就有点不匹配。除此之外,我正在寻找一个片段或一个 gem,可以交错两个任意整数坐标,最好可以用于 MRI (1.9) 或本地 gem。

C 中的一些方法是: http://graphics.stanford.edu/~seander/bithacks .html#InterleaveTableObvious

作为示例或起点,这里是 Ruby 中的“Interleave 位明显方式”,有点丑陋,以防止它创建临时数组(这会增加每个数组的运行时间约 2 倍)并使用二进制文件内联长度方法可进一步减少 6%(如果您知道两个输入都不为零,则可以省略该检查,再增加几个百分点。)

def interleave(y)
  z = 0
  bl = self > 0 ? Math.log2(self) : 1
  ybl = y > 0 ? Math.log2(y) : 1
  ((((bl <=> ybl) == -1) ? ybl : bl).floor + 1).times{|i| z |= (self & 1 << i) << i | (y & 1 << i) << (i + 1)}
  return z
end

来自 2.66Ghz i5 和 1.9.2p180 的结果:

x = y = 0b11111111_11111111_11111111_11111111
Benchmark.bm{|bm| bm.report{1000000.times{x.interleave(y)}}}

     user     system      total        real
18.360000   0.010000  18.370000 ( 18.356196)

肯定有更好的方法吗?


更新

我包括了@Wayne Conrad 的零修复,尽管比他的丑陋得多,而且速度也只是稍微快一点。还移动了楼层和+1,以便每次执行一次而不是两次。

这是匹配去交错的要点

Granted, optimizing bit twiddling in Ruby is a bit of a mismatch to begin with. That aside, I'm looking for a snippet or a gem that can interleave two arbitrary integer coords optimized as best can be for MRI (1.9) or a native gem.

Some approaches in C are: http://graphics.stanford.edu/~seander/bithacks.html#InterleaveTableObvious

As an example or starting point, here's "Interleave bits the obvious way" in Ruby, somewhat uglified to keep it from creating temp arrays (which increase the runtime by about 2X per array) and with a binary length method inlined for a further 6% decrease (If you know neither input is ever zero, you can omit that check for a few percent more..)

def interleave(y)
  z = 0
  bl = self > 0 ? Math.log2(self) : 1
  ybl = y > 0 ? Math.log2(y) : 1
  ((((bl <=> ybl) == -1) ? ybl : bl).floor + 1).times{|i| z |= (self & 1 << i) << i | (y & 1 << i) << (i + 1)}
  return z
end

Results from a 2.66Ghz i5 with 1.9.2p180:

x = y = 0b11111111_11111111_11111111_11111111
Benchmark.bm{|bm| bm.report{1000000.times{x.interleave(y)}}}

     user     system      total        real
18.360000   0.010000  18.370000 ( 18.356196)

Surely there's a better way?


Update

I included the zero fix from @Wayne Conrad, albeit far uglier than his and only marginally faster. Also moved the floor and + 1 so as to be executed once instead of twice per.

Here is a Gist of this with matching de-interleave.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

静待花开 2024-10-26 06:40:31

这是一个快速&俗气的实现让你继续前进,直到出现一个好的实现:

def mortanize(x, y)
  xs, ys = [x, y].map do |n|
    n.to_s(2)
  end
  nbits = [xs, ys].map(&:size).max
  xs, ys = [xs, ys].map do |n|
    ('0' * (nbits - n.size) + n).chars
  end
  ys.zip(xs).join.to_i(2)
end

正如你所料,它不是速度守护进程。在我的机器上,使用 MRI 1.8.7,它每秒计算大约 35,000 个 16 位结果。您的每秒计算 68,000 个 16 位结果。或者,请参阅每秒 256,000 个 16 位结果的下一个算法。


如果您愿意牺牲一点内存和启动时间来换取速度,那么:

def base_mortanize(x, y)
  xs, ys = [x, y].map do |n|
    n.to_s(2)
  end
  nbits = [xs, ys].map(&:size).max
  xs, ys = [xs, ys].map do |n|
    ('0' * (nbits - n.size) + n).chars
  end
  ys.zip(xs).join.to_i(2)
end

MORTON_TABLE_X = 256.times.map do |x|
  base_mortanize(x, 0)
end

MORTON_TABLE_Y = 256.times.map do |y|
  base_mortanize(0, y)
end

def mortanize(x, y)
  z = []
  while (x > 0 || y > 0)
    z << (MORTON_TABLE_X[x & 0xff] | MORTON_TABLE_Y[y & 0xff])
    x >>= 8
    y >>= 8
  end
  z.reverse.inject(0) do |result, word|
    result << 16 | word
  end
end

该程序每秒计算 256,000 个 16 位结果。


如果任一参数为零,则您的答案存在错误。这是一种可能的解决方法。首先定义这个函数:

def bit_size(x)
  return 1 if x == 0
  Math.log2(x).floor + 1
end

然后,在 interleave 中,将: 替换

z, bl, ybl = 0, (Math.log2(self)).floor + 1, (Math.log2(y)).floor + 1

为:

z = 0
bl = bit_size(x)
ybl = bit_size(y)

这是我使用的 rspec 测试用例:

describe "mortanize" do
  it "should interleave integers" do
    mortanize(0, 0).should eql 0
    mortanize(0, 1).should eql 2
    mortanize(1, 0).should eql 1
    mortanize(0xf, 0x3).should eql 0x5f
    mortanize(0x3, 0xf).should eql 0xaf
    mortanize(0xf, 0x0).should eql 0x55
    mortanize(0x0, 0xf).should eql 0xaa
    mortanize(0x3, 0xc).should eql 0xa5
    mortanize(0xf, 0xf).should eql 0xff
    mortanize(0x1234, 0x4321).should eql 0x210e0d12
  end
end

Here's a quick & cheesy implementation to get you going until a good one comes along:

def mortanize(x, y)
  xs, ys = [x, y].map do |n|
    n.to_s(2)
  end
  nbits = [xs, ys].map(&:size).max
  xs, ys = [xs, ys].map do |n|
    ('0' * (nbits - n.size) + n).chars
  end
  ys.zip(xs).join.to_i(2)
end

As you might expect, it's no speed deamon. On my box, with MRI 1.8.7, it computes about 35,000 16-bit results per second. Yours computes 68,000 16-bit results per second. Or, see the next algorithm for 256,000 16-bit results per second.


If you're willing to trade a little memory and startup time for speed, then:

def base_mortanize(x, y)
  xs, ys = [x, y].map do |n|
    n.to_s(2)
  end
  nbits = [xs, ys].map(&:size).max
  xs, ys = [xs, ys].map do |n|
    ('0' * (nbits - n.size) + n).chars
  end
  ys.zip(xs).join.to_i(2)
end

MORTON_TABLE_X = 256.times.map do |x|
  base_mortanize(x, 0)
end

MORTON_TABLE_Y = 256.times.map do |y|
  base_mortanize(0, y)
end

def mortanize(x, y)
  z = []
  while (x > 0 || y > 0)
    z << (MORTON_TABLE_X[x & 0xff] | MORTON_TABLE_Y[y & 0xff])
    x >>= 8
    y >>= 8
  end
  z.reverse.inject(0) do |result, word|
    result << 16 | word
  end
end

This one computes 256,000 16-bit results per second.


There's a bug in your answer if either argument is zero. Here's one possible fix for it. First define this function:

def bit_size(x)
  return 1 if x == 0
  Math.log2(x).floor + 1
end

And then, inside interleave, replace:

z, bl, ybl = 0, (Math.log2(self)).floor + 1, (Math.log2(y)).floor + 1

with:

z = 0
bl = bit_size(x)
ybl = bit_size(y)

Here is the rspec test case I used:

describe "mortanize" do
  it "should interleave integers" do
    mortanize(0, 0).should eql 0
    mortanize(0, 1).should eql 2
    mortanize(1, 0).should eql 1
    mortanize(0xf, 0x3).should eql 0x5f
    mortanize(0x3, 0xf).should eql 0xaf
    mortanize(0xf, 0x0).should eql 0x55
    mortanize(0x0, 0xf).should eql 0xaa
    mortanize(0x3, 0xc).should eql 0xa5
    mortanize(0xf, 0xf).should eql 0xff
    mortanize(0x1234, 0x4321).should eql 0x210e0d12
  end
end
↘紸啶 2024-10-26 06:40:31

这是另一种解决方案,对于 16 位整数(第一个解决方案仅适用于 8 位),基准测试比公认的解决方案快 50%:

Magic = [0x55555555, 0x33333333, 0x0F0F0F0F, 0x00FF00FF]

# Interleave lower 16 bits of x and y, so the bits of x
# are in the even positions and bits from y in the odd;
# z gets the resulting 32-bit Morton Number.  
# x and y must initially be less than 65536.
# Rubyfied from http://graphics.stanford.edu/~seander/bithacks.html
def _interleave_bits_16b(x,y)
  x = (x | (x << 8)) & Magic[3]
  x = (x | (x << 4)) & Magic[2]
  x = (x | (x << 2)) & Magic[1]
  x = (x | (x << 1)) & Magic[0]
  y = (y | (y << 8)) & Magic[3]
  y = (y | (y << 4)) & Magic[2]
  y = (y | (y << 2)) & Magic[1]
  y = (y | (y << 1)) & Magic[0]
  z = x | (y << 1)
end

Here's another solution, benchmarked about 50% faster than the accepted one, and for 16-bit integers (where the first one only does 8-bit):

Magic = [0x55555555, 0x33333333, 0x0F0F0F0F, 0x00FF00FF]

# Interleave lower 16 bits of x and y, so the bits of x
# are in the even positions and bits from y in the odd;
# z gets the resulting 32-bit Morton Number.  
# x and y must initially be less than 65536.
# Rubyfied from http://graphics.stanford.edu/~seander/bithacks.html
def _interleave_bits_16b(x,y)
  x = (x | (x << 8)) & Magic[3]
  x = (x | (x << 4)) & Magic[2]
  x = (x | (x << 2)) & Magic[1]
  x = (x | (x << 1)) & Magic[0]
  y = (y | (y << 8)) & Magic[3]
  y = (y | (y << 4)) & Magic[2]
  y = (y | (y << 2)) & Magic[1]
  y = (y | (y << 1)) & Magic[0]
  z = x | (y << 1)
end
静若繁花 2024-10-26 06:40:31

如果你已经有 C 实现,可以使用 FFI,否则可以直接借助帮助编写RubyInline

If you have an implementation already in C, you can use FFI, otherwise you can write it directly with the help of RubyInline

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文