Ruby on Rails performance on ARM

Posted on 2024-09-29 03:44:47

I was wondering whether we could replace our Atom N270-based nettops, which run a Rails (ruby 1.8.6...) web app, with an equivalent ARM-based device (we like the fanless setup, the low power consumption, etc.).

The ARM device was an XScale-PXA270 @ 520 MHz with 128MB of RAM (and probably some slower SDRAM), running Linux. There was always enough free memory, and its performance was comparable to a jailbroken iPhone.

Benchmarking the production database (SQLite) gave us promising results (ARM was just 20-30% slower), so I tried building ruby (1.9.2p0).

The Rails app ran very slowly on ARM (fetching from SQL and generating templates was 10-20x slower). I decided to run some benchmarks to find the bottlenecks.

Again, some results were OK (on par with the older ruby 1.8.6 we are using now, 6x slower than ruby 1.9.2), and some were very slow (20-30x slower). For example, it looks like hash methods are 40x slower on ARM. Running the Ruby Benchmark Suite showed more bottlenecks: strings, threads, arrays...
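To narrow down which core operations are slow, one option is a labeled micro-benchmark that exercises each data structure in isolation. The following is only a sketch: the helper names (`hash_insert`, `array_push`, `string_append`) and the size `N` are hypothetical choices for illustration, not part of the Ruby Benchmark Suite.

```ruby
require 'benchmark'

N = 10**4 # hypothetical size, kept small so it finishes quickly on slow hardware

# Insert n integer keys into a fresh hash.
def hash_insert(n)
  h = {}
  n.times { |i| h[i] = i }
  h
end

# Append n integers to a fresh array.
def array_push(n)
  a = []
  n.times { |i| a << i }
  a
end

# Append n single-character strings to a fresh, mutable string.
def string_append(n)
  s = String.new
  n.times { s << "x" }
  s
end

# The argument to bm is the label column width.
Benchmark.bm(14) do |x|
  x.report("hash insert")   { hash_insert(N) }
  x.report("array push")    { array_push(N) }
  x.report("string append") { string_append(N) }
end
```

Running the same file on both machines and comparing row by row should show whether the slowdown is specific to hashes or spread across all allocation-heavy operations.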

I knew ARM is slower than Atom; I just was not expecting such a huge difference, especially after SQLite ran fine.

Is there some flaw in Ruby on ARM? Do I need to apply some patches? Or is this hopeless, so that if I want to use the ARM device I should rewrite the whole app in C, or does the device simply not have enough computing power?

Examples

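require 'benchmark'

# Naive recursive Fibonacci: stresses method calls and integer arithmetic.
def fib(n)
  return 1 if n < 2
  fib(n-1)+fib(n-2)
end

Benchmark.bm do |x|
  x.report { fib(32) }
  x.report { fib(36) }
  # Insert integer keys into a fresh hash at three sizes.
  x.report { h = {}; (0..10**3).each {|i| h[i] = i}  }
  x.report { h = {}; (0..10**4).each {|i| h[i] = i}  }
  x.report { h = {}; (0..10**5).each {|i| h[i] = i}  }
end

Run with:

ruby -rbenchmark bench.rb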

Atom N270, 1GB

ruby 1.9.2p0 (2010-08-18) [i686-linux]
      user     system      total        real
  2.440000   0.000000   2.440000 (  2.459400)
 16.780000   0.030000  16.810000 ( 17.293015)
  0.000000   0.000000   0.000000 (  0.001180)
  0.020000   0.000000   0.020000 (  0.012180)
  0.160000   0.000000   0.160000 (  0.161803)

ruby 1.8.6 (2008-08-11 patchlevel 287) [i686-linux]
      user     system      total        real
 12.500000   0.020000  12.520000 ( 12.628106)
 84.450000   0.170000  84.620000 ( 85.879380)
  0.010000   0.000000   0.010000 (  0.002216)
  0.040000   0.000000   0.040000 (  0.032939)
  0.240000   0.010000   0.250000 (  0.255756)

XScale-PXA270 @ 520, 128MB
ruby 1.9.2p0 (2010-08-18) [arm-linux]

      user     system      total        real
 12.470000   0.000000  12.470000 ( 12.526507)
 85.480000   0.000000  85.480000 ( 85.939294)
  0.060000   0.000000   0.060000 (  0.060643)
  0.640000   0.000000   0.640000 (  0.642136)
  6.460000   0.130000   6.590000 (  6.605553)
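
The slowdown factors can be worked out directly from the numbers above. This sketch uses the wall-clock ("real") column, because some of the Atom "total" values are reported as zero; the constant and method names are made up for illustration.

```ruby
# Wall-clock ("real") seconds copied from the benchmark output above.
LABELS   = ["fib(32)", "fib(36)", "hash 10**3", "hash 10**4", "hash 10**5"]
ATOM_192 = [2.459400, 17.293015, 0.001180, 0.012180, 0.161803]
ATOM_186 = [12.628106, 85.879380, 0.002216, 0.032939, 0.255756]
ARM_192  = [12.526507, 85.939294, 0.060643, 0.642136, 6.605553]

# Element-wise slowdown factor of `slow` relative to `fast`.
def slowdown(slow, fast)
  slow.zip(fast).map { |s, f| s / f }
end

vs_192 = slowdown(ARM_192, ATOM_192)
vs_186 = slowdown(ARM_192, ATOM_186)

LABELS.each_with_index do |label, i|
  printf("%-12s vs Atom 1.9.2: %5.1fx   vs Atom 1.8.6: %5.1fx\n",
         label, vs_192[i], vs_186[i])
end
```

On these figures the two fib rows come out at about 5x slower than ruby 1.9.2 on the Atom and essentially level with ruby 1.8.6 on the Atom, while the hash rows are roughly 40x to just over 50x slower than ruby 1.9.2 on the Atom.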

Build with:


 ./configure --host=arm-linux --without-X11 --disable-largefile \
--enable-socket=yes --without-Win32API --disable-ipv6 \
--disable-install-doc --prefix=/opt --with-openssl-include=/opt/include/ \
--with-openssl-lib=/opt/include/lib

ENV:

PFX=arm-iwmmxt-linux-gnueabi

export DISCIMAGE="/opt"
export CROSS_COMPILE="arm-linux-"
export HOST="arm-linux"
export TARGET="arm-linux"
export CROSS_COMPILING=1
export CC=$PFX-gcc
export CFLAGS="-O3 -I/opt/include"
export LDFLAGS="-O3 -L/opt/lib/"
#LIBS=
#CPPFLAGS=
export CXX=$PFX-g++
#CXXFLAGS=
export CPP=$PFX-cpp

export OBJCOPY="$PFX-objcopy"
export LD="$PFX-ld"
export AR="$PFX-ar" 
export RANLIB="$PFX-ranlib"
export NM="$PFX-nm"
export STRIP="$PFX-strip"
export ac_cv_func_setpgrp_void=yes
export ac_cv_func_isinf=no
export ac_cv_func_isnan=no
export ac_cv_func_finite=no

Comments (3)

当梦初醒 2024-10-06 03:44:47

It seems you're complaining that optimizations new in Ruby 1.9.2 (when compared to 1.8.x) are x86 specific. The Atom and ARM performance is comparable for Ruby 1.8.x. Perhaps you could ask a ruby-specific mailing list. A quick search shows that yes, there were many changes in Ruby 1.9.x:

Ruby 1.9.2 brings [...] major speed improvements to Ruby by way of the Yet Another Ruby VM (YARV) interpreter

Perhaps the right question is "Does YARV have x86 specific optimizations? Could these optimizations be duplicated in the ARM port?"
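
Before looking for architecture-specific optimizations in YARV, it may be worth confirming how each interpreter was actually built. The sketch below prints a few entries from `RbConfig::CONFIG`; the choice of keys and the `build_summary` helper are assumptions made for illustration. It does not show whether YARV contains x86-specific code, only whether the two builds differ in compiler or flags.

```ruby
require 'rbconfig'

# Keys that describe how this interpreter was built.
BUILD_KEYS = %w[host arch CC CFLAGS configure_args]

# Returns [key, value] pairs for the selected build settings.
def build_summary
  BUILD_KEYS.map { |key| [key, RbConfig::CONFIG[key]] }
end

puts RUBY_DESCRIPTION
build_summary.each { |key, value| puts "#{key}: #{value}" }
```

Running this on both the Atom and the ARM device and comparing the output side by side would show, for example, whether the optimization flags passed at configure time ended up in the ARM build.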

请叫√我孤独 2024-10-06 03:44:47

The same benchmark on a Raspberry Pi with somewhat newer packages:

pi@raspberrypi:~$ uname -a
Linux raspberrypi 3.6.11+
pi@raspberrypi:~$ ruby -v
ruby 2.0.0p195 (2013-05-14 revision 40734) [armv6l-linux-eabihf]
pi@raspberrypi:~$ ruby benchmark.rb 
       user     system      total        real
   6.580000   0.000000   6.580000 (  6.585575)
  45.080000   0.000000  45.080000 ( 45.132900)
   0.000000   0.000000   0.000000 (  0.008709)
   0.090000   0.000000   0.090000 (  0.095851)
   1.040000   0.010000   1.050000 (  1.044347)

Update for an RP2 (in 2015):

pi@raschpi ~ $ uname -a
Linux raschpi 4.1.13-v7+ #826 SMP PREEMPT Fri Nov 13 20:19:03 GMT 2015 armv7l GNU/Linux
pi@raschpi ~ $ ruby -v
ruby 2.2.1p85 (2015-02-26 revision 49769) [armv7l-linux-eabihf]
pi@raschpi ~ $ ruby benchmark.rb 
       user     system      total        real
   4.450000   0.000000   4.450000 (  4.446841)
  30.460000   0.000000  30.460000 ( 30.473665)
   0.010000   0.000000   0.010000 (  0.002306)
   0.020000   0.000000   0.020000 (  0.023236)
   0.290000   0.000000   0.290000 (  0.292746)

Update for an RP3-B (in 2017, Raspbian Jessie):

pi@raspberrypi:~ $ uname -a
Linux raspberrypi 4.9.59-v7+ #1047 SMP Sun Oct 29 12:19:23 GMT 2017 armv7l GNU/Linux
pi@raspberrypi:~ $ ruby -v
ruby 2.3.3p222 (2016-11-21) [arm-linux-gnueabihf]
pi@raspberrypi:~ $ ruby -rbenchmark benchmark.rb
       user     system      total        real
   4.030000   0.000000   4.030000 (  4.032046)
  30.940000   0.000000  30.940000 ( 30.943480)
   0.000000   0.000000   0.000000 (  0.001352)
   0.000000   0.010000   0.010000 (  0.013266)
   0.260000   0.000000   0.260000 (  0.251937)

泪之魂 2024-10-06 03:44:47

Using the code quoted in the question's example, these are my results on a Raspberry Pi running Raspbian with an armv6l processor:

uname -a
Linux ginger 3.1.9+ #168 PREEMPT Sat Jul 14 18:56:31 BST 2012 armv6l GNU/Linux

ruby -v
ruby 1.9.3p194 (2012-04-20 revision 35410) [armv6l-linux-eabi]

ruby benchmark.rb
   user     system      total        real
 7.810000   0.000000   7.810000 (  7.823737)
53.520000   0.010000  53.530000 ( 53.630399)
 0.010000   0.000000   0.010000 (  0.007818)
 0.090000   0.000000   0.090000 (  0.090667)
 0.950000   0.030000   0.980000 (  0.980731)