Ruby 中用于“String#include?”的算法

发布于 2024-12-10 01:48:36 字数 92 浏览 1 评论 0 原文

有谁能够确定包含使用哪种算法? Ruby 中的方法?例如

"helloworld".include?("hello")

Is anyone able to pinpoint which algorithm is used for the include? method in Ruby? For example

"helloworld".include?("hello")

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

陈年往事 2024-12-17 01:48:36

正如 emboss 在他的回答中所述,String#include 调用 rb_str_index。该函数依次调用rb_memsearch,它实现Rabin-Karp 字符串搜索算法< /a>,根据 这篇文章 ruby-forum.com

As emboss states in his answer, String#include calls rb_str_index. This function in turn calls rb_memsearch, which implements the Rabin-Karp string search algorithm, according to this post on ruby-forum.com.

心欲静而疯不止 2024-12-17 01:48:36

Ruby 语言规范没有规定任何特定的算法。每个实现都可以使用他们想要的任何算法。

例如,在 Rubinius 中,String#include? 调用 String#find_string:

def include?(needle)
  if needle.kind_of? Fixnum
    needle = needle % 256
    str_needle = needle.chr
  else
    str_needle = StringValue(needle)
  end

  !!find_string(str_needle, 0)
end

String#find_string 依次实现通过 string_index 原语:

def find_string(pattern, start)
  Rubinius.primitive :string_index
  raise PrimitiveFailure, "String#find_string failed"
end

string_index 原语由 <代码>rubinius::String::index 函数:

// Rubinius.primitive :string_index
Fixnum* index(STATE, String* pattern, Fixnum* start);

rubinius::String::index :

Fixnum* String::index(STATE, String* pattern, Fixnum* start) {
  native_int total = size();
  native_int match_size = pattern->size();

  if(start->to_native() < 0) {
    Exception::argument_error(state, "negative start given");
  }

  switch(match_size) {
  case 0:
    return start;
  case 1:
    {
      uint8_t* buf = byte_address();
      uint8_t matcher = pattern->byte_address()[0];

      for(native_int pos = start->to_native(); pos < total; pos++) {
        if(buf[pos] == matcher) return Fixnum::from(pos);
      }
    }
    return nil<Fixnum>();
  default:
    {
      uint8_t* buf = byte_address();
      uint8_t* matcher = pattern->byte_address();

      uint8_t* last = buf + (total - match_size);
      uint8_t* pos = buf + start->to_native();

      while(pos <= last) {
        // Checking *pos directly then also checking memcmp is an
        // optimization. It's about 10x faster than just calling memcmp
        // everytime.
        if(*pos == *matcher &&
            memcmp(pos, matcher, match_size) == 0) {
          return Fixnum::from(pos - buf);
        }
        pos++;
      }
    }
    return nil<Fixnum>();
  }
}

The Ruby Language Specification doesn't prescribe any particular algorithm. Every implementation can use whatever algorithm they want.

For example, in Rubinius, String#include? calls String#find_string:

def include?(needle)
  if needle.kind_of? Fixnum
    needle = needle % 256
    str_needle = needle.chr
  else
    str_needle = StringValue(needle)
  end

  !!find_string(str_needle, 0)
end

String#find_string in turn is implemented via the string_index primitive:

def find_string(pattern, start)
  Rubinius.primitive :string_index
  raise PrimitiveFailure, "String#find_string failed"
end

The string_index primitive is implemented by the rubinius::String::index function:

// Rubinius.primitive :string_index
Fixnum* index(STATE, String* pattern, Fixnum* start);

rubinius::String::index:

Fixnum* String::index(STATE, String* pattern, Fixnum* start) {
  native_int total = size();
  native_int match_size = pattern->size();

  if(start->to_native() < 0) {
    Exception::argument_error(state, "negative start given");
  }

  switch(match_size) {
  case 0:
    return start;
  case 1:
    {
      uint8_t* buf = byte_address();
      uint8_t matcher = pattern->byte_address()[0];

      for(native_int pos = start->to_native(); pos < total; pos++) {
        if(buf[pos] == matcher) return Fixnum::from(pos);
      }
    }
    return nil<Fixnum>();
  default:
    {
      uint8_t* buf = byte_address();
      uint8_t* matcher = pattern->byte_address();

      uint8_t* last = buf + (total - match_size);
      uint8_t* pos = buf + start->to_native();

      while(pos <= last) {
        // Checking *pos directly then also checking memcmp is an
        // optimization. It's about 10x faster than just calling memcmp
        // everytime.
        if(*pos == *matcher &&
            memcmp(pos, matcher, match_size) == 0) {
          return Fixnum::from(pos - buf);
        }
        pos++;
      }
    }
    return nil<Fixnum>();
  }
}
百变从容 2024-12-17 01:48:36

这是 String#include? 的实际实现:

static VALUE
rb_str_include(VALUE str, VALUE arg)
{
    long i;

    StringValue(arg);
    i = rb_str_index(str, arg, 0);

    if (i == -1) return Qfalse;
    return Qtrue;
}

因此实际使用的算法可以在 rb_str_index

This is the actual implementation of String#include?:

static VALUE
rb_str_include(VALUE str, VALUE arg)
{
    long i;

    StringValue(arg);
    i = rb_str_index(str, arg, 0);

    if (i == -1) return Qfalse;
    return Qtrue;
}

So the actual algorithm used can be found in rb_str_index.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文