Challenge: Regex-only tokenizer for shell-assignment-like config lines

Posted 2024-12-23 16:27:41

I asked the original question here, and got a practical response that mixed Ruby and regular expressions. Now the purist in me wants to know: can this be done with regular expressions alone? My gut says it can. There's an ABNF floating around for bash 2.0, though it doesn't include string escapes.

The Spec

Given an input line that is either (1) a variable ("key") assignment from a bash-flavored script or (2) a key-value setting from a typical configuration file like postgresql.conf, this regex (or pair of regexen) should capture the key and value in such a way that I can use those captures to substitute a new value for that key.

You may use a different regular expression for shell-flavored and config-flavored lines; the caller will know which to use.

There will be a 50-point bounty here. I can't add a bounty for two days, so I won't accept an answer till then, but you can start answering immediately. You earn points for:

  • Readability (named capture groups, definitions via (?(DEFINE)...) or {0})
  • Using a single regex instead of two
  • Teaching me something about DFAs
  • Regex performance, if relevant
  • Getting upvoted
  • First to use a technique

Example:

Given the input

export RAILS_ENV=production

I should be able to write in Ruby:

match = THE_REGEX.match("export RAILS_ENV=production")
newline = "export #{match[:key]}=#{match[:value]}"
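
To make the intended use of the captures concrete, here is a minimal Ruby sketch of the substitution step. SIMPLE_ASSIGNMENT and replace_value are hypothetical names, and the pattern is a deliberately naive stand-in for THE_REGEX: it assumes an unquoted value with no whitespace and no trailing comment, so it is not an answer to the challenge.

```ruby
# Naive stand-in for THE_REGEX (an assumption for illustration only):
# handles unquoted values without spaces, and no trailing comments.
SIMPLE_ASSIGNMENT = /\A(?<prefix>export\s+)?(?<key>\w+)=(?<value>\S*)\z/

# Rebuild the line around a new value, keeping the key and any "export ".
# Lines that do not look like an assignment are returned untouched.
def replace_value(line, new_value)
  match = SIMPLE_ASSIGNMENT.match(line)
  return line unless match

  "#{match[:prefix]}#{match[:key]}=#{new_value}"
end

puts replace_value("export RAILS_ENV=production", "staging")
# prints "export RAILS_ENV=staging"
```

A real solution has to extend the value group to cover quotes, backticks, escapes and comments, which is what the challenge is about.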

Test cases: shell style

RAILS_ENV=development     # Don't forget to change this for TechCrunch
HOSTNAME=`cat /etc/hostname`
plist=`cat "/Applications/Sublim\`e Text 2.app/Content's/Info.plist"`

# Optional bonus input: "#" present in the string
FORMAT="  ##0.00 passe\`" #comment

Test cases: config style

listen_addresses = 127.0.0.1 #localhost only by default
# listen_addresses = 0.0.0.0 commented out, should not match

For the purpose of this challenge, "regular expression" and "regex" mean the same thing and both can refer to any common flavor you like, though I prefer Ruby 1.9-compatible.


Comments (1)

何必那么矫情 2024-12-30 16:27:41

I'm not sure about the full specs and what exactly you want in the value capturing group, but this should work for your test cases:

/
^\s*+

(?:export\s++)?
(?<key>\w++)

\s*+
=
\s*+

(?<value>
  (?>  "(?:[^"\\]+|\\.)*+"
  |    '(?:[^'\\]+|\\.)*+'
  |    `(?:[^`\\]+|\\.)*+`
  |    [^#\n\r]++
  )
)

\s*+
(?:\#.*+)?
$
/mx;

Handles comments and quotes with escapes.

Perl/PCRE flavor and quoting.
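
Since the question prefers a Ruby 1.9-compatible flavor, here is a sketch of how the same pattern might be written as a Ruby literal. The assumptions are mine, not part of the answer: Ruby's engine accepts the possessive quantifiers and atomic group with the same syntax, Perl's /m is dropped because ^ and $ already match at line boundaries in Ruby (where /m means "dot matches newline" instead), and every # is escaped because it starts a comment under /x. The constant name ASSIGNMENT is hypothetical.

```ruby
# Ruby rendering of the pattern above (a sketch, not the answer's own code).
ASSIGNMENT = /
  ^\s*+
  (?:export\s++)?                # optional shell "export" prefix
  (?<key>\w++)
  \s*+ = \s*+
  (?<value>
    (?>  "(?:[^"\\]+|\\.)*+"     # double-quoted, backslash escapes allowed
    |    '(?:[^'\\]+|\\.)*+'     # single-quoted
    |    `(?:[^`\\]+|\\.)*+`     # backticks
    |    [^\#\n\r]++             # bare value, up to a comment or end of line
    )
  )
  \s*+
  (?:\#.*+)?                     # optional trailing comment
  $
/x

# Raw heredoc, so backslashes and '#' reach the regex unchanged.
lines = <<~'LINES'.lines.map(&:chomp)
  RAILS_ENV=development     # Don't forget to change this for TechCrunch
  HOSTNAME=`cat /etc/hostname`
  FORMAT="  ##0.00 passe\`" #comment
  # listen_addresses = 0.0.0.0 commented out, should not match
LINES

lines.each do |line|
  m = ASSIGNMENT.match(line)
  puts m ? "#{m[:key]} => #{m[:value].inspect}" : "no match: #{line}"
end
```

Matching one line at a time matters here: \s also matches line breaks, so running the pattern over a whole multi-line string could let the leading ^\s*+ run across blank lines.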


Example usage in Perl:

my $re = qr/
    ^\s*+

    (?:export\s++)?
    (?<key>\w++)

    \s*+
    =
    \s*+

    (?<value>
      (?>  "(?:[^"\\]+|\\.)*+"
      |    '(?:[^'\\]+|\\.)*+'
      |    `(?:[^`\\]+|\\.)*+`
      |    [^#\n\r]++
      )
    )

    \s*+
    (?:\#.*+)?
    $
/mx;

my $str = <<'_TESTS_';
RAILS_ENV=development     # Don't forget to change this for TechCrunch
HOSTNAME=`cat /etc/hostname`
plist=`cat "/Applications/Sublim\`e Text 2.app/Content's/Info.plist"`

# Optional bonus input: "#" present in the string
FORMAT="  ##0.00 passe\`" #comment

listen_addresses = 127.0.0.1 #localhost only by default
# listen_addresses = 0.0.0.0 commented out, should not match

TEST="foo'bar\"baz#"
TEST='foo\'bar"baz#\\'
_TESTS_


# Match each non-empty line on its own and print the named captures.
for (split /[\r\n]+/, $str) {
    print "line: $_\n";
    print /$re/ ? "match: $+{key}, $+{value}\n" : "no match\n";
    print "\n";
}

Output:

line: RAILS_ENV=development     # Don't forget to change this for TechCrunch
match: RAILS_ENV, development

line: HOSTNAME=`cat /etc/hostname`
match: HOSTNAME, `cat /etc/hostname`

line: plist=`cat "/Applications/Sublim\`e Text 2.app/Content's/Info.plist"`
match: plist, `cat "/Applications/Sublim\`e Text 2.app/Content's/Info.plist"`

line: # Optional bonus input: "#" present in the string
no match

line: FORMAT="  ##0.00 passe\`" #comment
match: FORMAT, "  ##0.00 passe\`"

line: listen_addresses = 127.0.0.1 #localhost only by default
match: listen_addresses, 127.0.0.1

line: # listen_addresses = 0.0.0.0 commented out, should not match
no match

line: TEST="foo'bar\"baz#"
match: TEST, "foo'bar\"baz#"

line: TEST='foo\'bar"baz#\\'
match: TEST, 'foo\'bar"baz#\\'
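
One detail the printed output does not make visible: the unquoted branch [^#\n\r]++ stops only at # or a line break, so any blanks in front of a trailing comment end up inside the value capture. The Ruby sketch below isolates just that branch to show the effect. GREEDY_VALUE and TRIMMED_VALUE are hypothetical names, and the tighter variant is one possible tweak of my own, not part of the answer.

```ruby
# Unquoted branch as in the answer: consumes everything up to '#' or EOL.
GREEDY_VALUE = /=\s*+(?<value>[^\#\n\r]++)\s*+(?:\#.*+)?$/

# Possible tighter variant (an assumption, not from the answer): runs of
# non-space, non-'#' characters joined by inner whitespace, so trailing
# blanks stay outside the capture.
TRIMMED_VALUE = /=\s*+(?<value>[^\s\#]++(?:\s++[^\s\#]++)*+)\s*+(?:\#.*+)?$/

line = "listen_addresses = 127.0.0.1 #localhost only by default"

p GREEDY_VALUE.match(line)[:value]   # => "127.0.0.1 " (note the trailing space)
p TRIMMED_VALUE.match(line)[:value]  # => "127.0.0.1"
```

If the capture is used to splice in a new value, the first form swallows the spacing before the comment, while the second leaves it in place.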