Scala Regex enable Multiline option

Posted on 2024-07-26 10:07:50

I'm learning Scala, so this is probably pretty noob-irific.

I want to have a multiline regular expression.

In Ruby it would be:

MY_REGEX = /com:Node/m

My Scala looks like:

val ScriptNode =  new Regex("""<com:Node>""")

Here's my match function:

def matchNode( value : String ) : Boolean = value match 
{
    case ScriptNode() => System.out.println( "found" + value ); true
    case _ => System.out.println("not found: " + value ) ; false
}

And I'm calling it like so:

matchNode( "<root>\n<com:Node>\n</root>" ) // doesn't work
matchNode( "<com:Node>" ) // works

I've tried:

val ScriptNode =  new Regex("""<com:Node>?m""")

And I'd really like to avoid having to use java.util.regex.Pattern. Any tips greatly appreciated.

Comments (3)

坠似风落 2024-08-02 10:07:51

This is a very common problem when first using Scala Regex.

When you use a Regex as an extractor in a Scala pattern match, it tries to match the whole string, as if the pattern were wrapped in "^" and "$" (and without multi-line mode, which would let ^ and $ match at line breaks as well).
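To make the anchoring concrete, here is a minimal sketch using the pattern from the question. The object and method names (AnchoringDemo, matchesWhole, occursIn) are made up for illustration. The extractor form succeeds only when the regex covers the entire input, while findFirstIn searches inside it.

```scala
import scala.util.matching.Regex

// Hypothetical names, for illustration only.
object AnchoringDemo {
  // Same pattern as in the question.
  val ScriptNode: Regex = """<com:Node>""".r

  // Extractor form: succeeds only if the whole string matches the pattern.
  def matchesWhole(value: String): Boolean = value match {
    case ScriptNode() => true
    case _            => false
  }

  // Search form: succeeds if the pattern occurs anywhere in the string.
  def occursIn(value: String): Boolean =
    ScriptNode.findFirstIn(value).isDefined
}
```

With the inputs from the question, matchesWhole accepts "<com:Node>" but rejects the three-line string, whereas occursIn accepts both.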

The way to do what you want would be one of the following:

def matchNode( value : String ) : Boolean = 
  (ScriptNode findFirstIn value) match {    
    case Some(v) => println( "found" + v ); true    
    case None => println("not found: " + value ) ; false
  }

This finds the first instance of ScriptNode inside value and returns that instance as v (if you want the whole string, just print value). Or else:

val ScriptNode =  new Regex("""(?s).*<com:Node>.*""")
def matchNode( value : String ) : Boolean = 
  value match {    
    case ScriptNode() => println( "found" + value ); true    
    case _ => println("not found: " + value ) ; false
  }

This prints the whole of value. In this example, (?s) activates dotall matching (i.e., "." also matches newlines), and the .* before and after the searched-for pattern lets the regex match any string that contains it. If you wanted "v" as in the first example, you could do this:

val ScriptNode =  new Regex("""(?s).*(<com:Node>).*""")
def matchNode( value : String ) : Boolean = 
  value match {    
    case ScriptNode(v) => println( "found" + v ); true    
    case _ => println("not found: " + value ) ; false
  }
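
A possible alternative to padding the pattern with .* is the unanchored method on Regex. This is a sketch that assumes Scala 2.10 or later, where that method is available; the names UnanchoredDemo and findNode are made up for illustration.

```scala
import scala.util.matching.Regex

// Hypothetical names, for illustration only; assumes Scala 2.10+.
object UnanchoredDemo {
  // .unanchored drops the whole-string requirement of the extractor,
  // so the pattern may match anywhere in the input.
  val ScriptNode: Regex = """(<com:Node>)""".r.unanchored

  // Returns the captured node text if the pattern occurs in value.
  def findNode(value: String): Option[String] = value match {
    case ScriptNode(v) => Some(v)
    case _             => None
  }
}
```

Because the pattern itself contains no ".", no (?s) flag is needed in this variant.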
情场扛把子 2024-08-02 10:07:51

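Just a quick and dirty addendum: the .r method, available on any String through an implicit conversion (RichString in old Scala versions, StringOps in current ones), converts a string to a scala.util.matching.Regex, so you can do something like this: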

"""(?s)a.*b""".r replaceAllIn ( "a\nb\nc\n", "A\nB" )

And that will return

A
B
c

I use this all the time for quick and dirty regex-scripting in the scala console.

Or in this case:

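def matchNode( value : String ) : Boolean = {
    // Build the regex with .r instead of new Regex(...)
    val ScriptNode = """(?s).*(<com:Node>).*""".r

    // Match on value itself; the extractor binds the captured group to v
    value match {
       case ScriptNode(v) => System.out.println( "found" + v ); true
       case _ => System.out.println("not found: " + value ) ; false
    }
}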

Just my attempt to reduce the use of the word new in code worldwide. ;)

彩虹直至黑白 2024-08-02 10:07:51

Just a small addition: you tried to use the (?m) (Multiline) flag (although it might not be suitable here), so here is the right way to use it:

e.g. instead of

val ScriptNode =  new Regex("""<com:Node>?m""")

use

val ScriptNode =  new Regex("""(?m)<com:Node>""")

But again, the (?s) flag is more suitable for this question (I'm adding this answer only because the title is "Scala Regex enable Multiline option").
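
The difference between the two flags can be sketched as follows. The object name FlagsDemo, its fields, and the line-anchored and root-spanning patterns are invented for this illustration and do not come from the question; only the sample text does.

```scala
import scala.util.matching.Regex

// Hypothetical names and patterns, for illustration only.
object FlagsDemo {
  val text = "<root>\n<com:Node>\n</root>"

  // (?m): ^ and $ also match at line breaks, so a line-anchored
  // pattern can find a line in the middle of the input.
  val withMultiline: Regex = """(?m)^<com:Node>$""".r
  // Without (?m), ^ and $ match only at the ends of the whole input.
  val withoutMultiline: Regex = """^<com:Node>$""".r

  // (?s): "." also matches line terminators, so ".*" can span lines.
  val withDotall: Regex = """(?s)<root>.*</root>""".r
  // Without (?s), "." stops at the first line break.
  val withoutDotall: Regex = """<root>.*</root>""".r

  // True if the regex occurs anywhere in the sample text.
  def found(r: Regex): Boolean = r.findFirstIn(text).isDefined
}
```

In short, (?m) changes what ^ and $ mean, while (?s) changes what "." means, which is why (?s) is the flag that helps with a .* that has to cross line breaks.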
