How to extract valid email addresses from a larger string in Scala
My Scala version is 2.7.7.
I'm trying to extract an email address from a larger string. The string itself follows no particular format. The code I've got:
import scala.util.matching.Regex
import scala.util.matching._
val Reg = """\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b""".r
"yo my name is joe : [email protected]" match {
case Reg(e) => println("match: " + e)
case _ => println("fail")
}
The regex passes in RegExBuilder but does not match in Scala. Also, if there is another way to do this without a regex, that would be fine too. Thanks!
Comments (3)
As Alan Moore pointed out, you need to add (?i) to the beginning of the pattern to make it case-insensitive. Also note that using the Regex directly in a pattern match tests it against the whole string. If you want to find an address within a larger string, you can call findFirstIn() or use one of the similar methods of Regex.
It looks like you're trying to do a case-insensitive search, but you aren't specifying that anywhere. Try adding (?i) to the beginning of the regex.
Well, the ways to do it other than REs are probably a lot messier. The next step up would probably be a combinator parser. A lot of random string dissection code would be even more general and almost certainly a whole lot more painful. In part, what's a suitable tactic depends on how complete (and how strict or lenient) your recognizer needs to be. E.g., the common form
Rudolf Reindeer <rudy.caribou@north_pole.rth>
is not accepted by your RE (even after the case-sensitivity is relaxed). Full-blown RFC 2822 address parsing is rather challenging for an RE-based approach.
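For reference, here is a minimal sketch that combines the two suggestions from the answers: the (?i) flag and findFirstIn(). It keeps the pattern from the question unchanged apart from the flag. The object name EmailExtract, the helper names firstEmail and allEmails, and the address joe@example.com are placeholders made up for illustration. The syntax assumes a reasonably recent Scala 2.x or 3 compiler and has not been checked against 2.7.7.

```scala
import scala.util.matching.Regex

object EmailExtract {
  // Same pattern as in the question, with (?i) prepended so that
  // [A-Z] also matches lowercase letters.
  val EmailPattern: Regex =
    """(?i)\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b""".r

  // First address found anywhere in the text, if any (hypothetical helper).
  def firstEmail(text: String): Option[String] =
    EmailPattern.findFirstIn(text)

  // All addresses found in the text, in order of appearance (hypothetical helper).
  def allEmails(text: String): List[String] =
    EmailPattern.findAllIn(text).toList

  def main(args: Array[String]): Unit = {
    // joe@example.com is a placeholder address, not the one from the question.
    val sample = "yo my name is joe : joe@example.com"
    firstEmail(sample) match {
      case Some(e) => println("match: " + e)
      case None    => println("fail")
    }
  }
}
```

findFirstIn() returns an Option, so the "fail" branch becomes the None case instead of a fall-through pattern. As the last answer notes, this pattern is still a loose recognizer and is not an RFC 2822 parser.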