Storing regex matches in Ruby?

Posted on 2024-11-16 04:40:37

I'm parsing a file with Ruby to change the data formatting. I created a regex with three match groups that I want to store temporarily in variables. I'm having trouble storing the matches, because everything comes back nil.

Here is what I have so far from what I've read.

regex = '^"(\bhttps?://[-\w+&@#/%?=~_|$!:,.;]*[\w+&@#/%=~_|$])","(\w+|[\w._%+-]+@[\w.-]+\.[a-zA-Z]{2,4})","(\w{1,30})'

begin
  file = File.new("testfile.csv", "r")
  while (line = file.gets)
    puts line
    match_array = line.scan(/regex/)
    puts $&
  end
  file.close
end

Here is some sample data that I'm using for testing.

"https://mail.google.com","Master","password1","","https://mail.google.com","",""
"https://login.sf.org","[email protected]","password2","https://login.sf.org","","ctl00$ctl00$ctl00$body$body$wacCenterStage$standardLogin$tbxUsername","ctl00$ctl00$ctl00$body$body$wacCenterStage$standardLogin$tbxPassword"
"http://www.facebook.com","Beast","12345678","https://login.facebook.com","","email","pass"
"http://www.own3d.tv","Earth","passWOrd3","http://www.own3d.tv","","user_name","user_password"

Thank you,
LF4

Comments (1)

你是暖光i 2024-11-23 04:40:37

This won't work:

match_array = line.scan(/regex/)

That just uses the literal string "regex" as your regular expression, not the contents of your regex variable. You can either put the big ugly regex directly into the scan call or create a Regexp instance:

regex = Regexp.new('^"(\bhttps?://[-\w+&@#/%?=~_|$!:,.;]*[\w+&@#/%=~_|$])","(\w+|[\w._%+-]+@[\w.-]+\.[a-zA-Z]{2,4})","(\w{1,30})')
# ...
match_array = line.scan(regex)
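
To make the difference concrete, here is a minimal sketch that compiles the pattern once and stores the three groups in variables. The helper name `extract_fields` and the constant `REGEX` are made up for illustration, and it uses `Regexp#match` with `captures` rather than `scan`, since only one match per line is expected:

```ruby
# The pattern from the question, compiled once into a Regexp object.
REGEX = Regexp.new('^"(\bhttps?://[-\w+&@#/%?=~_|$!:,.;]*[\w+&@#/%=~_|$])","(\w+|[\w._%+-]+@[\w.-]+\.[a-zA-Z]{2,4})","(\w{1,30})')

# Hypothetical helper: returns [url, user, password],
# or nil when the line does not match the pattern.
def extract_fields(line)
  match = REGEX.match(line)
  match && match.captures
end

# One row from the sample data in the question.
line = '"http://www.facebook.com","Beast","12345678","https://login.facebook.com","","email","pass"'

# /regex/ searches for the five letters r-e-g-e-x, so it finds nothing here.
literal_hits = line.scan(/regex/)

# Destructure the three capture groups into local variables.
url, user, password = extract_fields(line)
puts "#{url} | #{user} | #{password}"
```

If a line does not match, `extract_fields` returns nil and all three variables end up nil, so it is probably worth checking the return value before using them.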

You should probably also use a CSV library (one comes with Ruby, in both 1.8.7 and 1.9) to parse the CSV file, then apply a regular expression to each column from the CSV. You'll run into fewer quoting and escaping issues that way.
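
A rough sketch of the CSV approach might look like the following. It assumes the `csv` library is loadable with `require "csv"` (recent Ruby versions ship it as a bundled gem), and the names `URL_PATTERN` and `records` are illustrative only; the per-column check is deliberately much simpler than the original pattern:

```ruby
require "csv"

# Two rows taken from the sample data in the question.
data = <<~ROWS
  "http://www.facebook.com","Beast","12345678","https://login.facebook.com","","email","pass"
  "http://www.own3d.tv","Earth","passWOrd3","http://www.own3d.tv","","user_name","user_password"
ROWS

# Illustrative per-column check: the first field should look like an http(s) URL.
URL_PATTERN = %r{\Ahttps?://}

records = []
CSV.parse(data) do |row|
  # The CSV parser has already handled the quotes and commas,
  # so the first three columns can be taken directly.
  url, user, password = row.first(3)
  next unless URL_PATTERN.match?(url.to_s)
  records << { url: url, user: user, password: password }
end

records.each { |r| puts "#{r[:url]} | #{r[:user]} | #{r[:password]}" }
```

Because the parser deals with quoting, each column can be validated with a small pattern of its own instead of one expression that has to account for the delimiters as well.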
