How can I make capturing groups optional with Ruby's scan method?

Posted on 2024-12-21 07:14:19

I have a regex like this:

(.*?)("DisplayName":.*?)(,)(.*?"Groups":?)?(\[.*?\])?(,)(.*?"Phones":)?(\[.*?\])?(.*?\},)?

with which I want to process a string like this:

{"Affinity":20,"DisplayName":"Moe Larry","Emails":[{"Address":"[email protected]","Primary":true,"Type":{"Id":"HOME"}}],"FullName":{"FamilyName":"Larry","GivenName":"Moe","Unstructured":"Moe Larry"},"Groups":[{"id":"^Mine"}],"Id":"1234567890","MailsSent":0,"Name":"Moe Larry","Phones":[{"Number":"555-999-6661","Type":{"Id":"MOBILE"}}],"ProfileLink":""},{"Affinity":20,"DisplayName":"stoogesarefunny","Emails":[{"Address":"stoogesarefunny","Primary":true}],"EvergreenPhoto":"/photos/private/adflk;jsd394u75430o8752380974321jtkasdljf8937489213749832654","Id":"834754hthbf83744823f","MailsSent":0},{"Affinity":20,"DisplayName":"[email protected]","Emails":[{"Address":"[email protected]","Primary":true}],"EvergreenPhoto":"/photos/private/asdfAJDKLJSFIOEJHLTHSJKLDF234987s897KJHSDFKJHDF89273473ASLKJDLSKJIFEIH","Id":"834754hthbf83744823f","MailsSent":0,"ProfileLink":"https://profiles.google.com/stoogesarefunny"},{"Affinity":20,"DisplayName":"Shemp","FullName":{"GivenName":"Shemp","Unstructured":"Shemp"},"Groups":[{"id":"^Mine"}],"Id":"1234567890","MailsSent":0,"Name":"Shemp","Phones":[{"Number":"+15553085671","Type":{"Id":"OTHER"}}]},{"Affinity":20,"DisplayName":"ClownFace","FullName":{"GivenName":"ClownFace","Unstructured":"ClownFace"},"Groups":[{"id":"^Mine"}],"Id":"1234567890","MailsSent":0,"Name":"ClownFace","Phones":[{"Number":"+15556064040","Type":{"Id":"OTHER"}}]},

It's really effing ugly, I know. I wish I could find an XML feed, but that's not an option right now.

All I care about are DisplayName, Groups, and Phones. I need to extract and save them in an array of arrays. The capturing groups for Groups and Phones need to be optional because not all contacts have them. However, my regex gives me:

Result 1

1. {"Affinity":20,
2. "DisplayName":"Moe Larry"
3. ,
4. "Emails":[{"Address":"[email protected]","Primary":true,"Type":{"Id":"HOME"}}],"FullName":{"FamilyName":"Larry","GivenName":"Moe","Unstructured":"Moe Larry"},"Groups":
5. [{"id":"^Mine"}]
6. ,
7. "Id":"1234567890","MailsSent":0,"Name":"Moe Larry","Phones":
8. [{"Number":"555-999-6661","Type":{"Id":"MOBILE"}}]
9. ,"ProfileLink":""},

Result 2

1. {"Affinity":20,
2. "DisplayName":"stoogesarefunny"
3. ,
4. "Emails":[{"Address":"stoogesarefunny","Primary":true}],"EvergreenPhoto":"/photos/private/adflk;jsd394u75430o8752380974321jtkasdljf8937489213749832654","Id":"834754hthbf83744823f","MailsSent":0},{"Affinity":20,"DisplayName":"[email protected]","Emails":[{"Address":"[email protected]","Primary":true}],"EvergreenPhoto":"/photos/private/asdfAJDKLJSFIOEJHLTHSJKLDF234987s897KJHSDFKJHDF89273473ASLKJDLSKJIFEIH","Id":"834754hthbf83744823f","MailsSent":0,"ProfileLink":"https://profiles.google.com/stoogesarefunny"},{"Affinity":20,"DisplayName":"Shemp","FullName":{"GivenName":"Shemp","Unstructured":"Shemp"},"Groups":
5. [{"id":"^Mine"}]
6. ,
7. "Id":"1234567890","MailsSent":0,"Name":"Shemp","Phones":
8. [{"Number":"+15553085671","Type":{"Id":"OTHER"}}]
9. },

Result 3

1. {"Affinity":20,
2. "DisplayName":"ClownFace"
3. ,
4. "FullName":{"GivenName":"ClownFace","Unstructured":"ClownFace"},"Groups":
5. [{"id":"^Mine"}]
6. ,
7. "Id":"1234567890","MailsSent":0,"Name":"ClownFace","Phones":
8. [{"Number":"+15556064040","Type":{"Id":"OTHER"}}]
9. },

Clearly, all of Shemp's contact data is being subsumed into [email protected]'s data, because my regex keeps chomping away until it reaches Shemp's Groups instead of stopping before his DisplayName and starting over. Help?

P.S.: No, I don't plan to save all these groups in the end; they are only there so I can study what is going on.
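
The swallowing described above can be reproduced on a much shorter string. This is only a minimal sketch: the two-contact input and both simplified patterns are made up for illustration and are not the original data or regex. It shows that an optional group containing a lazy .*? is still attempted first, so it can run past the end of one contact to find "Groups": in a later one.

```ruby
# Made-up input (not the original data): only the second contact has "Groups".
s = '{"DisplayName":"A","Id":"1"},{"DisplayName":"B","Groups":[{"id":"x"}]},'

# The optional group is tried first, and its lazy .*? may grow past "},{"
# until it finds "Groups": in a later contact.
lazy = /"DisplayName":"(.*?)"(?:.*?"Groups":(\[.*?\]))?/
lazy_names = s.scan(lazy).map(&:first)

# [^{}]*? cannot cross a brace, so the optional group fails inside contact A
# and the next match starts at contact B.
bounded = /"DisplayName":"(.*?)"(?:[^{}]*?"Groups":(\[.*?\]))?/
bounded_names = s.scan(bounded).map(&:first)

p lazy_names     # => ["A"]
p bounded_names  # => ["A", "B"]
```

The bounded pattern only behaves this way when nothing wrapped in braces sits between DisplayName and Groups. The real string has nested objects there (Emails, FullName), so treat this as an illustration of the cause rather than a fix.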

Comments (1)

给不了的爱 2024-12-28 07:14:19

Your input looks like JSON, for which Ruby parsers already exist:

gem install json

Then in Ruby:

require 'json'
data = JSON.parse(string)

You can then access data directly as a hash object, for example:

data = '
  {"Affinity":20,
    "DisplayName":"Moe Larry",
    "Emails":[{"Address":"[email protected]","Primary":true,"Type":{"Id":"HOME"}}],
    "FullName":{"FamilyName":"Larry","GivenName":"Moe","Unstructured":"Moe Larry"},
    "Groups":[{"id":"^Mine"}],
    "Id":"1234567890",
    "MailsSent":0,
    "Name":"Moe Larry",
    "Phones":[{"Number":"555-999-6661","Type":{"Id":"MOBILE"}}],
    "ProfileLink":""
  }
'

require 'json'
user = JSON.parse(data)
user.class                    # => Hash
user.keys                     # => ["Affinity", "DisplayName", "Emails", "FullName", "Groups", "Id", "MailsSent", "Name", "Phones", "ProfileLink"]
user['Affinity']              # => 20
user['DisplayName']           # => "Moe Larry"
user['Emails']                # => [{"Address"=>"[email protected]", "Primary"=>true, "Type"=>{"Id"=>"HOME"}}]
user['Emails'].class          # => Array
user['Emails'][0]             # => {"Address"=>"[email protected]", "Primary"=>true, "Type"=>{"Id"=>"HOME"}}
user['Emails'][0]['Address']  # => "[email protected]"
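
The string in the question holds several contacts separated by commas, with a trailing comma and no enclosing brackets, so it is not a single JSON document as posted. A minimal sketch of one way to handle that, assuming the input always has that shape: strip the trailing comma, wrap the text in [ ], parse it as an array, and build one [DisplayName, Groups, Phones] row per contact. The shortened raw string and the names raw, contacts and rows are placeholders chosen for this example.

```ruby
require 'json'

# Shortened stand-in for the string in the question: objects separated by
# commas, with a trailing comma and no enclosing brackets.
raw = '{"Affinity":20,"DisplayName":"Moe Larry","Groups":[{"id":"^Mine"}],' \
      '"Phones":[{"Number":"555-999-6661","Type":{"Id":"MOBILE"}}]},' \
      '{"Affinity":20,"DisplayName":"stoogesarefunny","MailsSent":0},'

# Drop the trailing comma and wrap in [ ] so the parser sees one JSON array.
contacts = JSON.parse("[#{raw.strip.chomp(',')}]")

# One [DisplayName, Groups, Phones] row per contact; missing keys become [].
rows = contacts.map do |c|
  [c['DisplayName'], c.fetch('Groups', []), c.fetch('Phones', [])]
end

rows.each { |row| p row }
```

Using fetch with a default keeps every row the same length, so contacts without Groups or Phones get an empty array instead of nil.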