I have a regex like this:
(.*?)("DisplayName":.*?)(,)(.*?"Groups":?)?(\[.*?\])?(,)(.*?"Phones":)?(\[.*?\])?(.*?\},)?
with which I want to process a string like this:
{"Affinity":20,"DisplayName":"Moe Larry","Emails":[{"Address":"[email protected]","Primary":true,"Type":{"Id":"HOME"}}],"FullName":{"FamilyName":"Larry","GivenName":"Moe","Unstructured":"Moe Larry"},"Groups":[{"id":"^Mine"}],"Id":"1234567890","MailsSent":0,"Name":"Moe Larry","Phones":[{"Number":"555-999-6661","Type":{"Id":"MOBILE"}}],"ProfileLink":""},{"Affinity":20,"DisplayName":"stoogesarefunny","Emails":[{"Address":"stoogesarefunny","Primary":true}],"EvergreenPhoto":"/photos/private/adflk;jsd394u75430o8752380974321jtkasdljf8937489213749832654","Id":"834754hthbf83744823f","MailsSent":0},{"Affinity":20,"DisplayName":"[email protected]","Emails":[{"Address":"[email protected]","Primary":true}],"EvergreenPhoto":"/photos/private/asdfAJDKLJSFIOEJHLTHSJKLDF234987s897KJHSDFKJHDF89273473ASLKJDLSKJIFEIH","Id":"834754hthbf83744823f","MailsSent":0,"ProfileLink":"https://profiles.google.com/stoogesarefunny"},{"Affinity":20,"DisplayName":"Shemp","FullName":{"GivenName":"Shemp","Unstructured":"Shemp"},"Groups":[{"id":"^Mine"}],"Id":"1234567890","MailsSent":0,"Name":"Shemp","Phones":[{"Number":"+15553085671","Type":{"Id":"OTHER"}}]},{"Affinity":20,"DisplayName":"ClownFace","FullName":{"GivenName":"ClownFace","Unstructured":"ClownFace"},"Groups":[{"id":"^Mine"}],"Id":"1234567890","MailsSent":0,"Name":"ClownFace","Phones":[{"Number":"+15556064040","Type":{"Id":"OTHER"}}]},
It's really effing ugly, I know. I wish I could find an XML feed, but that's not an option right now.
All I care about are DisplayName, Groups, and Phones. I need to extract and save them in an array of arrays. The capturing groups for Groups and Phones need to be optional because not all contacts have them. However, my regex gives me:
Result 1
1. {"Affinity":20,
2. "DisplayName":"Moe Larry"
3. ,
4. "Emails":[{"Address":"[email protected]","Primary":true,"Type":{"Id":"HOME"}}],"FullName":{"FamilyName":"Larry","GivenName":"Moe","Unstructured":"Moe Larry"},"Groups":
5. [{"id":"^Mine"}]
6. ,
7. "Id":"1234567890","MailsSent":0,"Name":"Moe Larry","Phones":
8. [{"Number":"555-999-6661","Type":{"Id":"MOBILE"}}]
9. ,"ProfileLink":""},
Result 2
1. {"Affinity":20,
2. "DisplayName":"stoogesarefunny"
3. ,
4. "Emails":[{"Address":"stoogesarefunny","Primary":true}],"EvergreenPhoto":"/photos/private/adflk;jsd394u75430o8752380974321jtkasdljf8937489213749832654","Id":"834754hthbf83744823f","MailsSent":0},{"Affinity":20,"DisplayName":"[email protected]","Emails":[{"Address":"[email protected]","Primary":true}],"EvergreenPhoto":"/photos/private/asdfAJDKLJSFIOEJHLTHSJKLDF234987s897KJHSDFKJHDF89273473ASLKJDLSKJIFEIH","Id":"834754hthbf83744823f","MailsSent":0,"ProfileLink":"https://profiles.google.com/stoogesarefunny"},{"Affinity":20,"DisplayName":"Shemp","FullName":{"GivenName":"Shemp","Unstructured":"Shemp"},"Groups":
5. [{"id":"^Mine"}]
6. ,
7. "Id":"1234567890","MailsSent":0,"Name":"Shemp","Phones":
8. [{"Number":"+15553085671","Type":{"Id":"OTHER"}}]
9. },
Result 3
1. {"Affinity":20,
2. "DisplayName":"ClownFace"
3. ,
4. "FullName":{"GivenName":"ClownFace","Unstructured":"ClownFace"},"Groups":
5. [{"id":"^Mine"}]
6. ,
7. "Id":"1234567890","MailsSent":0,"Name":"ClownFace","Phones":
8. [{"Number":"+15556064040","Type":{"Id":"OTHER"}}]
9. },
Clearly, all of Shemp's contact data is being subsumed into [email protected]'s data, because my regex keeps chomping away until it reaches Shemp's "Groups" instead of stopping before his "DisplayName" and starting over. Help?
P.S.: No, I don't plan to keep all these groups in the end; they're only there so I can study what is going on.
Comments (1)
Your input looks like JSON, for which there are already parsers for Ruby:
Then in Ruby:
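A minimal sketch of the parsing step, assuming Ruby's standard json library. The string in the question is a comma-separated run of objects with a trailing comma and no enclosing brackets, so this sketch assumes it has to be wrapped into an array before parsing. The raw value is a shortened, hypothetical sample in the same shape, not the full input:

```ruby
require 'json'

# Hypothetical shortened sample in the same shape as the question's input:
# comma-separated objects, a trailing comma, and no enclosing brackets.
raw = '{"Affinity":20,"DisplayName":"Moe Larry","Groups":[{"id":"^Mine"}],' \
      '"Phones":[{"Number":"555-999-6661","Type":{"Id":"MOBILE"}}]},' \
      '{"Affinity":20,"DisplayName":"stoogesarefunny","MailsSent":0},'

# Drop the trailing comma and wrap the objects in brackets so the text
# becomes a valid JSON array.
json_text = "[#{raw.strip.chomp(',')}]"

# Each contact becomes a Hash inside an Array.
data = JSON.parse(json_text)
```

If the real feed already arrives as a proper JSON array, the wrapping line can be dropped and the text passed to JSON.parse unchanged.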
You can then access data directly as a hash object, for example:
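As a sketch of what that access might look like for the three fields the question cares about, assuming data is an array with one hash per contact. The sample contacts and the contacts variable are hypothetical, and fetch with a default covers contacts that have no "Groups" or "Phones":

```ruby
require 'json'

# Hypothetical two-contact sample; the second contact has neither
# "Groups" nor "Phones", like some contacts in the question.
data = JSON.parse(<<~EOS)
  [
    {"DisplayName":"Shemp","Groups":[{"id":"^Mine"}],
     "Phones":[{"Number":"+15553085671","Type":{"Id":"OTHER"}}]},
    {"DisplayName":"stoogesarefunny","MailsSent":0}
  ]
EOS

# Build an array of arrays: [display name, group ids, phone numbers].
# fetch(key, []) returns an empty list when the key is missing.
contacts = data.map do |contact|
  [
    contact["DisplayName"],
    contact.fetch("Groups", []).map { |group| group["id"] },
    contact.fetch("Phones", []).map { |phone| phone["Number"] }
  ]
end
```

Because every object is parsed on its own, a contact without "Groups" cannot swallow the next contact's data the way the lazy regex did.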