Parsing a "family" name into people's first names + a last name with a regex
Given the following string, I'd like to parse into a list of first names + a last name:
Peter-Paul, Mary & Joël Van der Winkel
(and the simpler versions)
I'm trying to work out if I can do this with a regex. I've got this far:
(?:([^, &]+))[, &]*(?:([^, &]+))
But the problem here is that I'd like the last name to be captured in a different capture.
I suspect I'm beyond what's possible, but just in case...
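To make the limitation concrete, here is a minimal sketch that runs the first-attempt pattern over the sample string and lists what each match captures. The class and method names (`FirstAttemptDemo`, `Pairs`) are made up for illustration. Under these assumptions the pattern simply pairs up tokens, so the last name never lands in a capture of its own.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FirstAttemptDemo
{
    // The first-attempt pattern from the question, unchanged.
    private const string Pattern = @"(?:([^, &]+))[, &]*(?:([^, &]+))";

    // Hypothetical helper: returns (group 1, group 2) for every match in the input.
    public static List<(string, string)> Pairs(string input)
    {
        var pairs = new List<(string, string)>();
        foreach (Match m in Regex.Matches(input, Pattern))
            pairs.Add((m.Groups[1].Value, m.Groups[2].Value));
        return pairs;
    }

    public static void Main()
    {
        // Each match consumes two tokens, whatever role they play in the name.
        foreach (var (a, b) in Pairs("Peter-Paul, Mary & Joël Van der Winkel"))
            Console.WriteLine($"1=\"{a}\" 2=\"{b}\"");
    }
}
```

Because each match only ever sees two adjacent tokens, there is nothing in this pattern that could tell "Van" (part of the last name) apart from a first name.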
UPDATE
Extracting captures from the group was new for me, so here's the (C#) code I used:
using System;
using System.Text.RegularExpressions;

string familyName = "Peter-Paul, Mary & Joël Van der Winkel";
string firstperson = @"^(?<First>[-\w]+)"; // .NET syntax for a named capture
string lastname = @"\s+(?<Last>.*)"; // everything after the last first name
string others = @"(?:(?:\s*[,|&]\s*)(?<Others>[-\w]+))*"; // note: | is a literal inside [...]
var reg = new Regex(firstperson + others + lastname);
var groups = reg.Match(familyName).Groups;
Console.WriteLine("LastName=" + groups["Last"].Value);
Console.WriteLine("First person=" + groups["First"].Value);
// A repeated group keeps every value it matched in .Captures
foreach (Capture firstname in groups["Others"].Captures)
    Console.WriteLine("Other person=" + firstname.Value);
I had to tweak the accepted answer slightly to get it to cover cases such as:
Peter-Paul&Joseph Van der Winkel
Peter-Paul & Joseph Van der Winkel
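For anyone wanting to check these variants quickly, here is a rough sketch that wraps the same three pattern fragments in a small helper and runs it over the sample inputs. `FamilyNameParser` and `TryParse` are names invented for this example, and returning `false` on a non-match is my own choice rather than anything from the answers.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FamilyNameParser
{
    // The three fragments from the question, concatenated in the same order.
    private static readonly Regex Pattern = new Regex(
        @"^(?<First>[-\w]+)" +
        @"(?:(?:\s*[,|&]\s*)(?<Others>[-\w]+))*" +
        @"\s+(?<Last>.*)");

    // Hypothetical helper: returns false when the input does not match at all.
    public static bool TryParse(string input, out List<string> firstNames, out string lastName)
    {
        firstNames = new List<string>();
        lastName = "";
        Match m = Pattern.Match(input);
        if (!m.Success) return false;

        firstNames.Add(m.Groups["First"].Value);
        // The repeated group exposes every first name it matched via Captures.
        foreach (Capture c in m.Groups["Others"].Captures)
            firstNames.Add(c.Value);
        lastName = m.Groups["Last"].Value;
        return true;
    }

    public static void Main()
    {
        string[] samples =
        {
            "Peter-Paul, Mary & Joël Van der Winkel",
            "Peter-Paul&Joseph Van der Winkel",
            "Peter-Paul & Joseph Van der Winkel",
        };
        foreach (string s in samples)
        {
            if (TryParse(s, out var firsts, out var last))
                Console.WriteLine(string.Join(" / ", firsts) + " => " + last);
            else
                Console.WriteLine("no match: " + s);
        }
    }
}
```

The `\s*` on both sides of the separator is what lets `&` appear with or without surrounding spaces.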
Assuming a first name cannot be two words separated by a space (otherwise "Peter Paul Van der Winkel" is not automatically parsable), the following set of rules applies:
Everything left is the last name.
^([-\w]+)(?:(?:\s?[,|&]\s)([-\w]+)\s?)*(.*)
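As a rough illustration of how a pattern like `^([-\w]+)(?:(?:\s?[,|&]\s)([-\w]+)\s?)*(.*)` could be consumed from C#, the sketch below reads the first name from group 1, the further first names from the captures of group 2, and the remainder from group 3. The class and method names (`AnswerPatternDemo`, `Describe`) and the output format are assumptions made for this example only.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AnswerPatternDemo
{
    // Numbered groups: 1 = first person, 2 = further first names (repeated), 3 = the rest.
    private static readonly Regex Pattern =
        new Regex(@"^([-\w]+)(?:(?:\s?[,|&]\s)([-\w]+)\s?)*(.*)");

    // Hypothetical helper: formats the result as "first, first, ... | last".
    public static string Describe(string input)
    {
        Match m = Pattern.Match(input);
        if (!m.Success) return "(no match)";

        var names = new List<string> { m.Groups[1].Value };
        // Group 2 sits inside a repeated group, so read Captures rather than Value.
        foreach (Capture c in m.Groups[2].Captures)
            names.Add(c.Value);
        return string.Join(", ", names) + " | " + m.Groups[3].Value;
    }

    public static void Main()
    {
        Console.WriteLine(Describe("Peter-Paul, Mary & Joël Van der Winkel"));
        Console.WriteLine(Describe("Peter-Paul&Joseph Van der Winkel"));
    }
}
```

Note that this pattern requires whitespace after the separator, so for an input such as `Peter-Paul&Joseph Van der Winkel` the `&Joseph` part appears to fall through into the last group instead of being picked up as a first name.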
Seems that this might do the trick: