I have an input corpus whose entries can be in any of the following formats:
Name: ABC Def Date: 8-01-09 Age: 5
(First Name + Last Name) //Expected Output: ABC Def
Name: Abc Date: 8-01-09 Age: 5
(Only first name present) //Expected Output: Abc
Name ABC Date 8-01-09 Age 5
(No colon after tags) //Expected Output: ABC
Date 8-01-09 Name ABC DEF Age 5
(Name tag in a random location) //Expected Output: ABC DEF
Current solution: I can hardcode a search for the Name tag and take the word that follows it, up to the first space. But I am not sure how to extract the value when it is First Name + Last Name (essentially, everything up to the next tag).
Any help will be much appreciated. Thanks!
Comments (1)
Try (regex101):
Prints:
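One way to capture everything between the Name tag and the next tag is a lazy group followed by a lookahead. This is only a minimal sketch, not necessarily the pattern from the regex101 link: it assumes Python, and it assumes the full set of tags is known in advance (`Name`, `Date`, `Age`). The names `TAGS`, `NAME_RE`, and `extract_name` are made up for illustration.

```python
import re

# Assumption: every tag that can appear in the corpus is listed here.
TAGS = ("Name", "Date", "Age")

# "Name", an optional colon, then a lazy capture that stops right before
# the next known tag (with or without a colon) or at the end of the string.
_tag_alt = "|".join(map(re.escape, TAGS))
NAME_RE = re.compile(
    rf"\bName\b:?\s*(.+?)\s*(?=\b(?:{_tag_alt})\b:?|$)",
    re.IGNORECASE,
)

def extract_name(text):
    """Return the text following the Name tag, or None if there is no Name tag."""
    m = NAME_RE.search(text)
    return m.group(1) if m else None

samples = [
    "Name: ABC Def Date: 8-01-09 Age: 5",
    "Name: Abc Date: 8-01-09 Age: 5",
    "Name ABC Date 8-01-09 Age 5",
    "Date 8-01-09 Name ABC DEF Age 5",
]
for s in samples:
    print(extract_name(s))
```

For the four sample lines from the question this should print `ABC Def`, `Abc`, `ABC`, and `ABC DEF`. Note that a name which itself contains one of the tag words as a separate word (for example a surname "Age") would be cut short, so the tag list may need adjusting for real data.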