Pandas: split a full name into first, middle, and last name

Posted 2025-02-13 05:31:43 · 547 characters · 2 views · 0 comments

I have a pandas DataFrame with a fullnames field. I want to change the logic so that the first name is always the first word, the last name is always the last word, and everything in between goes into the middle name field.

Note: a full name may contain only two words, in which case the middle name should be null. There may also be extra spaces between the names.

Current logic:

fullnames = "Walter John  Ross Schmidt"
first, middle, *last = fullnames.split()
print("First = {first}".format(first=first))
print("Middle = {middle}".format(middle=middle))
print("Last = {last}".format(last=" ".join(last)))

Output:

First = Walter
Middle = John
Last = Ross Schmidt

Expected output:

FirstName = Walter
Middle = John Ross
Last = Schmidt

Comments (4)

初吻给了烟 2025-02-20 05:31:43

You can use negative indexing to get the last item in the list for the last name and also use a slice to get all but the first and last for the middle name:

fullnames = "Walter John  Ross Schmidt"
first = fullnames.split()[0]
last = fullnames.split()[-1]
middle = " ".join(fullnames.split()[1:-1])
print("First = {first}".format(first=first))
print("Middle = {middle}".format(middle=middle))
print("Last = {last}".format(last=last))

PS: If you are working with a DataFrame, you can use:

import pandas as pd

df = pd.DataFrame({'fullnames':['Walter John  Ross Schmidt']})
df = df.assign(**{
    'first': df['fullnames'].str.split().str[0],
    'middle': df['fullnames'].str.split().str[1:-1].str.join(' '),
    'last': df['fullnames'].str.split().str[-1]
})

Output:

   fullnames                  first   middle     last
0  Walter John  Ross Schmidt  Walter  John Ross  Schmidt

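For a two-word name, the slice [1:-1] is empty, so the joined middle name comes out as an empty string rather than null. Below is a minimal sketch of one way to get a null instead. The second sample row ("John Adams") and the use of Series.mask are assumptions added for illustration, not part of the answer above. It also splits each name only once.

```python
import pandas as pd

# "John Adams" is a hypothetical two-word row added to show the null case.
df = pd.DataFrame({"fullnames": ["Walter John  Ross Schmidt", "John Adams"]})

# split() with no arguments splits on any run of whitespace,
# so the double space in the first row is harmless.
parts = df["fullnames"].str.split()

df["first"] = parts.str[0]
df["last"] = parts.str[-1]

# Join the words between first and last; an empty slice joins to "".
middle = parts.str[1:-1].str.join(" ")

# mask() replaces entries where the condition is True with NaN.
df["middle"] = middle.mask(middle == "")

print(df[["first", "middle", "last"]])
```

A name with a single word would end up with the same value in both first and last, so it may be worth filtering such rows beforehand if they can occur.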
我要还你自由 2025-02-20 05:31:43

You can use capture groups in the regex passed to str.extract(), which lets you do this in a single operation:

import re
import pandas as pd

df = pd.DataFrame({
    "name": [
        "Walter John  Ross Schmidt",
        "John Quincy Adams"
    ]
})

rx = re.compile(r'^(\w+)\s+(.*?)\s+(\w+)$')
df[['first', 'middle', 'last']] = df['name'].str.extract(pat=rx, expand=True)

This gives you:

    name                        first   middle      last
0   Walter John  Ross Schmidt   Walter  John  Ross  Schmidt
1   John Quincy Adams           John    Quincy      Adams
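The pattern above needs at least three words to match, and its lazy middle group keeps whatever spacing sits between the middle words. The following is a rough sketch of a variant that tolerates two-word names and collapses extra spaces first. The optional non-capturing group, the whitespace normalisation step, and the extra "John Adams" row are all assumptions added here, not part of the answer.

```python
import pandas as pd

# "John Adams" is a hypothetical row added to exercise the two-word case.
df = pd.DataFrame({
    "name": [
        "Walter John  Ross Schmidt",
        "John Quincy Adams",
        "John Adams",
    ]
})

# Collapse runs of whitespace so the middle group contains single spaces.
clean = df["name"].str.split().str.join(" ")

# (?:(.*?)\s+)? makes the middle name (and the space after it) optional.
# When the group does not participate in the match, extract() yields NaN.
pattern = r"^(\w+)\s+(?:(.*?)\s+)?(\w+)$"
df[["first", "middle", "last"]] = clean.str.extract(pattern, expand=True)

print(df)
```

Because \w+ is used for the first and last name, names containing hyphens or apostrophes would not match this sketch without adjusting the character classes.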

橘虞初梦 2025-02-20 05:31:43

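I would use str.replace and str.extract here:

df["FirstName"] = df["FullName"].str.extract(r'^(\w+)')
# regex=True is needed on pandas 2.0+, where str.replace treats the pattern literally by default
df["Middle"] = df["FullName"].str.replace(r'^\w+\s+|\s+\w+$', '', regex=True)
df["Last"] = df["FullName"].str.extract(r'(\w+)$')
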
傾旎 2025-02-20 05:31:43

You can use the following line instead.

first, *middle, last = fullnames.split()
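The starred unpacking works on a single string. As a sketch of how it might be applied to a DataFrame column, the helper below wraps it in a function and uses Series.apply. The function name split_name, the sample rows, and the choice to return None for a missing middle name are assumptions for illustration only.

```python
import pandas as pd

def split_name(fullname):
    """Split a full name into first, middle, and last parts (hypothetical helper)."""
    # *middle collects every word between the first and the last,
    # and is an empty list for a two-word name.
    first, *middle, last = fullname.split()
    return pd.Series({
        "first": first,
        "middle": " ".join(middle) or None,  # "" becomes None
        "last": last,
    })

df = pd.DataFrame({"fullnames": ["Walter John  Ross Schmidt", "John Adams"]})

# apply() with a function that returns a Series produces one column per key.
df[["first", "middle", "last"]] = df["fullnames"].apply(split_name)

print(df)
```

Note that the unpacking raises ValueError for a name with fewer than two words, so single-word or empty values would need to be handled separately.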