How to remove unwanted substrings from a string in Python

Posted 2025-01-14 01:56:24, 719 characters, 2 views, 0 comments

I have a column in my dataset that contains the following sample data:

player
David Johnson*+\JohnDa08
Kareem Hunt*\HuntKa00
Melvin Gordon\GordMe00

and I'm trying to make it look like this using Python:

player
David Johnson
Kareem Hunt
Melvin Gordon

Please help.

Comments (2)

烟若柳尘 2025-01-21 01:56:24

In Python, when you have a str and want to remove a substring, you can use the .replace method:

>>> a = "Hello!"
>>> a.replace("Hello", '')
'!'
>>> a = a.replace("Hello", '')

In your case, the simplest thing to do is this:

>>> s = "Kareem Hunt*\\HuntKa00"
>>> s = s.replace("*\\", ' ').replace("00", '')

Or, to make sure the 00 is removed only from the end of the string (str.removesuffix requires Python 3.9 or later):

>>> s = s.replace("*\\", ' ').removesuffix("00")

Since not all of your strings end with 00 (some end with 08, for example), I would suggest this:

>>> s = s.replace("*\\", ' ')[:-2]

which drops the last two characters of the string.
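
Note that the snippets above still leave the ID fragment (for example `HuntKa`) in the string. A minimal sketch of one way to get just the name, assuming the name always comes before the first backslash and that `*` and `+` are the only trailing markers. The helper name `clean_player` is made up for illustration and is not part of the original answer:

```python
def clean_player(raw: str) -> str:
    """Return the text before the first backslash, without trailing '*' / '+' markers."""
    # partition() splits at the first backslash; if there is none,
    # `name` is simply the whole input string.
    name, _sep, _player_id = raw.partition("\\")
    # rstrip() with a character set removes any trailing mix of '*', '+' and spaces.
    return name.rstrip("*+ ")


# Sample values copied from the question
samples = [
    "David Johnson*+\\JohnDa08",
    "Kareem Hunt*\\HuntKa00",
    "Melvin Gordon\\GordMe00",
]
cleaned = [clean_player(s) for s in samples]
print(cleaned)
```

This avoids relying on the ID having a fixed length or a fixed suffix, which is what makes the `[:-2]` and `removesuffix("00")` variants fragile.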

踏月而来 2025-01-21 01:56:24

You can split on the first special character and get the first chunk:

df['player'] = df['player'].str.split(r'[^\w ]', n=1).str[0]

Or, using replace:

df['player'] = df['player'].str.replace(r'[^\w ].*$', '', regex=True)

Output:

          player
0  David Johnson
1    Kareem Hunt
2  Melvin Gordon
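
To see what the pattern does without pandas, the same regex can be tried on plain strings with the standard `re` module. This is only a small sketch, assuming that the first character that is neither a word character nor a space (`[^\w ]`) marks where the name ends. The `samples` list is copied from the question, and the variable names are illustrative:

```python
import re

samples = [
    "David Johnson*+\\JohnDa08",
    "Kareem Hunt*\\HuntKa00",
    "Melvin Gordon\\GordMe00",
]

# Split once at the first character that is neither a word character nor a space,
# then keep the chunk before it.
by_split = [re.split(r"[^\w ]", s, maxsplit=1)[0] for s in samples]

# Or delete everything from that first special character to the end of the string.
by_sub = [re.sub(r"[^\w ].*$", "", s) for s in samples]

print(by_split)
print(by_sub)
```

One caveat with this pattern: names containing an apostrophe or a hyphen would be cut at that character, so the character class may need to be widened for such data.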
