A problem with re.split(): extracting data from a string (splitting a string)
I have been trying to split this string, but it only gives me the last character of the username I want. For example, in this dataset I want to separate the username from the actual message, but after running this code:
# How can we separate users from messages?
users = []
messages = []
for message in df['user_message']:
    entry = re.split('([a-zA-Z]|[0-9])+#[0-9]+\\n', message)
    if entry[1:]:
        users.append(entry[1])
        messages.append(entry[2])
    else:
        users.append('notif')
        messages.append(entry[0])
df['user'] = users
df['message'] = messages
df.drop(columns=['user_message'], inplace=True)
df.head(30)
I only get the last character of each username. Could someone please tell me why it only gives me the last character of the string I want to split, and how I can fix it? Thanks a lot, this means a lot.
Comments (2)
Splitting is not really the string operation you want here. Instead, just use str.extract directly on the user_message column. The logic is to extract the leading part of the user message, from the beginning until the first hash symbol.
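A minimal sketch of this approach, assuming each row looks like "name#1234", a newline, then the message text (the shape implied by the regex in the question). The sample rows and both patterns are illustrative assumptions, not the answerer's exact code.

```python
import pandas as pd

# Hypothetical sample data: two "name#1234\nmessage" rows plus one
# notification row that has no username prefix.
df = pd.DataFrame({
    "user_message": [
        "alice#1234\nhello there",
        "bob42#0007\nhow are you?",
        "server maintenance tonight",
    ]
})

# Capture everything from the start of the string up to the first '#'.
# expand=False returns a Series; rows that do not match become NaN.
df["user"] = df["user_message"].str.extract(r"^([^#\n]+)#\d+\n", expand=False)
df["user"] = df["user"].fillna("notif")

# Strip the "name#1234\n" prefix so only the message text remains.
df["message"] = df["user_message"].str.replace(r"^[^#\n]+#\d+\n", "", regex=True)
```

Rows without a username prefix fall back to 'notif', mirroring the else branch in the question's loop.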
You could do this a lot more simply by just using string.split() and setting maxsplit to 1. See the example below. Note that regex is very useful, but it's very easy to get incorrect results with it. I advise using an online regex validator if you really need to use it. As for the actual regex, your + is in the wrong place. You need to move it inside the group. I used regex101.com for testing...
string.split() example:
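A minimal sketch of both suggestions, assuming the same "name#1234", newline, message layout. The sample string and variable names are hypothetical. The second half shows why the original pattern returns one character: when a capture group is repeated with +, only its last iteration is kept.

```python
import re

# Hypothetical sample in the "name#1234\nmessage" shape implied by the question.
raw = "alice#1234\nhello there\nsecond line"

# maxsplit=1 splits only at the first newline, so newlines inside the
# message body are preserved.
header, message = raw.split("\n", 1)
user = header.split("#", 1)[0]

# Original pattern: the '+' repeats the capture group, and a repeated group
# keeps only its last iteration, hence the single trailing character.
broken = re.split(r"([a-zA-Z]|[0-9])+#[0-9]+\n", raw)

# Fixed pattern: the '+' is inside the group, so the whole name is captured.
fixed = re.split(r"([a-zA-Z0-9]+)#[0-9]+\n", raw)
```

Here broken[1] is 'e' while fixed[1] is 'alice'. Note that the plain split version raises a ValueError on rows without a newline, so notification rows would need a separate check first.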