How to correctly tokenize a column in pandas?

Posted on 2025-01-25 15:58:27

I am trying to solve a tokenization problem in my dataset of social media comments. I want to tokenize, lemmatize, and remove punctuation and stop words from a pandas column. I am struggling to work out how to do this for each comment. I get the following error when trying to get the tokens:

import pandas as pd
import nltk
...
merged['message_tokens'] = merged.apply(lambda x: nltk.tokenize.word_tokenize(x['Clean_message']), axis=1)

TypeError: expected string or bytes-like object

When I try to tell pandas that I am passing it a string object, I get the following error message:

merged['message_tokens'] = merged.apply(lambda x: nltk.tokenize.word_tokenize(x['Clean_message'].str), axis=1)

AttributeError: 'str' object has no attribute 'str'

What am I doing wrong?

Comments (1)

泪眸﹌ 2025-02-01 15:58:28

You can use astype to force the column type to string:

merged['Clean_message'] = merged['Clean_message'].astype(str)

If you want to see what is wrong in the original column, you can use:

m = merged['Clean_message'].apply(type).ne(str)
out = merged[m]

The out DataFrame contains the rows where the value in the Clean_message column is not a string.
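
As an alternative to converting everything with astype, the non-string rows can also be skipped at tokenization time. The following is only a minimal sketch under assumptions: the sample data is made up, and `simple_tokenize` is a hypothetical whitespace tokenizer standing in for `nltk.tokenize.word_tokenize` so that the example runs without any NLTK data installed.

```python
import pandas as pd

# Hypothetical sample data: None and 42 stand in for the non-string
# values that presumably trigger the TypeError in the question.
merged = pd.DataFrame({"Clean_message": ["Hello world", None, 42, "nice post"]})

# Boolean mask of rows that hold a real Python str
# (same intent as the apply(type).ne(str) check, inverted).
is_str = merged["Clean_message"].map(lambda v: isinstance(v, str))
bad_rows = merged[~is_str]  # rows worth inspecting before cleaning


def simple_tokenize(text):
    """Stand-in tokenizer (assumption): plain whitespace split."""
    return text.split()


# Tokenize only real strings; every other row gets an empty token list.
merged["message_tokens"] = merged["Clean_message"].apply(
    lambda v: simple_tokenize(v) if isinstance(v, str) else []
)
```

In real use, `nltk.tokenize.word_tokenize` could be passed in place of `simple_tokenize`. Note also that applying the function to the column itself, rather than to the whole frame with axis=1, hands each cell to the function directly. The second error in the question comes from the fact that `.str` is an accessor on a Series, while inside the row-wise apply `x['Clean_message']` is already a single Python value.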
