Extract values into a new column of lists for each unique value in another column

Posted on 2025-02-08 08:02:23

I have a dataframe and a sample of it looks like this

review_id   ngram   date    rating          attraction   indo
4           bigram  2021        10          uss          sangat lengkap
359         bigram  2019        10          uss          sangat lengkap
911         bigram  2018        10          uss          sangat lengkap
977         bigram  2018        10          uss          sangat lengkap
1062        bigram  2019        10          uss          agak bingung
2919        bigram  2019        9           uss          agak bingung
3531        bigram  2018        10          uss          sangat lengkap
4282        bigram  2019        10          sea_aquarium sangat lengkap


I would like to extract the review_id values into a list for each unique word in the indo column, so that each word maps to all of the review_ids in which it appears.

I tried the following code, but it does not work: it returns the review_id of every row whose count is more than one, regardless of whether those rows share the same word in the indo column.

df_sentiment['count'] = df_sentiment['indo'].value_counts()

# The function is called with an argument below, so it needs to accept one
def get_all_review_id(word):
    all_review_id = []
    for i in range(len(df_sentiment)):
        if df_sentiment['count'][i] > 1:
            all_review_id.append(df_sentiment['review_id'][i])
    return all_review_id

# progress_apply requires tqdm: from tqdm import tqdm; tqdm.pandas()
df_sentiment["all_review_id"] = df_sentiment['indo'].progress_apply(lambda x: get_all_review_id(x))

Any suggestions / code I can use?
Thank you!


月隐月明月朦胧 2025-02-15 08:02:24


If you share the data, I can reproduce it and add the result.

This will hopefully answer your question:

df.groupby(['ngram','date','rating','attraction','indo'])['review_id'].agg(list).reset_index()
    ngram   date    rating  attraction   indo               review_id
0   bigram  2018    10      uss          sangat lengkap     [911, 977, 3531]
1   bigram  2019    9       uss          agak bingung       [2919]
2   bigram  2019    10      sea_aquarium sangat lengkap     [4282]
3   bigram  2019    10      uss          agak bingung       [1062]
4   bigram  2019    10      uss          sangat lengkap     [359]
5   bigram  2021    10      uss          sangat lengkap     [4]
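
Since the question asks for one list of review_id values per word in the indo column (rather than per combination of all columns), a variant that groups on indo alone may be closer to the desired output; this is a minimal sketch assuming the same dataframe:

df.groupby('indo')['review_id'].agg(list).reset_index()
# With the sample shown above, this would give roughly:
#    indo            review_id
# 0  agak bingung    [1062, 2919]
# 1  sangat lengkap  [4, 359, 911, 977, 3531, 4282]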