Counting word frequency in sentences

Posted on 2025-01-25 01:27:53

I have two columns - one with sentences and the other with single words.

Sentence	word
"Such a day! It's a beautiful day out there"	"beautiful"
"Such a day! It's a beautiful day out there"	"day"
"I am sad by the sad weather"	"weather"
"I am sad by the sad weather"	"sad"

I want to count how many times each value in the "word" column occurs in the "Sentence" value of the same row,
and get this output:

Sentence	word	n
"Such a day! It's a beautiful day out there"	"beautiful"	1
"Such a day! It's a beautiful day out there"	"day"	2
"I am sad by the sad weather"	"weather"	1
"I am sad by the sad weather"	"sad"	2

I tried:

ok = []
for l in [x.split() for x in df['Sentence']]:
    for y in df['word']:
        ok.append(l.count(y))

However, it never finishes: it runs for a very long time, so it is not feasible for my actual dataset, which has 50k rows.

Can anyone help me achieve this?

我三岁 2025-02-01 01:27:53

You can do this with zip, which pairs each sentence with the word from the same row:

df['new'] = [x.count(y) for x, y in zip(df.Sentence,df.word)]
df
Out[419]: 
                                     Sentence       word  new
0  Such a day! It's a beautiful day out there  beautiful    1
1  Such a day! It's a beautiful day out there        day    2
2                 I am sad by the sad weather    weather    1
3                 I am sad by the sad weather        sad    2
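
To see why pairing the rows helps, it may be useful to compare it with the nested loop from the question. The nested loop counts every word in every sentence, so n rows produce n × n counts (2.5 billion for 50,000 rows), whereas zip visits each row once. Below is a minimal sketch on plain lists standing in for the two columns; the names `sentences`, `words`, `nested`, and `paired` are illustrative and not part of the original answer.

```python
# Plain lists standing in for df['Sentence'] and df['word'] (illustrative data
# copied from the question).
sentences = [
    "Such a day! It's a beautiful day out there",
    "Such a day! It's a beautiful day out there",
    "I am sad by the sad weather",
    "I am sad by the sad weather",
]
words = ["beautiful", "day", "weather", "sad"]

# Nested loop as in the question: every sentence is checked against every word,
# so the result has len(sentences) * len(words) entries.
nested = []
for tokens in [s.split() for s in sentences]:
    for w in words:
        nested.append(tokens.count(w))

# Row-wise pairing: each sentence is checked only against its own word.
paired = [s.count(w) for s, w in zip(sentences, words)]

print(len(nested))  # 16
print(paired)       # [1, 2, 1, 2]
```

Note also that splitting on whitespace leaves punctuation attached to tokens (for example `day!`), so the list-based count in the question would miss some matches that the string-based `count` finds.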

痞味浪人 2025-02-01 01:27:53

Try using DataFrame.apply with axis=1, so the lambda receives one row at a time:

df['n'] = df.apply(lambda r: r['Sentence'].count(r['word']), axis=1)

Result:

                                     Sentence       word  n
0  Such a day! It's a beautiful day out there  beautiful  1
1  Such a day! It's a beautiful day out there        day  2
2                 I am sad by the sad weather    weather  1
3                 I am sad by the sad weather        sad  2

我为君王 2025-02-01 01:27:53

You can count the occurrences of a substring in a string using the code below:

# define string
string = "This is how you count same word of your defined string to another string using python"
substring = "string"

count = string.count(substring)

# print count
print(f"The count of the word {substring} is:", count)

Output:
The count of the word string is: 2
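
One caveat worth keeping in mind: `str.count` counts substring occurrences, so a search for `day` also matches inside `today`. If whole-word matches are what you need, one possible approach is a regular expression with word boundaries. The sketch below assumes the search term consists of ordinary word characters; the helper name `count_whole_word` is hypothetical and not from the original answer.

```python
import re

def count_whole_word(text, word):
    """Count occurrences of `word` in `text` that are delimited by word boundaries."""
    # re.escape guards against regex metacharacters in the search term.
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, text))

text = "today is a good day"
print(text.count("day"))              # 2 (also matches inside "today")
print(count_whole_word(text, "day"))  # 1 (only the standalone word)
```

The same helper could be applied row by row to two columns, for example inside a list comprehension over `zip`.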
