Counting word frequency in sentences

Posted on 2025-01-25 01:27:53

I have two columns - one with sentences and the other with single words.

Sentence                                      word
"Such a day! It's a beautiful day out there"  "beautiful"
"Such a day! It's a beautiful day out there"  "day"
"I am sad by the sad weather"                 "weather"
"I am sad by the sad weather"                 "sad"

For each row, I want to count how many times the value in the "word" column occurs in the "Sentence" column, and get this output:

Sentence                                      word         n
"Such a day! It's a beautiful day out there"  "beautiful"  1
"Such a day! It's a beautiful day out there"  "day"        2
"I am sad by the sad weather"                 "weather"    1
"I am sad by the sad weather"                 "sad"        2

I tried:

ok = []
for l in [x.split() for x in df['Sentence']]:
    for y in df['word']:
        ok.append(l.count(y))

However, it does not finish running: it takes a very long time, so it is not feasible for my actual dataset, which has 50k rows.

Can anyone help me achieve this?

Comments (3)

我三岁 2025-02-01 01:27:53

You can do it with zip, which pairs each sentence with the word from the same row:

df['new'] = [x.count(y) for x, y in zip(df.Sentence,df.word)]
df
Out[419]: 
                                     Sentence       word  new
0  Such a day! It's a beautiful day out there  beautiful    1
1  Such a day! It's a beautiful day out there        day    2
2                 I am sad by the sad weather    weather    1
3                 I am sad by the sad weather        sad    2
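
To see why the attempt in the question is slow while the zip version is not, here is a minimal sketch in plain Python. It assumes the two columns are held as ordinary lists (the names `sentences`, `words`, `nested` and `paired` are only for illustration). The nested loops compare every sentence with every word, while zip does one count per row.

```python
sentences = [
    "Such a day! It's a beautiful day out there",
    "Such a day! It's a beautiful day out there",
    "I am sad by the sad weather",
    "I am sad by the sad weather",
]
words = ["beautiful", "day", "weather", "sad"]

# Nested loops, as in the question: every sentence is checked against
# every word, so the result has len(sentences) * len(words) entries
# and no longer lines up with the rows.
nested = []
for tokens in [s.split() for s in sentences]:
    for w in words:
        nested.append(tokens.count(w))

# zip: each sentence is paired only with the word from its own row,
# so there is exactly one count per row.
paired = [s.count(w) for s, w in zip(sentences, words)]

print(len(nested))  # 16
print(paired)       # [1, 2, 1, 2]
```

With 50k rows the nested version performs 50,000 × 50,000 = 2.5 billion counts, against 50,000 for the paired version. Note also that `split()` leaves punctuation attached to tokens ("day!"), so the list-based count in the question finds "day" only once in the first sentence.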
痞味浪人 2025-02-01 01:27:53

Try using pandas.apply:

df['n'] = df.apply(lambda r: r['Sentence'].count(r['word']), axis=1)

Result:

                                     Sentence       word  n
0  Such a day! It's a beautiful day out there  beautiful  1
1  Such a day! It's a beautiful day out there        day  2
2                 I am sad by the sad weather    weather  1
3                 I am sad by the sad weather        sad  2
我为君王 2025-02-01 01:27:53

You can count the occurrences of a substring in a string using the code below:

# define string
string = "This is how you count same word of your defined string to another string using python"
substring = "string"

count = string.count(substring)

# print count
print(f"The count of the word {substring} is:", count)

Output:
The count of the word string is: 2
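
One caveat: `str.count` counts substrings, not whole words, so "day" is also found inside "Today". If whole-word matches are wanted, a regular expression with word boundaries is one option. The sketch below is only an illustration: `count_whole_word` is a hypothetical helper, not part of pandas or the standard library, and the two example sentences are made up to show the difference.

```python
import re

def count_whole_word(sentence: str, word: str) -> int:
    """Count occurrences of `word` in `sentence` as a whole word."""
    # re.escape protects against regex metacharacters in the word;
    # \b matches a word boundary, so "day" does not match inside "Today".
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, sentence))

# Hypothetical rows chosen so the two approaches disagree.
sentences = ["Today is a good day", "I am saddened by the sad weather"]
words = ["day", "sad"]

substring_counts = [s.count(w) for s, w in zip(sentences, words)]
whole_word_counts = [count_whole_word(s, w) for s, w in zip(sentences, words)]

print(substring_counts)   # [2, 2]
print(whole_word_counts)  # [1, 1]
```

On the DataFrame from the question this could be applied row by row in the same way as the zip answer, for example `df['n'] = [count_whole_word(s, w) for s, w in zip(df['Sentence'], df['word'])]`. Matching here is case-sensitive, and `\b` assumes the word starts and ends with a letter, digit or underscore.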
