WordCloud visualising the most common terms from a pandas dataframe
I have a subset of a pandas dataframe like so:
df = pd.DataFrame({'viz_words': ['palace boat painting reel',
                                 'paintings painting gallery',
                                 'biscuits cake gallery cafe']})
>>> df
                    viz_words
0   palace boat painting reel
1  paintings painting gallery
2  biscuits cake gallery cafe
I am trying to take all the words from the viz_words column and create a wordcloud that draws higher-frequency terms larger and lower-frequency terms smaller.
I have this so far, but the issue is that it only visualises the "common" terms among viz_words, which means it misses a lot of terms.
import matplotlib.pyplot as plt
import pandas as pd
from wordcloud import WordCloud

# build a {word: count} dict over every word in the column
dictionary_small_scale = dict([tuple(x) for x in df.viz_words.str.split(expand=True)
                               .stack().value_counts().reset_index().values])

# generate wordcloud and plot using matplotlib
wordcloud = WordCloud()
wordcloud.generate_from_frequencies(dictionary_small_scale)
plt.figure()
plt.imshow(wordcloud, interpolation="bilinear")
plt.axis("off")
plt.show()
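For reference, the frequency dict itself can be built more directly with pandas alone. A minimal sketch (variable name `frequencies` is my own; `Series.explode` requires pandas >= 0.25), which confirms every word in the column is counted, not just the shared ones:

```python
import pandas as pd

df = pd.DataFrame({'viz_words': ['palace boat painting reel',
                                 'paintings painting gallery',
                                 'biscuits cake gallery cafe']})

# split each row into words, flatten to one word per element, then count
frequencies = df['viz_words'].str.split().explode().value_counts().to_dict()
# 'painting' and 'gallery' each appear twice; the other seven words once
```

The resulting dict can then be passed to `WordCloud().generate_from_frequencies(frequencies)` as in the code above; note that WordCloud also caps the number of drawn words via its `max_words` parameter (default 200), which may be worth checking if terms still go missing.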