Pandas scatter plot filtered by multiple row conditions and x, y column values
Thank you for your ideas. I have been trying to make scatter plots with a loop that filters rows on two conditions (site and hour) and then uses two columns of the matching rows as the x and y values. My data looks like this:
site_name  power_1  wind_speed  month  year  day  hour  power_2
A          50       5.5         1      2021  2    5     60
A          75       5.9         2      2021  8    17    70
A          40       7.3         5      2021  11   20    85
B          80       8.1         4      2021  1    4     90
B          84       8.2         7      2021  18   5     92
B          46       6.1         10     2021  23   11    41
I am trying to plot each site in a separate scatter plot with x = wind_speed and y = power_1, and with each hour in a different color. For the sample above, that means 2 scatter plots (A and B) of wind speed against power, each with 3 differently colored points. I hope this makes sense.
I have tried a 2-loop structure: an outer loop over the sites (A, B) and an inner loop over the hours that sets the color of the x, y points.
My actual code, which runs on a much larger dataset than the sample above, resembles the following, and it gives me a blank plot:
#PLOT ALL HOURS OF THE MONTHS/YEARS - WIND SPEED vs POWER
sites = (dfc1.plant_name.unique())
sites = sites.tolist()
import matplotlib.patches
from scipy.interpolate import interp1d
levels, categories = pd.factorize(dfc1.hour.unique())
colors = [plt.cm.Paired(k) for k in levels]
handles = [matplotlib.patches.Patch(color=plt.cm.Paired(k), label=c) for k, c in enumerate(categories)]
#fig, ax = plt.subplots(figsize=(10,4))
for i in range(len(sites)):
    #fig = plt.figure()
    for j in np.arange(0,24): #24 HOURS AND 1 COLOR FOR EACH UNIQUE HOUR
        x = dfc1.loc[dfc1['plant_name']==sites[i]].groupby(['hour']).wind_speed_ms
        y = dfc1.loc[dfc1['plant_name']==sites[i]].groupby(['hour']).power_kwh
        plt.scatter(x,y, edgecolors=colors[0:j],marker='o',facecolors='none')
    site = str(sites[i])
    plt.title(site + (' ') + str(dfc1.columns[5]) + (' ') + ('vs') + (' ') + str(dfc1.columns[3]) )
    plt.xlabel('Wind Speed'); plt.ylabel('Power')
    plt.legend(handles=handles, title="Month",loc='center left', bbox_to_anchor=(1,0.5),edgecolor='black')
    #plt.plot(mwsvar.iloc[-1,4], mpvar.iloc[-1,4], c='orange',linestyle=(0,()),marker="o",markersize=7)
    plt.legend()
    plt.show()
Comments (1)
I think you're very close. Here's a solution using matplotlib; it is somewhat long and unwieldy, but I think it is correct. I also show the same plot made with a different library, seaborn, which makes plots like this much easier.
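A minimal sketch of the matplotlib approach follows; it is an illustration under stated assumptions, not the answerer's original code. It assumes the column names from the sample table (site_name, wind_speed, power_1, hour) rather than the plant_name, wind_speed_ms and power_kwh columns in the question's real code. The helper name plot_sites and the choice of the "hsv" colormap are hypothetical. The main difference from the question's loop is that each scatter call receives the actual rows for one site and one hour, instead of a groupby object.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Sample data copied from the question (column names are those of the sample table).
df = pd.DataFrame({
    "site_name": ["A", "A", "A", "B", "B", "B"],
    "power_1": [50, 75, 40, 80, 84, 46],
    "wind_speed": [5.5, 5.9, 7.3, 8.1, 8.2, 6.1],
    "month": [1, 2, 5, 4, 7, 10],
    "year": [2021] * 6,
    "day": [2, 8, 11, 1, 18, 23],
    "hour": [5, 17, 20, 4, 5, 11],
    "power_2": [60, 70, 85, 90, 92, 41],
})


def plot_sites(data):
    """Draw one wind speed vs. power scatter plot per site, one color per hour.

    Hypothetical helper: returns a dict mapping site name to its Figure.
    """
    # Assign one fixed color per hour across all sites so legends are comparable.
    hours = sorted(data["hour"].unique())
    cmap = plt.get_cmap("hsv")  # assumption: any colormap with enough distinct colors works
    color_of = {h: cmap(i / len(hours)) for i, h in enumerate(hours)}

    figures = {}
    # Outer loop: one figure per site.
    for site, site_df in data.groupby("site_name"):
        fig, ax = plt.subplots(figsize=(10, 4))
        # Inner loop: one scatter call (and one legend entry) per hour at this site.
        for hour, hour_df in site_df.groupby("hour"):
            ax.scatter(
                hour_df["wind_speed"],
                hour_df["power_1"],
                facecolors="none",
                edgecolors=[color_of[hour]],
                marker="o",
                label=str(hour),
            )
        ax.set_title(f"{site}: power_1 vs wind_speed")
        ax.set_xlabel("Wind Speed")
        ax.set_ylabel("Power")
        ax.legend(title="Hour", loc="center left",
                  bbox_to_anchor=(1, 0.5), edgecolor="black")
        fig.tight_layout()
        figures[site] = fig
    return figures


figures = plot_sites(df)
```

Calling plt.show() afterwards displays both figures. Iterating over groupby results hands each scatter call a plain DataFrame slice, so no manual boolean indexing or range(24) loop is needed, and hours that never occur at a site are skipped.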