Categorize a pandas column based on another column
My data looks like this:
I'm using the following script to populate RP_8Recruise as either "Y" (NEAR_DIST <= 100 meters) or "N" (NEAR_DIST > 100 meters).
nrows = plots_dist_joined.shape[0]
for i in range(0, nrows):
    # for plots that are within wanted distance from disturbance harvest
    if (plots_dist_joined.iloc[i,9] < 100) | (plots_dist_joined.iloc[i,9] == 100):
        plots_dist_joined["RP_"+reporting_period+"Recruise"] = "Y"
        plots_dist_joined["RP_"+reporting_period+"RecrType"] = "PD"
    # for plots that are NOT within wanted distance from disturbance harvest
    else:
        plots_dist_joined["RP_"+reporting_period+"Recruise"] = "N"
        plots_dist_joined["RP_"+reporting_period+"RecrType"] = np.nan
This populates the entire RP_8Recruise column as "N" even though there are distances that are under 100 meters (IDs = 59197, 40, 84, 92, 132). I'm not sure what is wrong in the code.
Comments (1)
The problem with your code is that in each iteration a new value is assigned to the entire RP_8Recruise and RP_8RecrType columns, not just to row i. As a result, the final values of these columns are decided by the NEAR_DIST value in the last row. Instead of a for-loop, use the vectorized numpy.where() function to fill in the values.
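A minimal sketch of that approach follows. The sample rows are made up because the question's table is not shown, and the distance column is assumed to be named NEAR_DIST (the question only addresses it by position with iloc[i,9]); reporting_period is assumed to be the string "8". Note that None is used for the missing RecrType value rather than np.nan: mixing a string and np.nan inside numpy.where() would turn the NaN into the literal text 'nan'.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real table from the question is not available.
plots_dist_joined = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "NEAR_DIST": [35.0, 100.0, 250.5, 99.9],
})
reporting_period = "8"  # assumed to be a string, as the question's concatenation implies

recruise_col = "RP_" + reporting_period + "Recruise"
recr_type_col = "RP_" + reporting_period + "RecrType"

# Boolean mask computed for every row at once (assumes the column is named NEAR_DIST)
within = plots_dist_joined["NEAR_DIST"] <= 100

# np.where picks the second argument where the mask is True, else the third
plots_dist_joined[recruise_col] = np.where(within, "Y", "N")
# None keeps the result as a missing value instead of the string 'nan'
plots_dist_joined[recr_type_col] = np.where(within, "PD", None)

print(plots_dist_joined)
```

If a loop is really needed, assigning to a single cell, for example with plots_dist_joined.loc[row_label, recruise_col] = "Y", would avoid overwriting the whole column, but the vectorized version is shorter and usually much faster.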