Categorizing a pandas column based on another column

Posted on 2025-02-11 00:52:58

My data looks like this:
[screenshot of data]

I'm using the following script to populate the RP_8Recruise column as either "Y" (NEAR_DIST <= 100 meters) or "N" (NEAR_DIST > 100 meters).

nrows = plots_dist_joined.shape[0]

for i in range(0, nrows):
    
    # for plots that are within wanted distance from disturbance harvest 
    if (plots_dist_joined.iloc[i,9] < 100) | (plots_dist_joined.iloc[i,9] == 100):
        plots_dist_joined["RP_"+reporting_period+"Recruise"] = "Y"
        plots_dist_joined["RP_"+reporting_period+"RecrType"] = "PD"
    
    # for plots that are NOT within wanted distance from disturbance harvest 
    else:
        plots_dist_joined["RP_"+reporting_period+"Recruise"] = "N"
        plots_dist_joined["RP_"+reporting_period+"RecrType"] = np.nan

This populates the entire RP_8Recruise column as "N" even though there are distances that are under 100 meters (IDs = 59197, 40, 84, 92, 132). I'm not sure what is wrong in the code.

纵情客 2025-02-18 00:52:58

The problem with your code is that each iteration assigns a new value to the entire RP_8Recruise and RP_8RecrType columns, not just to row i. The final values of both columns are therefore decided by the NEAR_DIST value in the last row.
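
A minimal sketch of that behaviour, using a hypothetical three-row frame (the "flag" column name and the distances are made up for illustration; only NEAR_DIST comes from the question):

```python
import pandas as pd

# Hypothetical data: the first row is within 100 m, the last two are not.
df = pd.DataFrame({"NEAR_DIST": [50.0, 250.0, 400.0]})

for i in range(len(df)):
    if df["NEAR_DIST"].iloc[i] <= 100:
        # Assigning a scalar to a column broadcasts it to every row.
        df["flag"] = "Y"
    else:
        df["flag"] = "N"

# Every row ends up with the value chosen for the last row.
flags_after_loop = df["flag"].tolist()
print(flags_after_loop)
```

The first row qualifies for "Y", but that value is overwritten as soon as a later iteration assigns "N" to the whole column.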

Instead of a for loop, use the vectorized numpy.where() function to fill in the values:

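import numpy as np

# Boolean mask: True where the plot is within 100 m of the disturbance
is_near = plots_dist_joined["NEAR_DIST"] <= 100
# "Y" if near, otherwise "N"
plots_dist_joined["RP_8Recruise"] = np.where(is_near, "Y", "N")
# "PD" if near, otherwise NaN. np.where(is_near, "PD", np.nan) would
# coerce NaN to the string "nan", so map the mask instead.
plots_dist_joined["RP_8RecrType"] = is_near.map({True: "PD", False: np.nan})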
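
For a version that can be run without the original data, here is a rough sketch. The helper name flag_near_plots, the max_dist parameter, and the sample distances are assumptions for illustration. Only NEAR_DIST and the "RP_" + reporting_period column naming come from the question. It uses Series.where rather than np.where for the RecrType column so that missing values stay real NaN instead of being coerced to the string "nan".

```python
import numpy as np
import pandas as pd


def flag_near_plots(df, reporting_period, max_dist=100):
    """Return a copy of df with RP_<period>Recruise and RP_<period>RecrType added."""
    is_near = df["NEAR_DIST"] <= max_dist
    prefix = "RP_" + str(reporting_period)
    out = df.copy()
    # "Y" where the plot is within max_dist, "N" otherwise
    out[prefix + "Recruise"] = np.where(is_near, "Y", "N")
    # Keep "PD" where the mask is True; rows where it is False become NaN
    out[prefix + "RecrType"] = pd.Series("PD", index=df.index).where(is_near)
    return out


# Made-up distances, for illustration only
plots = pd.DataFrame({"NEAR_DIST": [12.5, 100.0, 340.2]})
result = flag_near_plots(plots, reporting_period="8")
print(result)
```

Building the column names from reporting_period keeps the same naming scheme as the loop in the question, so the helper can be reused for other reporting periods.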