Fill missing values from another column
The LotFrontage column is related to LotArea:
the values of LotFrontage lie between 0.005 and 0.01 times LotArea (that is, 0.5% to 1% of it).
Where LotFrontage is missing, I am trying to fill it with a random value between 0.005 and 0.01 times that row's LotArea.
Example: in the pic, the LotFrontage value at index 1019 is missing. Its LotArea is 8978, so I want to fill it with a random value between 8978 * 0.005 and 8978 * 0.01 (44.89 to 89.78).
Code (my attempt to solve this):
np.where(df_train[df_train["LotFrontage"].isnull()], np.random.rand(df_train['LotArea']*0.005, df_train["LotArea"]*0.01),df_train["LotFrontage"])
Error:
TypeError Traceback (most recent call last)
<ipython-input-46-49a940deebcd> in <module>()
----> 1 np.random.rand(df_train['LotArea'] *0.005,df_train["LotArea"] * 0.01)
mtrand.pyx in numpy.random.mtrand.RandomState.rand()
mtrand.pyx in numpy.random.mtrand.RandomState.random_sample()
_common.pyx in numpy.random._common.double_fill()
TypeError: 'Series' object cannot be interpreted as an integer
Comments (1)
How about this approach?
Then, we transform this:
To this:
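The snippets this answer refers to are not included above, so what follows is only a minimal sketch of one possible fix, not necessarily the answerer's code. The TypeError arises because np.random.rand expects integer array dimensions, not bounds. np.random.uniform (here via a Generator) accepts array-like low/high bounds and draws one value per row. The small df_train below is made-up illustration data; only the column names and the LotArea value 8978 come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; only the column names come from the question.
df_train = pd.DataFrame({
    "LotArea": [8450, 9600, 8978, 11250],
    "LotFrontage": [65.0, 80.0, np.nan, np.nan],
})

rng = np.random.default_rng(0)  # seeded only so the run is repeatable

# One random draw per row, between 0.005 * LotArea and 0.01 * LotArea.
low = df_train["LotArea"].to_numpy() * 0.005
high = df_train["LotArea"].to_numpy() * 0.01
random_fill = pd.Series(rng.uniform(low, high), index=df_train.index)

# Keep existing values; use the random draw only where LotFrontage is NaN.
df_train["LotFrontage"] = df_train["LotFrontage"].fillna(random_fill)
print(df_train)
```

Drawing a value for every row and letting fillna pick only the missing ones keeps the code short; for a very large frame you could instead draw only for the rows where LotFrontage is null.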