How do I remove rows from a pandas DataFrame in Python when a value in one column fails a condition that depends on another column?
I have a real estate DataFrame with many observations and many outliers.
The variables are: total area, number of rooms (rooms = 0 means a studio apartment), and kitchen_area.
A minimal extract from my DataFrame:
import pandas as pd

# rooms == 0 marks a studio apartment
dic = [{'area': 40, 'kitchen_area': 10, 'rooms': 1, 'price': 50000},
       {'area': 20, 'kitchen_area': 0, 'rooms': 0, 'price': 50000},
       {'area': 60, 'kitchen_area': 0, 'rooms': 2, 'price': 70000},
       {'area': 29, 'kitchen_area': 9, 'rooms': 1, 'price': 30000},
       {'area': 15, 'kitchen_area': 0, 'rooms': 0, 'price': 25000}]
df = pd.DataFrame(dic, index=['apt1', 'apt2', 'apt3', 'apt4', 'apt5'])
My goal is to eliminate apt3, because by law the kitchen area cannot be smaller than 5 square meters in non-studio apartments.
In other words, I would like to drop every row that describes a non-studio apartment (rooms > 0) with kitchen_area < 5.
I have tried code like this:
df1 = df.drop(df[(df.rooms > 0) & (df.kitchen_area < 5)].index)
But it just eliminated all the data from both the kitchen_area and rooms columns according to the conditions I gave.
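For reference, the attempted drop can be checked against the sample frame. The sketch below rebuilds the five sample rows and applies the same condition; the names mask and df2 are introduced here only for illustration. On this sample the call appears to remove just apt3, so a different result on the full data may come from something not visible in the extract. One thing worth checking: comparisons against NaN evaluate to False, so rows with a missing kitchen_area would not be caught by this mask.

```python
import pandas as pd

dic = [{'area': 40, 'kitchen_area': 10, 'rooms': 1, 'price': 50000},
       {'area': 20, 'kitchen_area': 0, 'rooms': 0, 'price': 50000},
       {'area': 60, 'kitchen_area': 0, 'rooms': 2, 'price': 70000},
       {'area': 29, 'kitchen_area': 9, 'rooms': 1, 'price': 30000},
       {'area': 15, 'kitchen_area': 0, 'rooms': 0, 'price': 25000}]
df = pd.DataFrame(dic, index=['apt1', 'apt2', 'apt3', 'apt4', 'apt5'])

# True only for non-studio apartments whose kitchen is under 5 square meters
mask = (df.rooms > 0) & (df.kitchen_area < 5)

# The attempt from the question: drop the index labels selected by the mask
df1 = df.drop(df[mask].index)

# Equivalent form (df2 is a hypothetical name): keep rows where the mask is False
df2 = df[~mask]

print(df1.index.tolist())  # ['apt1', 'apt2', 'apt4', 'apt5']
```

Both forms return whole rows, so every column of the surviving apartments is kept.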
Comments (1)
A clean way is pd.DataFrame.query.
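The comment names pd.DataFrame.query but the code it presumably included is not preserved here. As a minimal sketch of what such a call might look like, assuming the goal is to keep studios plus any apartment whose kitchen is at least 5 square meters (the name df_clean is hypothetical):

```python
import pandas as pd

dic = [{'area': 40, 'kitchen_area': 10, 'rooms': 1, 'price': 50000},
       {'area': 20, 'kitchen_area': 0, 'rooms': 0, 'price': 50000},
       {'area': 60, 'kitchen_area': 0, 'rooms': 2, 'price': 70000},
       {'area': 29, 'kitchen_area': 9, 'rooms': 1, 'price': 30000},
       {'area': 15, 'kitchen_area': 0, 'rooms': 0, 'price': 25000}]
df = pd.DataFrame(dic, index=['apt1', 'apt2', 'apt3', 'apt4', 'apt5'])

# Keep a row if it is a studio (rooms == 0) or its kitchen is at least 5 square meters.
# This is the logical complement of "rooms > 0 and kitchen_area < 5" for non-negative room counts.
df_clean = df.query("rooms == 0 or kitchen_area >= 5")

print(df_clean.index.tolist())  # ['apt1', 'apt2', 'apt4', 'apt5']
```

Writing the condition as what to keep, rather than what to drop, avoids the intermediate index lookup and reads close to the legal rule itself.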