How to drop part of the values from one column by condition from another column in Python, Pandas?

Posted on 2025-01-18 05:17:17

I have a real estate dataframe with many outliers and many observations.
The variables are: total area, number of rooms (rooms = 0 means a studio apartment) and kitchen_area.

A minimal extract from my dataframe:

import pandas as pd

dic = [{'area': 40, 'kitchen_area': 10, 'rooms': 1, 'price': 50000},
       {'area': 20, 'kitchen_area': 0, 'rooms': 0, 'price': 50000},
       {'area': 60, 'kitchen_area': 0, 'rooms': 2, 'price': 70000},
       {'area': 29, 'kitchen_area': 9, 'rooms': 1, 'price': 30000},
       {'area': 15, 'kitchen_area': 0, 'rooms': 0, 'price': 25000}]
df = pd.DataFrame(dic, index=['apt1', 'apt2', 'apt3', 'apt4', 'apt5'])

My goal is to eliminate apt3, because by law the kitchen area cannot be smaller than 5 square meters in non-studio apartments.
In other words, I would like to remove every row that describes a non-studio apartment (rooms > 0) with kitchen_area < 5.

I have tried code like this:

df1 = df.drop(df[(df.rooms > 0) & (df.kitchen_area < 5)].index)

But it just eliminated all data from both the kitchen_area and rooms columns according to the conditions I specified.


深海夜未眠 2025-01-25 05:17:17

Clean approach, using boolean masks:

mask1 = df.rooms > 0
mask2 = df.kitchen_area < 5

df1 = df[~(mask1 & mask2)]
df1

      area  kitchen_area  rooms  price
apt1    40            10      1  50000
apt2    20             0      0  50000
apt4    29             9      1  30000
apt5    15             0      0  25000

pd.DataFrame.query

df1 = df.query('rooms == 0 | kitchen_area >= 5')
df1

      area  kitchen_area  rooms  price
apt1    40            10      1  50000
apt2    20             0      0  50000
apt4    29             9      1  30000
apt5    15             0      0  25000
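
The two approaches above are the same filter written two ways: keeping rows where "not (rooms > 0 and kitchen_area < 5)" holds is, by De Morgan's law, the same as keeping rows where "rooms <= 0 or kitchen_area >= 5" holds. A minimal sketch that checks this on the sample data from the question follows. The variable names bad, by_mask, by_query and by_drop are only illustrative, and the sketch assumes rooms is never negative or missing. It also runs the drop call from the question for comparison; on this sample it appears to return the same four rows, so the behaviour described in the question may come from something specific to the full dataset.

```python
import pandas as pd

# Sample data copied from the question.
dic = [{'area': 40, 'kitchen_area': 10, 'rooms': 1, 'price': 50000},
       {'area': 20, 'kitchen_area': 0, 'rooms': 0, 'price': 50000},
       {'area': 60, 'kitchen_area': 0, 'rooms': 2, 'price': 70000},
       {'area': 29, 'kitchen_area': 9, 'rooms': 1, 'price': 30000},
       {'area': 15, 'kitchen_area': 0, 'rooms': 0, 'price': 25000}]
df = pd.DataFrame(dic, index=['apt1', 'apt2', 'apt3', 'apt4', 'apt5'])

# Rows to discard: non-studio apartments with a kitchen under 5 square meters.
bad = (df['rooms'] > 0) & (df['kitchen_area'] < 5)

by_mask = df[~bad]                                          # boolean mask
by_query = df.query('(rooms == 0) | (kitchen_area >= 5)')   # query string
by_drop = df.drop(df[bad].index)                            # call from the question

print(list(by_mask.index))       # ['apt1', 'apt2', 'apt4', 'apt5']
print(by_mask.equals(by_query))  # True
print(by_mask.equals(by_drop))   # True
```

Each of the three calls returns a new dataframe and leaves df unchanged, so the result has to be assigned (as with df1 above) to be kept.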