How to use multiple conditions, including a quantile selection, in Python

Posted 2025-01-31 02:03:19

Imagine the following dataset df:

Row	Population_density	Distance
1	400	50
2	500	30
3	300	40
4	200	120
5	500	60
6	1000	50
7	3300	30
8	500	90
9	700	100
10	1000	110
11	900	200
12	850	30

How can I create a new dummy column that is 1 when the value of df['Population_density'] is above the third quartile (the 75th percentile) AND df['Distance'] is < 100, and 0 for all other rows? With this data, rows 6 and 7 should get a 1 and every other row a 0.

Creating a dummy variable from a single criterion is fairly easy. For instance, the following creates a dummy variable that is 1 when Distance is < 100 and 0 otherwise: df['Distance_Below_100'] = np.where(df['Distance'] < 100, 1, 0). However, I do not know how to combine conditions when one of them involves a quantile selection (in this case, the upper 25% of the variable Population_density).

import pandas as pd

# Build the data as a dict of lists.
data = {'Row': range(1, 13), 'Population_density': [400, 500, 300, 200, 500, 1000, 3300, 500, 700, 1000, 900, 850],
        'Distance': [50, 30, 40, 120, 60, 50, 30, 90, 100, 110, 200, 30]}

# Create the DataFrame.
df = pd.DataFrame(data)
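
For reference, here is a small sketch of the single-criterion dummy described above, together with a check of where the 75% cut-off falls for this data. The variable name `threshold` is introduced here for illustration only and is not part of the original post.

```python
import numpy as np
import pandas as pd

data = {'Row': range(1, 13),
        'Population_density': [400, 500, 300, 200, 500, 1000, 3300, 500, 700, 1000, 900, 850],
        'Distance': [50, 30, 40, 120, 60, 50, 30, 90, 100, 110, 200, 30]}
df = pd.DataFrame(data)

# Single-criterion dummy from the question: 1 where Distance < 100, else 0.
df['Distance_Below_100'] = np.where(df['Distance'] < 100, 1, 0)

# 75th percentile of Population_density (hypothetical helper variable).
# pandas interpolates linearly between neighbouring values by default.
threshold = df['Population_density'].quantile(0.75)
print(threshold)  # 925.0
```

Only values strictly above 925 count as the upper quartile here, which is why the densities of 1000 and 3300 qualify while 900 does not.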

酒绊 2025-02-07 02:03:19

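You can join boolean conditions with & (element-wise AND) or | (element-wise OR). Here the quantile condition and the distance condition are combined with &: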

import numpy as np

df['Distance_Below_100'] = np.where(df['Population_density'].gt(df['Population_density'].quantile(0.75)) & df['Distance'].lt(100), 1, 0)

print(df)

    Row  Population_density  Distance  Distance_Below_100
0     1                 400        50                   0
1     2                 500        30                   0
2     3                 300        40                   0
3     4                 200       120                   0
4     5                 500        60                   0
5     6                1000        50                   1
6     7                3300        30                   1
7     8                 500        90                   0
8     9                 700       100                   0
9    10                1000       110                   0
10   11                 900       200                   0
11   12                 850        30                   0
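
The same result can be written with comparison operators instead of .gt() and .lt(). The following is a minimal sketch, assuming the df from the question. Each comparison needs its own parentheses because & binds more tightly than > and <. The names `threshold` and `Dense_And_Near` are illustrative and do not come from the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'Row': range(1, 13),
    'Population_density': [400, 500, 300, 200, 500, 1000, 3300, 500, 700, 1000, 900, 850],
    'Distance': [50, 30, 40, 120, 60, 50, 30, 90, 100, 110, 200, 30],
})

# Cut-off for the upper 25% of Population_density.
threshold = df['Population_density'].quantile(0.75)

# Combine both boolean masks element-wise, then convert True/False to 1/0.
df['Dense_And_Near'] = (
    (df['Population_density'] > threshold) & (df['Distance'] < 100)
).astype(int)

# Rows flagged with 1.
print(df.loc[df['Dense_And_Near'] == 1, 'Row'].tolist())  # [6, 7]
```

Using .astype(int) on the combined mask is an alternative to np.where(..., 1, 0) when the two outcomes are simply 1 and 0.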

悲凉≈ 2025-02-07 02:03:19

To run a function over a DataFrame, I recommend using apply with a lambda.

For example, suppose this is your function:

def myFunction(value):
    pass  # placeholder: return the value for the new column here

To create a new column 'new_column', where pick_cell is the column whose value you want to pass to the function (axis=1 makes apply work row by row):

df['new_column'] = df.apply(lambda x: myFunction(x.pick_cell), axis=1)
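
Applied to the question's data, the row-wise approach might look like the sketch below. The function name `flag_row` and the column name `Dummy` are hypothetical and chosen only for this example. For larger frames, a vectorized expression such as np.where is usually faster than a row-wise apply.

```python
import pandas as pd

df = pd.DataFrame({
    'Row': range(1, 13),
    'Population_density': [400, 500, 300, 200, 500, 1000, 3300, 500, 700, 1000, 900, 850],
    'Distance': [50, 30, 40, 120, 60, 50, 30, 90, 100, 110, 200, 30],
})

# Compute the cut-off once, outside the row function.
threshold = df['Population_density'].quantile(0.75)

def flag_row(row, threshold):
    """Return 1 if the row is above the density cut-off and closer than 100, else 0."""
    return int(row['Population_density'] > threshold and row['Distance'] < 100)

# axis=1 passes each row (as a Series) to the lambda.
df['Dummy'] = df.apply(lambda row: flag_row(row, threshold), axis=1)
print(df['Dummy'].tolist())  # [0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0]
```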

About the author: 高速公鹿
