Get values and column names based on row values - columns with ranges
I have this dataframe:
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'R': {0: '01', 1: '02', 2: '03', 3: '04', 4: '05', 5: '06', 6: '07'},
    'name': {0: 'b', 1: 'm', 2: '', 3: '', 4: 'b', 5: 'mi,b,m,c', 6: 'mi,e,w,c'},
    'value': {0: ['5.01e-13'], 1: ['9.74e-32'], 2: np.nan, 3: np.nan, 4: ['8.58e-09'],
              5: ['1.04e-01', '1.18e-01', '7.19e-08', '1.06e-01'],
              6: ['2.64e-01', '3.05e-01', '1.77e-01', '2.28e-01']},
})
which yields:
   R   name      value
0  01  b         [5.01e-13]
1  02  m         [9.74e-32]
2  03            NaN
3  04            NaN
4  05  b         [8.58e-09]
5  06  mi,b,m,c  [1.04e-01, 1.18e-01, 7.19e-08, 1.06e-01]
6  07  mi,e,w,c  [2.64e-01, 3.05e-01, 1.77e-01, 2.28e-01]
I need two new columns:
df['name2'] holds the names from df['name'] whose corresponding entry in df['value'] is < 0.05
df['value2'] holds the values from df['value'] that are < 0.05
The following is the desired output:
   R   name      value                                     name2  value2
0  01  b         [5.01e-13]                                b      [5.01e-13]
1  02  m         [9.74e-32]                                m      [9.74e-32]
2  03            NaN
3  04            NaN
4  05  b         [8.58e-09]                                b      [8.58e-09]
5  06  mi,b,m,c  [1.04e-01, 1.18e-01, 7.19e-08, 1.06e-01]  m      [7.19e-08]
6  07  mi,e,w,c  [2.64e-01, 3.05e-01, 1.77e-01, 2.28e-01]
I tried several options such as
df['name2']=np.where[(df['value']<0.05), df['name'],'']
or code adapted from this answer, but unfortunately none of them worked.
Comments (2)
Pandas approach: split the name column, explode both columns, then filter the rows where value is < .05, group the filtered rows by level=0 and aggregate using join.
Note: It is generally not advisable to store complex data types (such as lists or dicts) in dataframes unless you have a very strong reason. Doing so hurts performance badly.
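The code that accompanied this answer is not preserved here, so the following is only a sketch of one way to read the split, explode, filter, group and join steps. The intermediate names (`names`, `raw_values`, `mask`) and the choice to explode each column as a separate Series are assumptions, as is filling rows without a match with an empty string to mirror the blank cells in the desired output.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "R": ["01", "02", "03", "04", "05", "06", "07"],
    "name": ["b", "m", "", "", "b", "mi,b,m,c", "mi,e,w,c"],
    "value": [
        ["5.01e-13"], ["9.74e-32"], np.nan, np.nan, ["8.58e-09"],
        ["1.04e-01", "1.18e-01", "7.19e-08", "1.06e-01"],
        ["2.64e-01", "3.05e-01", "1.77e-01", "2.28e-01"],
    ],
})

# Split the comma-separated names, then explode so each name gets its own row.
# The original row label is repeated in the index of the exploded Series.
names = df["name"].str.split(",").explode()

# Explode the value lists the same way; NaN rows stay as a single NaN entry.
raw_values = df["value"].explode()

# The values are stored as strings, so convert before comparing.
# NaN compares as False, so rows without values are dropped by the mask.
mask = (pd.to_numeric(raw_values, errors="coerce") < 0.05).to_numpy()

# Group the surviving entries by the original row label (index level 0)
# and aggregate: join the names, collect the values back into a list.
name2 = names[mask].groupby(level=0).agg(",".join)
value2 = raw_values[mask].groupby(level=0).agg(list)

# Rows with no match get an empty string (assumption, see lead-in).
df["name2"] = name2.reindex(df.index, fill_value="")
df["value2"] = value2.reindex(df.index, fill_value="")

print(df)
```

Exploding the two columns separately keeps the sketch independent of multi-column `DataFrame.explode`, at the cost of relying on both exploded Series having the same row order, which holds here because every name list has the same length as its value list.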
First, convert the name column from a string to a list of strings by splitting on the , character. Then you can simply apply another lambda function to get the desired output for the name2 column.
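The snippets from this answer did not survive on this page, so here is a hedged sketch of what the two steps might look like. The helper `below`, the temporary `name_list` column and the `THRESHOLD` constant are hypothetical names introduced for illustration. The `value2` line goes beyond what the answer describes (it only mentions `name2`) and is added to match the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "R": ["01", "02", "03", "04", "05", "06", "07"],
    "name": ["b", "m", "", "", "b", "mi,b,m,c", "mi,e,w,c"],
    "value": [
        ["5.01e-13"], ["9.74e-32"], np.nan, np.nan, ["8.58e-09"],
        ["1.04e-01", "1.18e-01", "7.19e-08", "1.06e-01"],
        ["2.64e-01", "3.05e-01", "1.77e-01", "2.28e-01"],
    ],
})

THRESHOLD = 0.05  # cutoff taken from the question

# Step 1: turn each comma-separated string into a list of names.
df["name_list"] = df["name"].apply(lambda s: s.split(","))


def below(row):
    """Return the (name, value) pairs of one row whose value is under THRESHOLD."""
    if not isinstance(row["value"], list):  # NaN rows have nothing to keep
        return []
    return [
        (n, v)
        for n, v in zip(row["name_list"], row["value"])
        if float(v) < THRESHOLD  # values are stored as strings
    ]


# Step 2: apply a lambda row by row to build name2.
df["name2"] = df.apply(lambda row: ",".join(n for n, _ in below(row)), axis=1)

# Same idea for value2; rows with no match get "" instead of an empty list.
df["value2"] = [[v for _, v in below(row)] or "" for _, row in df.iterrows()]

df = df.drop(columns="name_list")  # the helper column is no longer needed
print(df)
```

Row-wise `apply` is easy to follow but runs Python code for every row, so for large frames a vectorised approach is likely to be faster.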