I have a data set as below:
ID A1 A2
0 A123 1234
1 1234 5568
2 5568 NaN
3 Zabc NaN
4 3456 3456
5 3456 3456
6 NaN NaN
7 NaN NaN
The intention is to go through columns A1 and A2, identify the rows where both columns are blank (as in rows 6 and 7), and create a new column that categorises them as "Both A1 and A2 are blank".
I used the below code:
df['Z_Tax No Not Mapped'] = np.NaN
df['Z_Tax No Not Mapped'] = np.where((df['A1'] == np.NaN) & (df['A2'] == np.NaN), 1, 0)
However, the output shows 0 for every row in the new column 'Z_Tax No Not Mapped', even though the data has rows where both columns are blank. I'm not sure where I'm making a mistake in filtering such cases.
Note: Columns A1 and A2 are sometimes alphanumeric or just numeric.
The idea is to place a category in a separate column, either "IDs are not updated" or "IDs are updated", so that a simple filter on "IDs are not updated" identifies the rows that are blank in both columns.
Comments (2)
Use DataFrame.isna with DataFrame.all to test whether all of the columns are True in a row, i.e. whether every value is missing.
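The code for this answer did not survive on the page, so here is a minimal sketch of what the isna/all approach could look like on the sample data from the question. The comparison `df['A1'] == np.NaN` is always False because NaN never compares equal to anything, including itself, which is why every row came out as 0. The `ID status` column name is only an illustrative choice, and the IDs are assumed to be stored as strings.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the question; IDs are assumed to be strings because
# the columns mix alphanumeric and purely numeric values.
df = pd.DataFrame({
    "A1": ["A123", "1234", "5568", "Zabc", "3456", "3456", np.nan, np.nan],
    "A2": ["1234", "5568", np.nan, np.nan, "3456", "3456", np.nan, np.nan],
})

# isna() marks missing cells; all(axis=1) is True only where every
# selected column is missing in that row.
both_blank = df[["A1", "A2"]].isna().all(axis=1)

# 1/0 flag, as in the original attempt.
df["Z_Tax No Not Mapped"] = both_blank.astype(int)

# Text category for filtering ("ID status" is a hypothetical column name).
df["ID status"] = np.where(both_blank, "IDs are not updated", "IDs are updated")

print(df)
```

Filtering with `df[df["ID status"] == "IDs are not updated"]` should then return only the rows where both A1 and A2 are blank.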