How do I return the column elements of a DataFrame/Series that are less than a specific number?

Posted on 2025-01-12 15:12:19

I have a DataFrame with 2 columns, and I'm trying to get the values that are less than 5. I want pandas to return the values themselves, but all I get back are booleans.

   a  b
0  1  4
1  2  5
2  3  6

import pandas as pd

data = pd.read_csv('test.csv')
answer = data < 5  # element-wise comparison; the result is a boolean DataFrame
print(answer)

The result that I got:

       a     b
0      True  True
1      True  False
2      True  False

The result that I want:

1 2 3 4

I can't seem to find any function in pandas or numpy that does this. I also tried accessing the values that are less than 5 one column at a time, but that still returns booleans:

a = data["a"]
b = data["b"]
answer_column_a = a < 5
answer_column_b = b < 5
print(answer_column_a)
print(answer_column_b)

The result that I got:

0    True
1    True
2    True
Name: a, dtype: bool

0    True
1    False
2    False
Name: b, dtype: bool

The result that I want:

1 2 3
4

The only pandas functions I know that deal with values from a specific column are loc and iloc, but neither seems to be able to handle conditionals. Is there a function that can do this? I only know numpy and pandas so far, so I can't tell whether another Python package has a built-in function for it. In plain Python, I understand you can get the values matching a condition with code like for i in a or for i in b, but I don't know how to do that with pandas.

Comments (1)

唯憾梦倾城 2025-01-19 15:12:19

Use the underlying NumPy array, flattened in column-major order:

a = df.to_numpy().ravel('F')
out = a[a<5]

Output: array([1, 2, 3, 4])
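
For anyone wondering why the comparison alone returns booleans: data < 5 only builds a mask, and the values come out once that mask is used to index the data. Here is a minimal, self-contained sketch of that idea. It assumes the frame is built inline rather than read from test.csv, and the names mask, values and flat_mask are just illustrative:

```python
import pandas as pd

# Assumption: the example frame is built inline instead of read from test.csv
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Step 1: the comparison yields a boolean mask with the same shape as df
mask = df < 5

# Step 2: flatten both the values and the mask column by column ('F' order)
values = df.to_numpy().ravel("F")
flat_mask = mask.to_numpy().ravel("F")

# Step 3: index the values with the mask to keep only the matching elements
out = values[flat_mask]
print(out)  # [1 2 3 4]
```

This should give the same result as a[a<5] above; it only separates the masking step from the selection step.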

Or using stack:

s = df.T.stack()
out = s[s<5].to_list()

Output: [1, 2, 3, 4]

To get the result per column, you could do:

out = df.apply(lambda s: s[s<5].to_list()).to_list()

Output: [[1, 2, 3], [4]]
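
As a possible alternative for the per-column case, the sketch below keeps the column names alongside the results and also shows that loc does accept a boolean condition, which addresses the loc question above. It again assumes an inline frame, and per_column, a_small and b_small are hypothetical names:

```python
import pandas as pd

# Assumption: the example frame is built inline instead of read from test.csv
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# One list per column, keyed by column name
per_column = {name: col[col < 5].tolist() for name, col in df.items()}
print(per_column)  # {'a': [1, 2, 3], 'b': [4]}

# loc takes a boolean mask for the rows and a label for the column
a_small = df.loc[df["a"] < 5, "a"]
b_small = df.loc[df["b"] < 5, "b"]
print(a_small.tolist())  # [1, 2, 3]
print(b_small.tolist())  # [4]
```

The dict comprehension always produces one list per column, whether or not the filtered columns end up with the same length.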
