Selecting columns within a certain range in a dataframe

Posted on 2025-01-12 15:27:07

I am working with dataframes in Python. I have an original dataframe covering 10 days. I have split it into one dataframe per day and am trying to plot each one. Some columns (here y and z) contain strange values, so I am trying to use the 'between' method to restrict them to my range (0, 100). The code works, but I am getting a warning. Can anyone help me, please?

for df  in ((listofDF)):
    if len(df) != 0:
        f_df = df[df[' y'].between(0,100)]
        f_df = f_df[df[' z'].between(0,100)]
        maxTemp = f_df[' y']
        minTemp = f_df[' z']
        Time = f_df['x']
        plt.plot(x,y)
        plt.plot(x,z)

The warning I am getting is, UserWarning: Boolean Series key will be reindexed to match DataFrame index.
f_df = f_df[df[' y'].between(0,100)]

荆棘i 2025-01-19 15:27:07

TL;DR Solution

Change f_df = f_df[df[' z'].between(0, 100)] to f_df = f_df[f_df[' z'].between(0, 100)]
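Applied to the loop in the question, the change could look roughly like the sketch below. The helper name filter_days and the sample frames are made up for illustration; only the column names ' y', ' z' and 'x' (including the leading spaces) come from the question. Plotting is left out so the snippet runs without matplotlib.

```python
import pandas as pd


def filter_days(frames, lo=0, hi=100):
    """Return the non-empty frames, keeping rows where ' y' and ' z' lie in [lo, hi]."""
    filtered = []
    for df in frames:
        if df.empty:
            continue  # skip days without data, like the len(df) != 0 check
        f_df = df[df[" y"].between(lo, hi)]
        # Build the second mask from f_df itself, not from df
        f_df = f_df[f_df[" z"].between(lo, hi)]
        filtered.append(f_df)
    return filtered


# Hypothetical sample data: one day with values, one empty day
day1 = pd.DataFrame({"x": [1, 2, 3], " y": [10, 120, 30], " z": [5, 50, -4]})
day2 = pd.DataFrame({"x": [], " y": [], " z": []})

result = filter_days([day1, day2])
print(result[0])
```

Each filtered frame can then be plotted as before, for example with its 'x' column on the horizontal axis.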

The warning you are getting is because of this line:

f_df = f_df[df[' z'].between(0,100)]

There's an issue with this line, can you spot it?

You're using df to index f_df. The mask df[' z'].between(0, 100) marks the rows of df where column z is between 0 and 100; let's say those are rows 2 and 4 of df.

However, f_df is a different dataframe, and its rows could be completely different: the rows of f_df where z is between 0 and 100 might be rows 3 and 10. Because the mask was built from df but is applied to f_df, pandas warns that it will reindex the boolean Series to match f_df's index before deciding which rows to keep, which may not be what you want.

So if the filter on df returns rows 1 and 10, pandas will select rows 1 and 10 from f_df. Or, to be more accurate, it will select the rows with index labels 1 and 10, as long as those labels exist in f_df.
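To make the label-versus-position point concrete, here is a small sketch with made-up data and deliberately non-default index labels. The explicit reindex call is only an approximation of what pandas does internally when it emits the warning, not a quote of its implementation.

```python
import pandas as pd

# Hypothetical data with index labels 10..13 instead of 0..3
df = pd.DataFrame({"z": [5, 150, 20, 300]}, index=[10, 11, 12, 13])
f_df = df.loc[[11, 12, 13]]  # a subset that keeps the original labels

# Mask built from df: True for labels 10 and 12
mask = df["z"].between(0, 100)

# Align the mask to f_df's labels before using it; label 10 is dropped
# because f_df has no such row
aligned = mask.reindex(f_df.index)
result = f_df[aligned]
print(result)
```

Only the row labelled 12 survives, even though it is the second row of f_df by position. The selection follows the labels, not the row positions.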

In your case, that is what you want, because f_df keeps the index labels of df when it is created, as the labels on the left show when you print it. The warning itself can be reproduced with a small example:

>>> import pandas as pd
>>> df = pd.DataFrame([('a', 1, 51), ('b', 51, 31)], columns=['letter', 'x', 'y'])
>>> f_df = df[df.x.between(0, 50)]
>>> f_df
  letter  x   y
0      a  1  51
>>> f_df = f_df[df.y.between(0, 50)]
<stdin>:1: UserWarning: Boolean Series key will be reindexed to match DataFrame index.
>>> f_df
Empty DataFrame
Columns: [letter, x, y]
Index: []
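For comparison, here is a sketch of the same toy example with the mask taken from f_df itself, plus an alternative that combines both conditions into one mask on df. The third row ('c', 2, 3) is an addition for illustration so that the result is not empty. The warnings bookkeeping is only there to check that the reindexing warning no longer appears.

```python
import warnings

import pandas as pd

df = pd.DataFrame(
    [("a", 1, 51), ("b", 51, 31), ("c", 2, 3)],  # row 'c' added for illustration
    columns=["letter", "x", "y"],
)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")

    # Chained filtering: each mask comes from the frame it is applied to
    f_df = df[df.x.between(0, 50)]
    f_df = f_df[f_df.y.between(0, 50)]

    # Alternative: a single combined mask built from and applied to df
    combined = df[df.x.between(0, 50) & df.y.between(0, 50)]

reindex_warnings = [w for w in caught if "reindexed" in str(w.message)]
print(f_df)
print(len(reindex_warnings))
```

Both approaches keep only row 'c' here. The combined mask avoids the intermediate frame entirely, which may be easier to read when there are only a couple of conditions.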
