Grouping data by a specified property
I need to group the data so that rows belong to the same group as long as the difference between adjacent values in column a1 equals a pre-specified value. If the difference between two adjacent elements is anything else, a new group starts at that point. For example, I have this data table:
import pandas as pd
import numpy as np
data = [
[5, 2],
[100, 23],
[101, -2],
[303, 9],
[304, 4],
[709, 14],
[710, 3],
[711, 3],
[988, 21]
]
columns = ['a1', 'a2']
df = pd.DataFrame(data=data, columns=columns)
If the pre-specified difference between adjacent elements of column a1 is one, then the answer for this example is:
[[0], [1, 2], [3, 4], [5, 6, 7], [8]]
The output list stores the indexes of the corresponding rows in df.
It may also be useful to know that column a1 is sorted. Thank you for your help!
Comments (3)
Assuming that your data frame is sorted by a1 and that I understood your problem correctly, I think you could do something like this:
The njit decorator from numba makes the looping approach efficient.
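A minimal sketch of a looping approach of this kind, written in plain Python so it runs without numba. The function name `group_by_step` and the `step` parameter are hypothetical, and the sketch assumes a1 is already sorted and that df has the default 0..n-1 index, so positions equal index labels.

```python
def group_by_step(values, step=1):
    """Split positions 0..n-1 into runs whose adjacent differences all equal `step`."""
    groups = []
    for i, value in enumerate(values):
        # Extend the current group only when the gap to the previous value is exactly `step`.
        if groups and value - values[i - 1] == step:
            groups[-1].append(i)
        else:
            # First element, or a gap different from `step`: start a new group.
            groups.append([i])
    return groups


a1 = [5, 100, 101, 303, 304, 709, 710, 711, 988]  # column a1 from the question
result = group_by_step(a1, step=1)
print(result)  # [[0], [1, 2], [3, 4], [5, 6, 7], [8]]
```

With a data frame, the call would be `group_by_step(df['a1'].tolist())`. The loop makes a single pass over the column, which is the kind of code a JIT compiler such as numba is typically used to speed up.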
We sort the DataFrame by the "a1" column, then find the difference between adjacent values. Once we have the differences, we can start grouping.
Result:
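The sort, diff, and group steps described in this answer could be sketched as follows. This is one possible reading of the approach, not necessarily the answerer's code; the names `step`, `starts`, and `group_id` are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({
    "a1": [5, 100, 101, 303, 304, 709, 710, 711, 988],
    "a2": [2, 23, -2, 9, 4, 14, 3, 3, 21],
})
step = 1  # assumed pre-specified difference

# Sort by a1 so that "adjacent" means adjacent in value order.
ordered = df.sort_values("a1")

# A new group starts wherever the gap to the previous row differs from `step`.
# The first row's diff is NaN, which is not equal to `step`, so it starts a group.
starts = ordered["a1"].diff().ne(step)

# The running count of group starts gives every row a group number.
group_id = starts.cumsum()

# Collect the original index labels of each group.
result = [group.index.tolist() for _, group in ordered.groupby(group_id)]
print(result)  # [[0], [1, 2], [3, 4], [5, 6, 7], [8]]
```

Because the index labels are carried through the sort, the output refers to rows of the original df even if a1 was not sorted to begin with.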
The answers above led me to fairly short and simple code that gets the answer. Thank you all very much!