彼此之间10％以内的多个数据范围的查找和平均元素

发布于 2025-02-13 00:35:50 字数 777 浏览 1 评论 0原文

我有存储值是数据框的对象。我已经能够比较来自两个数据范围的值是否在彼此的10％以内。但是，我很难将其扩展到多个数据范围。此外，我想知道，如果数据范围的大小不一样，我应该如何处理这个问题？

def add_well_peak(self, *other):
       if len(self.Bell) == len(other.Bell): #if dataframes ARE the same size
           for k in range(len(self.Bell)):
               for j in range(len(other.Bell)):
                   if int(self.Size[k]) - int(self.Size[k])*(1/10) <= int(other.Size[j]) <= int(self.Size[k]) + int(self.Size[k])*(1/10):
#average all

例如，在下图中，有一些对象包含数据范围（即，self，other，其他1，其他2）。颜色表示匹配（即，彼此之间10％以内的值）。如果存在匹配，则为平均值。如果不存在匹配，仍然包括不比分的号码。我希望能够对大于或等于2的任何数量的对象（其他1，其他2，其他3，其他....）概括。任何帮助将不胜感激。请让我知道是否有任何尚不清楚。这是我第一次发布。再次感谢。

匹配数据

原文

I have objects that store values are dataframes. I have been able to compare if values from two dataframes are within 10% of each other. However, I am having difficulty extending this to multiple dataframes. Moreover, I am wondering how I should apporach this problem if dataframes are not the same size?

def add_well_peak(self, *other):
       if len(self.Bell) == len(other.Bell): #if dataframes ARE the same size
           for k in range(len(self.Bell)):
               for j in range(len(other.Bell)):
                   if int(self.Size[k]) - int(self.Size[k])*(1/10) <= int(other.Size[j]) <= int(self.Size[k]) + int(self.Size[k])*(1/10):
#average all

For example, in the image below, there are objects that contain dataframes (i.e., self, other1, other2). The colors represent matches (i.e, values that are within 10% of each other). If a match exist, then average the values. If a match does not exist still include the unmatch number. I want to be able to generalize this for any number of objects greater or equal than 2 (other 1, other 2, other 3, other ....). Any help would be appreciated. Please let me know if anything is unclear. This is my first time posting. Thanks again.

matching data

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

瘫痪情歌 2025-02-20 00:35:50

结果：

在图像的数据范围内使用我的解决方案，我会得到以下内容：
阈值离群值= 0.2：

               0
0       1.000000
1    1493.500000
2    5191.333333
3   35785.333333
4   43586.500000
5   78486.000000
6  100000.000000

阈值离群值= 0.5：

              0              1
0      1.000000            NaN
1   1493.500000            NaN
2   5191.333333            NaN
3  43586.500000   35785.333333
4  78486.000000  100000.000000

说明：

线是平均峰，代表这些峰值获得的不同值的列。我以为最大数量元素的平均发出是合法的，而threshold_outlier的其余部分是离群值（应该进行排序，您作为合法峰的可能性就越大，您越多，您就越多在左侧（第0列是最可能的））。例如，在0.5个离群阈值结果的第3行中，43586.500000是一个平均来自3个dataframes，而35785.3333333仅来自2个一。

问题：

解决方案非常复杂。我认为可以删除其中的很大一部分，但是我看不到目前的方式，而且由于它的工作原理，我一定会将优化留给您。
不过，我尝试评论自己的最好，如果您有任何疑问，请不要犹豫！

文件：

compinationLib.py

from __future__ import annotations
from typing import Dict, List
from Errors import *

class Combination():
    """
        Support class, to make things easier.  
        
        Contains a string `self.combination` which is a binary number stored as a string.
        This allows to test every combination of value (i.e. "101" on the list `[1, 2, 3]`
        would signify grouping `1` and `3` together).

        There are some methods:
        - `__add__` overrides the `+` operator
        - `compute_degree` gives how many `1`s are in the combination
        - `overlaps` allows to verify if combination overlaps (use the same value twice) 
        (i.e. `100` and `011` don't overlap, while `101` and `001` do)
    """
    def __init__(self, combination:str) -> Combination:
        self.combination:str = combination
        self.degree:int = self.compute_degree()
    def __add__(self, other: Combination) -> Combination:
        if self.combination == None:
            return other.copy()
        if other.combination == None:
            return self.copy()
        if self.overlaps(other):
            raise CombinationsOverlapError()
        result = ""
        for c1, c2 in zip(self.combination, other.combination):
            result += "1" if (c1 == "1" or c2 == "1") else "0"
        return Combination(result)
    def __str__(self) -> str:
        return self.combination
    def compute_degree(self) -> int:
        if self.combination == None:
            return 0
        degree = 0
        for bit in self.combination:
            if bit == "1":
                degree += 1
        return degree
    def copy(self) -> Combination:
        return Combination(self.combination)
    def overlaps(self, other:Combination) -> bool:
        for c1, c2 in zip(self.combination, other.combination):
            if c1 == "1" and c1 == c2:
                return True
        return False

class CombinationNode():
    """
        The main class.

        The main idea was to build a tree of possible "combinations of combinations":

        100-011 => 111

        |---010-001 => 111

        |---001-010 => 111

        At each node, the combination applied to the current list of values was to be acceptable
        (all within THREASHOLD_AVERAGING).

        Also, the shorter a path, the better the solution as it means it found a way to average
        a lot of the values, with the minimum amount of outliers possible, maybe by grouping
        the outliers together in a way that makes sense, ...

        - `populate` fills the tree automatically, with every solution possible
        - `path` is used mainly on leaves, to obtain the path taken to arrive there.
    """
    def __init__(self, combination:Combination) -> CombinationNode:
        self.combination:Combination = combination
        self.children:List[CombinationNode] = []
        self.parent:CombinationNode = None
        self.total_combination:Combination = combination
    def __str__(self) -> str:
        list_paths = self.recur_paths()
        list_paths = [",".join([combi.combination.combination for combi in path]) for path in list_paths]
        return "\n".join(list_paths)
    def add_child(self, child:CombinationNode) -> None:
        if child.combination.degree > self.combination.degree and not self.total_combination.overlaps(child.combination):
            raise ChildDegreeExceedParentDegreeError(f"{child.combination} > {self.combination}")
        self.children.append(child)
        child.parent = self
        child.total_combination += self.total_combination
    def path(self) -> List[CombinationNode]:
        path = []
        current = self
        while current.parent != None:
            path.append(current)
            current = current.parent
        path.append(current)
        return path[::-1]
    def populate(self, combination_dict:Dict[int, List[Combination]]) -> None:
        missing_degrees = len(self.combination.combination)-self.total_combination.degree
        if missing_degrees == 0:
            return
        for i in range(min(self.combination.degree, missing_degrees), 0, -1):
            for combination in combination_dict[i]:
                if not self.total_combination.overlaps(combination):
                    self.add_child(CombinationNode(combination))
        for child in self.children:
            child.populate(combination_dict)
    def recur_paths(self) -> List[List[CombinationNode]]:
        if len(self.children) == 0:
            return [self.path()]
        paths = []
        for child in self.children:
            for path in child.recur_paths():
                paths.append(path)
        return paths

errors.py

class ChildDegreeExceedParentDegreeError(Exception):
    pass
class CombinationsOverlapError(Exception):
    pass
class ToImplementError(Exception):
    pass
class UncompletePathError(Exception):
    pass

main.py

from typing import Dict, List, Set, Tuple, Union
import pandas as pd
from CombinationLib import *

best_depth:int = -1
best_path:List[CombinationNode] = []

THRESHOLD_OUTLIER = 0.2
THRESHOLD_AVERAGING = 0.1

def verif_averaging_pct(combination:Combination, values:List[float]) -> bool:
    """
        For a given combination of values, we must have all the values within 
        THRESHOLD_AVERAGING of the average of the combination
    """
    avg = 0
    for c,v in zip(combination.combination, values):
        if c == "1":
            avg += v
    avg /= combination.degree
    for c,v in zip(combination.combination, values):
        if c == "1"and (v > avg*(1+THRESHOLD_AVERAGING) or v < avg*(1-THRESHOLD_AVERAGING)):
            return False
    return True
def recursive_check(node:CombinationNode, depth:int, values:List[Union[float, int]]) -> None:
    """
        Here is where we preferencially ask for a small number of bigger groups
    """
    global best_depth
    global best_path
    # If there are more groups than the current best way to do, stop
    if best_depth != -1 and depth > best_depth:
        return
    # If all the values of the combination are not within THRESHOLD_AVERAGING, stop
    if not verif_averaging_pct(node.combination, values):
        return
    # If we finished the list of combinations, and this way is the best, keep it, stop
    if len(node.children) == 0:
        if best_depth == -1 or depth < best_depth:
            best_depth = depth
            best_path = node.path()
        return
    # If we are still not finished (not every value has been used), continue
    for cnode in node.children:
        recursive_check(cnode, depth+1, values)
def groups_from_list(values:List[Union[float, int]]) -> List[List[Union[float, int]]]:
    """
        From a list of values, get the smallest list of groups of elements 
        within THRESHOLD_AVERAGING of each other.
        It implies that we will try and recursively find the biggest group possible
        within the unsused values (i.e. groups with combinations of size [3, 1] are prefered 
        over [2, 2])
    """
    global best_depth
    global best_path

    groups:List[List[float]] = []
    
    # Generate all the combinations (I used binary for this)
    combination_dict:Dict[int, List[Combination]] = {}
    for i in range(1, 2**len(values)):
        combination = format(i, f"0{len(values)}b") # Here is the binary conversion
        counter = 0
        for c in combination:
            if c == "1":
                counter += 1
        if counter not in combination_dict:
            combination_dict[counter] = []
        combination_dict[counter].append(Combination(combination))

    # Generate of the combinations of combinations that use all values (without using one twice)
    combination_trees:List[List[CombinationNode]] = []
    for key in combination_dict:
        for combination in combination_dict[key]:
            cn = CombinationNode(combination)
            cn.populate(combination_dict)
            combination_trees.append(cn)

    best_depth = -1
    best_path = None

    for root in combination_trees:
        recursive_check(root, 0, values)

    # print(",".join([combination.combination.combination for combination in best_path]))

    for combination in best_path:
        temp = []
        for c,v in zip(combination.combination.combination, values):
            if c == "1":
                temp.append(v)
        groups.append(temp)

    return groups


def averages_from_groups(gs:List[List[Union[float, int]]]) -> List[float]:
    """Computing the averages of each group"""
    avgs:List[float] = []
    for group in gs:
        avg = 0
        for elt in group:
            avg += elt
        avg /= len(group)
        avgs.append(avg)
    return avgs
def end_check(ds:List[pd.DataFrame], ids:List[int]) -> bool:
    """Check if we finished consuming all the dataframes"""
    for d,i in zip(ds, ids):
        if i < len(d[0]):
            return False
    return True
def search(group:List[Union[float, int]], values_list:List[Union[float, int]]) -> List[int]:
    """Obtain all the indices corresponding to a set of values"""
    # We will get all the indices in values_list of the values in group
    # If a value is present in group, all the occurences of this value will be too, 
    # so we can use a set and search every occurence for each value.
    indices:List[int] = []
    group_set = set(group)
    for value in group_set:
        for i,v in enumerate(values_list):
            if value == v:
                indices.append(i)
    return indices
def threshold_grouper(total_list:List[Union[float, int]]) -> pd.DataFrame:
    """Building a 2D pd.DataFrame with the averages (x) and the outliers (y)"""
    result_list:List[List[Union[float, int]]] = [[total_list[0]]]

    result_index = 0
    total_index = 1
    while total_index < len(total_list):
        # Only checking if the bigger one is within THRESHOLD_OUTLIER of the little one.
        # If it is the case, the opposite is true too.
        # If yes, it is an outlier
        if result_list[result_index][0]*(1+THRESHOLD_OUTLIER) >= total_list[total_index]:
            result_list[result_index].append(total_list[total_index])
        # Else it is a new peak
        else:
            result_list.append([total_list[total_index]])
            result_index += 1
        total_index += 1

    result:pd.DataFrame = pd.DataFrame(result_list)
    return result
def dataframes_merger(dataframes:List[pd.DataFrame]) -> pd.DataFrame:
    """Merging the dataframes, with THRESHOLDS"""
    # Store the averages for the within 10% cells, in ascending order
    result = []

    # Keep tabs on where we are regarding each dataframe (needed for when we skip cells)
    curr_indices:List[int] = [0 for _ in range(len(dataframes))]
    # Repeat until all the cells in every dataframe has been seen once 
    while not end_check(dataframes, curr_indices):
        # Get the values of the current indices in the dataframes
        curr_values = [dataframe[0][i] for dataframe,i in zip(dataframes, curr_indices)]
        # Get the largest 10% groups from the current list of values
        groups = groups_from_list(curr_values)
        # Compute the average of these groups
        avgs = averages_from_groups(groups)
        # Obtain the minimum average...
        avg_min = min(avgs)
        # ... and its index
        avg_min_index = avgs.index(avg_min)
        # Then get the group corresponding to the minimum average
        avg_min_group = groups[avg_min_index]
        # Get the indices of the values included in this group
        indices_to_increment = search(avg_min_group, curr_values)
        # Add the average to the result merged list
        result.append(avg_min)
        # For every element in the average we added, increment the corresponding index
        for index in indices_to_increment:
            curr_indices[index] += 1
    
    # Re-assemble the dataframe, taking the threshold% around average into account
    result = threshold_grouper(result)
    print(result)

df1 = pd.DataFrame([1, 1487, 5144, 35293, 78486, 100000])
df2 = pd.DataFrame([1, 1500, 5144, 36278, 45968, 100000])
df3 = pd.DataFrame([1, 5286, 35785, 41205, 100000])
dataframes_merger([df3, df2, df1])

Results:

Using my solution on the dataframes of your image, I get the following:
Threshold outlier = 0.2:

               0
0       1.000000
1    1493.500000
2    5191.333333
3   35785.333333
4   43586.500000
5   78486.000000
6  100000.000000

Threshold outlier = 0.5:

              0              1
0      1.000000            NaN
1   1493.500000            NaN
2   5191.333333            NaN
3  43586.500000   35785.333333
4  78486.000000  100000.000000

Explanations:

The lines are averaged peaks, the columns representing the different values obtained for these peaks. I assumed the average emanating from the biggest number of elements was the legitimate one, and the rest within the THRESHOLD_OUTLIER were the outliers (should be sorted, the more probable you are as a legitimate peak, the more you are on the left (the 0th column is the most probable)). For instance, on line 3 of the 0.5 outlier threshold results, 43586.500000 is an average coming from 3 dataframes, while 35785.333333 comes from only 2, thus the most probable is the first one.

Issues:

The solution is quite complicated. I assume a big part of it could be removed, but I can't see how for the moment, and as it works, I'll certainly leave the optimization to you.
Still, I tried commenting my best, and if you have any question, do not hesitate!

Files:

CombinationLib.py

from __future__ import annotations
from typing import Dict, List
from Errors import *

class Combination():
    """
        Support class, to make things easier.  
        
        Contains a string `self.combination` which is a binary number stored as a string.
        This allows to test every combination of value (i.e. "101" on the list `[1, 2, 3]`
        would signify grouping `1` and `3` together).

        There are some methods:
        - `__add__` overrides the `+` operator
        - `compute_degree` gives how many `1`s are in the combination
        - `overlaps` allows to verify if combination overlaps (use the same value twice) 
        (i.e. `100` and `011` don't overlap, while `101` and `001` do)
    """
    def __init__(self, combination:str) -> Combination:
        self.combination:str = combination
        self.degree:int = self.compute_degree()
    def __add__(self, other: Combination) -> Combination:
        if self.combination == None:
            return other.copy()
        if other.combination == None:
            return self.copy()
        if self.overlaps(other):
            raise CombinationsOverlapError()
        result = ""
        for c1, c2 in zip(self.combination, other.combination):
            result += "1" if (c1 == "1" or c2 == "1") else "0"
        return Combination(result)
    def __str__(self) -> str:
        return self.combination
    def compute_degree(self) -> int:
        if self.combination == None:
            return 0
        degree = 0
        for bit in self.combination:
            if bit == "1":
                degree += 1
        return degree
    def copy(self) -> Combination:
        return Combination(self.combination)
    def overlaps(self, other:Combination) -> bool:
        for c1, c2 in zip(self.combination, other.combination):
            if c1 == "1" and c1 == c2:
                return True
        return False

class CombinationNode():
    """
        The main class.

        The main idea was to build a tree of possible "combinations of combinations":

        100-011 => 111

        |---010-001 => 111

        |---001-010 => 111

        At each node, the combination applied to the current list of values was to be acceptable
        (all within THREASHOLD_AVERAGING).

        Also, the shorter a path, the better the solution as it means it found a way to average
        a lot of the values, with the minimum amount of outliers possible, maybe by grouping
        the outliers together in a way that makes sense, ...

        - `populate` fills the tree automatically, with every solution possible
        - `path` is used mainly on leaves, to obtain the path taken to arrive there.
    """
    def __init__(self, combination:Combination) -> CombinationNode:
        self.combination:Combination = combination
        self.children:List[CombinationNode] = []
        self.parent:CombinationNode = None
        self.total_combination:Combination = combination
    def __str__(self) -> str:
        list_paths = self.recur_paths()
        list_paths = [",".join([combi.combination.combination for combi in path]) for path in list_paths]
        return "\n".join(list_paths)
    def add_child(self, child:CombinationNode) -> None:
        if child.combination.degree > self.combination.degree and not self.total_combination.overlaps(child.combination):
            raise ChildDegreeExceedParentDegreeError(f"{child.combination} > {self.combination}")
        self.children.append(child)
        child.parent = self
        child.total_combination += self.total_combination
    def path(self) -> List[CombinationNode]:
        path = []
        current = self
        while current.parent != None:
            path.append(current)
            current = current.parent
        path.append(current)
        return path[::-1]
    def populate(self, combination_dict:Dict[int, List[Combination]]) -> None:
        missing_degrees = len(self.combination.combination)-self.total_combination.degree
        if missing_degrees == 0:
            return
        for i in range(min(self.combination.degree, missing_degrees), 0, -1):
            for combination in combination_dict[i]:
                if not self.total_combination.overlaps(combination):
                    self.add_child(CombinationNode(combination))
        for child in self.children:
            child.populate(combination_dict)
    def recur_paths(self) -> List[List[CombinationNode]]:
        if len(self.children) == 0:
            return [self.path()]
        paths = []
        for child in self.children:
            for path in child.recur_paths():
                paths.append(path)
        return paths

Errors.py

class ChildDegreeExceedParentDegreeError(Exception):
    pass
class CombinationsOverlapError(Exception):
    pass
class ToImplementError(Exception):
    pass
class UncompletePathError(Exception):
    pass

main.py

from typing import Dict, List, Set, Tuple, Union
import pandas as pd
from CombinationLib import *

best_depth:int = -1
best_path:List[CombinationNode] = []

THRESHOLD_OUTLIER = 0.2
THRESHOLD_AVERAGING = 0.1

def verif_averaging_pct(combination:Combination, values:List[float]) -> bool:
    """
        For a given combination of values, we must have all the values within 
        THRESHOLD_AVERAGING of the average of the combination
    """
    avg = 0
    for c,v in zip(combination.combination, values):
        if c == "1":
            avg += v
    avg /= combination.degree
    for c,v in zip(combination.combination, values):
        if c == "1"and (v > avg*(1+THRESHOLD_AVERAGING) or v < avg*(1-THRESHOLD_AVERAGING)):
            return False
    return True
def recursive_check(node:CombinationNode, depth:int, values:List[Union[float, int]]) -> None:
    """
        Here is where we preferencially ask for a small number of bigger groups
    """
    global best_depth
    global best_path
    # If there are more groups than the current best way to do, stop
    if best_depth != -1 and depth > best_depth:
        return
    # If all the values of the combination are not within THRESHOLD_AVERAGING, stop
    if not verif_averaging_pct(node.combination, values):
        return
    # If we finished the list of combinations, and this way is the best, keep it, stop
    if len(node.children) == 0:
        if best_depth == -1 or depth < best_depth:
            best_depth = depth
            best_path = node.path()
        return
    # If we are still not finished (not every value has been used), continue
    for cnode in node.children:
        recursive_check(cnode, depth+1, values)
def groups_from_list(values:List[Union[float, int]]) -> List[List[Union[float, int]]]:
    """
        From a list of values, get the smallest list of groups of elements 
        within THRESHOLD_AVERAGING of each other.
        It implies that we will try and recursively find the biggest group possible
        within the unsused values (i.e. groups with combinations of size [3, 1] are prefered 
        over [2, 2])
    """
    global best_depth
    global best_path

    groups:List[List[float]] = []
    
    # Generate all the combinations (I used binary for this)
    combination_dict:Dict[int, List[Combination]] = {}
    for i in range(1, 2**len(values)):
        combination = format(i, f"0{len(values)}b") # Here is the binary conversion
        counter = 0
        for c in combination:
            if c == "1":
                counter += 1
        if counter not in combination_dict:
            combination_dict[counter] = []
        combination_dict[counter].append(Combination(combination))

    # Generate of the combinations of combinations that use all values (without using one twice)
    combination_trees:List[List[CombinationNode]] = []
    for key in combination_dict:
        for combination in combination_dict[key]:
            cn = CombinationNode(combination)
            cn.populate(combination_dict)
            combination_trees.append(cn)

    best_depth = -1
    best_path = None

    for root in combination_trees:
        recursive_check(root, 0, values)

    # print(",".join([combination.combination.combination for combination in best_path]))

    for combination in best_path:
        temp = []
        for c,v in zip(combination.combination.combination, values):
            if c == "1":
                temp.append(v)
        groups.append(temp)

    return groups


def averages_from_groups(gs:List[List[Union[float, int]]]) -> List[float]:
    """Computing the averages of each group"""
    avgs:List[float] = []
    for group in gs:
        avg = 0
        for elt in group:
            avg += elt
        avg /= len(group)
        avgs.append(avg)
    return avgs
def end_check(ds:List[pd.DataFrame], ids:List[int]) -> bool:
    """Check if we finished consuming all the dataframes"""
    for d,i in zip(ds, ids):
        if i < len(d[0]):
            return False
    return True
def search(group:List[Union[float, int]], values_list:List[Union[float, int]]) -> List[int]:
    """Obtain all the indices corresponding to a set of values"""
    # We will get all the indices in values_list of the values in group
    # If a value is present in group, all the occurences of this value will be too, 
    # so we can use a set and search every occurence for each value.
    indices:List[int] = []
    group_set = set(group)
    for value in group_set:
        for i,v in enumerate(values_list):
            if value == v:
                indices.append(i)
    return indices
def threshold_grouper(total_list:List[Union[float, int]]) -> pd.DataFrame:
    """Building a 2D pd.DataFrame with the averages (x) and the outliers (y)"""
    result_list:List[List[Union[float, int]]] = [[total_list[0]]]

    result_index = 0
    total_index = 1
    while total_index < len(total_list):
        # Only checking if the bigger one is within THRESHOLD_OUTLIER of the little one.
        # If it is the case, the opposite is true too.
        # If yes, it is an outlier
        if result_list[result_index][0]*(1+THRESHOLD_OUTLIER) >= total_list[total_index]:
            result_list[result_index].append(total_list[total_index])
        # Else it is a new peak
        else:
            result_list.append([total_list[total_index]])
            result_index += 1
        total_index += 1

    result:pd.DataFrame = pd.DataFrame(result_list)
    return result
def dataframes_merger(dataframes:List[pd.DataFrame]) -> pd.DataFrame:
    """Merging the dataframes, with THRESHOLDS"""
    # Store the averages for the within 10% cells, in ascending order
    result = []

    # Keep tabs on where we are regarding each dataframe (needed for when we skip cells)
    curr_indices:List[int] = [0 for _ in range(len(dataframes))]
    # Repeat until all the cells in every dataframe has been seen once 
    while not end_check(dataframes, curr_indices):
        # Get the values of the current indices in the dataframes
        curr_values = [dataframe[0][i] for dataframe,i in zip(dataframes, curr_indices)]
        # Get the largest 10% groups from the current list of values
        groups = groups_from_list(curr_values)
        # Compute the average of these groups
        avgs = averages_from_groups(groups)
        # Obtain the minimum average...
        avg_min = min(avgs)
        # ... and its index
        avg_min_index = avgs.index(avg_min)
        # Then get the group corresponding to the minimum average
        avg_min_group = groups[avg_min_index]
        # Get the indices of the values included in this group
        indices_to_increment = search(avg_min_group, curr_values)
        # Add the average to the result merged list
        result.append(avg_min)
        # For every element in the average we added, increment the corresponding index
        for index in indices_to_increment:
            curr_indices[index] += 1
    
    # Re-assemble the dataframe, taking the threshold% around average into account
    result = threshold_grouper(result)
    print(result)

df1 = pd.DataFrame([1, 1487, 5144, 35293, 78486, 100000])
df2 = pd.DataFrame([1, 1500, 5144, 36278, 45968, 100000])
df3 = pd.DataFrame([1, 5286, 35785, 41205, 100000])
dataframes_merger([df3, df2, df1])

回复收藏 0 原文

~没有更多了~