How do I iterate over a dataframe and get an output for each group? Right now I get only one row, and one group is not recognized

Posted on 2025-01-15 01:54:09

I need to iterate through each dataset in the dataframe based on multiple indexes ('Treatment', 'individual', 'regime'). I want to apply curve fit using x and y for each Treatment, individual and regime. Currently I am able to use only one index.

This is the dataframe

df_tot

       Treatment        y        x      individual   regime
0       White       21.982733   800   Data20210608  Ctrl
1       White       21.973003   800   Data20210508  Ctrl
2       White       21.968242   800   Data20210408  Ctrl
3       White       21.982733   600   Data20210608  Ctrl
4       White       21.973003   600   Data20210508  Ctrl
5       White       21.968242   600   Data20210408  Ctrl
6       White       21.982733   500   Data20210608  Ctrl
7       White       21.973003   500   Data20210508  Ctrl
5       White       21.968242   500   Data20210408  Ctrl
15      White_FR    22.139293   800   Data20210608  Ctrl
16      White_FR    22.159840   800   Data20210508  Ctrl
17      White_FR    22.162254   800   Data20210408  Ctrl
18      White_FR    22.139293   600   Data20210608  Ctrl
19      White_FR    22.159840   600   Data20210508  Ctrl
20      White_FR    22.162254   600   Data20210408  Ctrl
21      White_FR    22.139293   500   Data20210608  Ctrl
22      White_FR    22.159840   500   Data20210508  Ctrl
23      White_FR    22.162254   500   Data20210408  Ctrl
2500    White       1.864671    800   Data20210708  T
2501    White       1.871709    800   Data20210608  T
2502    White       1.884706    800   Data20210508  T
2503    White       1.872854    600   Data20210708  T
2504    White       1.872233    600   Data20210608  T
2505    White       1.872344    600   Data20210508  T
2506    White       1.872854    500   Data20210708  T
2507    White       1.872233    500   Data20210608  T
2508    White       1.872344    500   Data20210508  T
2519    White_FR    1.882861    800   Data20210708  T
2520    White_FR    1.917002    800   Data20210608  T
2521    White_FR    1.903067    800   Data20210508  T
2519    White_FR    1.882861    600   Data20210708  T
2520    White_FR    1.917002    600   Data20210608  T
2521    White_FR    1.903067    600   Data20210508  T
2519    White_FR    1.882861    500   Data20210708  T
2520    White_FR    1.917002    500   Data20210608  T
2521    White_FR    1.903067    500   Data20210508  T

This is the code:

 variables={'Spectrum':Spectrum, 'date':date, 'regime':regime, 
             'slope':float} 
 results = pd.DataFrame(variables, index=[])


 group_df = df_tot.groupby(["Spectrum", "date", "regime", "PPFD", 
              "start"])

 def model(x, slope):
    return  (slope*x) + start


 group_df.apply(lambda x : curve_fit(model, x.loc[:, 'PPFD'], 
                x.loc[:, 'Photo']))

 new_row = {'Spectrum': Spectrum, 'date':date, 'regime':regime, 'slope': 
             popt[0]}  # adding Spectrum gives an error:
                       # name 'Spectrum' is not defined
 results=results.append(new_row, ignore_index=True)

Now I get

 results
        date       regime  slope
 0    Data20210608 Ctrl 0.05


Comments (1)

油饼 2025-01-22 01:54:09

You can absolutely iterate through a dataframe with more than 1 index.

First of all, there are some major issues with your code:

  1. Add some toy data to your question, so that we can play with it to find a solution to the problem you're facing (and not just an output of your data).
  2. Don't ever use del to delete columns from a dataframe; use drop, or select all but one column using loc or iloc.
  3. Don't write all = [df_Ctrl, df_FR]: all is a built-in function in Python, so you should pick another name.
  4. for g in all: (where your comment says "if I put for key, g in all"): all here is a list of two elements, so there is nothing to unpack.
  5. Your dataframe is not multi-indexed; you have to modify it if you want it to be.
  6. I strongly encourage you not to use [[]] to select a sub-dataframe of a dataframe, but to use loc or iloc instead.
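
Points 2 to 6 can be illustrated on a small made-up frame. This is only a sketch under assumptions: the frame `toy`, its column `note`, and the list name `frames` are hypothetical and do not come from the question.

```python
import pandas as pd

# Hypothetical toy frame; `note` is an invented column used only to show `drop`.
toy = pd.DataFrame({
    "Treatment": ["White", "White", "White_FR", "White_FR"],
    "regime": ["Ctrl", "T", "Ctrl", "T"],
    "x": [800, 600, 800, 600],
    "y": [21.98, 1.87, 22.14, 1.88],
    "note": ["a", "b", "c", "d"],
})

# Point 2: `drop` returns a new frame and leaves `toy` untouched.
trimmed = toy.drop(columns=["note"])

# Point 6: `loc` selects rows and columns by label in one call.
xy = toy.loc[:, ["x", "y"]]

# Point 3: a name such as `frames` does not shadow the built-in `all`.
frames = [toy[toy["regime"] == "Ctrl"], toy[toy["regime"] == "T"]]

# Point 4: iterating over a plain list yields one frame at a time,
# not (key, frame) pairs.
sizes = [len(g) for g in frames]

# A groupby object, by contrast, does yield (key, frame) pairs.
keys = [key for key, g in toy.groupby(["Treatment", "regime"])]

# Point 5: a MultiIndex has to be created explicitly, e.g. with `set_index`.
indexed = toy.set_index(["Treatment", "regime"])
```

Note that grouping by columns with `groupby` does not require a MultiIndex at all; `set_index` is only needed if you actually want the keys in the index.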

If I understand your problem correctly, you want to group the rows of your dataframe by three columns ('Treatment', 'individual', 'regime') and then, for each group, perform a specified operation on x and y. You can adapt this:

group_df = df_tot.groupby(["Treatment", "individual", "regime"])
curved_df = group_df.apply(lambda x : curve_fit(model, x.loc[:, 'x'], x.loc[:, 'y']))

Obviously, since you provided neither model nor curve_fit, I can't test whether it's correct. But the main idea is here and you can work from it.
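
For reference, here is a minimal end-to-end sketch of that idea using `scipy.optimize.curve_fit`. It rests on assumptions the question does not confirm: the model is taken to be a straight line with both `slope` and `intercept` fitted (the question's `start` is never defined), the y values are made up so that each group lies on a line with a known slope, and the helper name `fit_group` is hypothetical.

```python
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit

def model(x, slope, intercept):
    # Assumed linear model; both parameters are fitted by curve_fit.
    return slope * x + intercept

# Made-up data: each group lies exactly on a line with a known slope.
rows = []
for treatment, individual, regime, slope in [
    ("White", "Data20210608", "Ctrl", 0.01),
    ("White", "Data20210608", "T", 0.02),
    ("White_FR", "Data20210508", "Ctrl", 0.03),
]:
    for x in (500, 600, 800):
        rows.append({"Treatment": treatment, "individual": individual,
                     "regime": regime, "x": x, "y": slope * x + 1.0})
df_tot = pd.DataFrame(rows)

def fit_group(g):
    # curve_fit returns (optimal parameters, covariance matrix).
    popt, _ = curve_fit(model, g["x"].to_numpy(dtype=float),
                        g["y"].to_numpy(dtype=float))
    return popt[0]

# Iterating over the groupby yields one (key tuple, sub-frame) pair per group.
records = []
for (treatment, individual, regime), g in df_tot.groupby(
        ["Treatment", "individual", "regime"]):
    records.append({"Treatment": treatment, "individual": individual,
                    "regime": regime, "slope": fit_group(g)})

# Build the result frame once, from a list of dicts.
results = pd.DataFrame(records)
print(results)
```

The rows are collected in a list and turned into a frame at the end because `DataFrame.append` was removed in pandas 2.0. Also note that the grouping keys should not include the x column itself: if they do, every group holds a single x value and there is no slope to fit.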
