TypeError when looping over columns in a list of data frames

Posted on 2025-01-18 12:21:08

I have a list of data frames (dataframes), a list of column names (keeplist), and a dict (HydroCap).

I am trying to loop through the columns of each data frame that are named in keeplist and apply np.where inside that loop, so that any value greater than the dictionary value for that column's key is replaced with the dictionary value. The issue is that I run into a TypeError: '>=' not supported between instances of 'str' and 'int', and I am not sure how to solve it.

keeplist = ['BOUND','GCOUL','CHIEF','ROCKY','WANAP','PRIRA','LGRAN','LMONU','ICEHA','MCNAR','DALLE']
HydroCap = {'BOUND':55000,'GCOUL':280000,'CHIEF':219000,'ROCKY':220000,'WANAP':161000,'PRIRA':162000,'LGRAN':130000,'LMONU':130000,'ICEHA':106000,'MCNAR':232000,'DALLE':375000}
for i in dataframes:
  for c in i[keeplist]:
    c = np.where(c >= HydroCap[c], HydroCap[c], c)

Any push in the right direction would be greatly appreciated. I think the issue is that it is expecting an index value such as HydroCap[1] instead of HydroCap[c], but that is only a hunch.

First 7 columns of dataframes[0]:

      Week  Month  Day  Year         BOUND          GCOUL          CHIEF  \
0        1      8    5  1979  44999.896673  161241.036388  166497.578098   
1        2      8   12  1979  15309.259762   58219.122747   63413.204052   
2        3      8   19  1979  15316.965781   56072.024363   60606.956215   
3        4      8   26  1979  14371.269016   58574.003087   63311.569888 

枉心 2025-01-25 12:21:08
import pandas as pd
import numpy as np

# Since I don't have all of the dataframes, I just use the sample you shared
df = pd.read_csv('dataframe.tsv', sep = "\t")

# Note, I've changed some values so you can see something actually happens
keeplist = ['BOUND','GCOUL','CHIEF']
HydroCap = {'BOUND':5500,'GCOUL':280000,'CHIEF':21900}

# The inside of the loop has been changed to accomplish the actual goal
# First, there are now two variables inside the loop: col, and c
# col is the column
# c represents a single element in that column at a time

# The code operates over a column at a time,
# using a list comprehension to cycle over each element
# and replace the full column with the new values at once
for col in df[keeplist]:
    df[col] = [np.where(c >= HydroCap[col], HydroCap[col], c) for c in df[col]]

Which produces:

df
   Week  Month  Day  Year   BOUND          GCOUL    CHIEF
0     1      8    5  1979  5500.0  161241.036388  21900.0
1     2      8   12  1979  5500.0   58219.122747  21900.0
2     3      8   19  1979  5500.0   56072.024363  21900.0
3     4      8   26  1979  5500.0   58574.003087  21900.0

In order to replace elements in a dataframe, you either need to go a whole column at a time, or reassign values to a cell specified by row and column coordinates. Iterating over a dataframe yields its column names, so in your original code c was a string such as 'BOUND' rather than the cell values you had in mind, and comparing that string with an int is what raised the TypeError. Reassigning the c variable inside the loop also only rebinds a local name, so it doesn't alter anything in the dataframe.
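As a minimal sketch of the same idea applied to a whole list of data frames: the example below passes the entire column to np.where instead of looping over its elements. It assumes the reduced caps from the code above and builds a small stand-in list from the sample rows in the question; the names sample, names_seen and clipped_bound are hypothetical and only there for illustration.

```python
import numpy as np
import pandas as pd

keeplist = ['BOUND', 'GCOUL', 'CHIEF']
# Reduced caps (same as in the answer) so that the effect is visible.
HydroCap = {'BOUND': 5500, 'GCOUL': 280000, 'CHIEF': 21900}

# Hypothetical stand-in for the real list of data frames,
# built from the sample rows shown in the question.
sample = pd.DataFrame({
    'Week': [1, 2, 3, 4],
    'BOUND': [44999.896673, 15309.259762, 15316.965781, 14371.269016],
    'GCOUL': [161241.036388, 58219.122747, 56072.024363, 58574.003087],
    'CHIEF': [166497.578098, 63413.204052, 60606.956215, 63311.569888],
})
dataframes = [sample.copy(), sample.copy()]

# Iterating over a DataFrame yields column names (strings), not values.
names_seen = list(sample[keeplist])

for df in dataframes:
    for col in keeplist:
        # Compare and replace the whole column in one vectorized call,
        # then assign the result back into the data frame.
        df[col] = np.where(df[col] >= HydroCap[col], HydroCap[col], df[col])

# Series.clip with an upper bound gives the same result for one column.
clipped_bound = sample['BOUND'].clip(upper=HydroCap['BOUND'])
```

Because the result is assigned back with df[col] = ..., each data frame in the list is modified in place, which is the step the original loop was missing.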
