Get specific column and row data from a dataframe with pandas in Python
I'm working with a dataframe using pandas in Python. I have sorted the table and created some extra columns: https://i.sstatic.net/Y6lkN.png
{'Part Number': ['K4SD',
'K4SD',
'K4SD',
'K4SD',
'K4SD',
'K4SD',
'K4SD',
'K4SD',
'K4SD',
'K4SD',
'QOL2',
'QOL2',
'QOL2',
'QOL2',
'QOL2',
'QOL2',
'QOL2',
'QOL2',
'QOL2',
'QOL2'],
'Date': [Timestamp('2021-05-17 00:00:00'),
Timestamp('2021-05-23 00:00:00'),
Timestamp('2021-07-08 00:00:00'),
Timestamp('2021-08-17 00:00:00'),
Timestamp('2021-08-17 00:00:00'),
Timestamp('2021-10-18 00:00:00'),
Timestamp('2021-12-18 00:00:00'),
Timestamp('2021-12-20 00:00:00'),
Timestamp('2022-02-10 00:00:00'),
Timestamp('2022-03-31 00:00:00'),
Timestamp('2021-10-04 00:00:00'),
Timestamp('2021-10-18 00:00:00'),
Timestamp('2021-11-03 00:00:00'),
Timestamp('2021-11-03 00:00:00'),
Timestamp('2021-11-17 00:00:00'),
Timestamp('2021-11-24 00:00:00'),
Timestamp('2021-11-27 00:00:00'),
Timestamp('2021-12-22 00:00:00'),
Timestamp('2021-12-24 00:00:00'),
Timestamp('2022-03-21 00:00:00')],
'Code': ['SF22',
'KFS3',
'3FFS',
'Replacement needed',
'LA52',
'K2KA',
'Belt Broke',
'QET6',
'QET6',
'P0SF',
'Testing Broken',
'DP2L',
'SR2F',
'JKO2',
'DP2L',
'A2BF',
'KLL2',
'Light Off',
'A3SA',
'LA52'],
'Fix': ['na',
'na',
'na',
'Custom Status',
'na',
'na',
'Remade',
'na',
'na',
'na',
'Testing Procedure Fixed',
'na',
'na',
'na',
'na',
'na',
'na',
'Light Repair',
'na',
'na'],
'Fixed': ['No',
'No',
'No',
'Yes',
'No',
'No',
'Yes',
'No',
'No',
'No',
'Yes',
'No',
'No',
'No',
'No',
'No',
'No',
'Yes',
'No',
'No'],
'Combined': ['SF22',
'KFS3',
'3FFS',
'Replacement needed',
'LA52',
'K2KA',
'Belt Broke',
'QET6',
'QET6',
'P0SF',
'Testing Broken',
'DP2L',
'SR2F',
'JKO2',
'DP2L',
'A2BF',
'KLL2',
'Light Off',
'A3SA',
'LA52']}
I sorted the dataframe by date and now I would like to create a loop that goes down the table in order row by row. In the loop, if the row in the "Fixed" column is "No", I want to append the value in that row in the "Code" column to a list (which I called list_test). Then, when the row in the "Fixed" column becomes a "Yes", I want to create a new list variable that is a copy of the list_test.
Then, I want to clear the list_test to be an empty list so that it can repeat the process down the column (clearing itself every time there is a "Fix").
In my example table above, I would want the output to be something along the lines of:
- Fixed_Before_3 = ["SF22", "KFS3", "3FFS"]
- Fixed_Before_6 = ["LA52", "K2KA"]
- Fixed_Before_10 = ["QET6", "QET6", "P0SF"]
- Fixed_Before_17 = ["DP2L", "SR2F", "JKO2", "DP2L", "A2BF", "KLL2"]
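The procedure described above can be sketched as a single pass over the rows. This is only an illustration under a few assumptions: the frame below keeps just the "Code" and "Fixed" columns from the sample data, the helper name `collect_codes_before_fixes` is made up for this example, and the results are stored in a dictionary keyed by "Fixed_Before_<index>" instead of being created as separate variables.

```python
import pandas as pd

# Stand-in for the question's data: only the two columns the loop needs.
df = pd.DataFrame({
    "Code": ["SF22", "KFS3", "3FFS", "Replacement needed", "LA52", "K2KA",
             "Belt Broke", "QET6", "QET6", "P0SF", "Testing Broken", "DP2L",
             "SR2F", "JKO2", "DP2L", "A2BF", "KLL2", "Light Off", "A3SA", "LA52"],
    "Fixed": ["No", "No", "No", "Yes", "No", "No", "Yes", "No", "No", "No",
              "Yes", "No", "No", "No", "No", "No", "No", "Yes", "No", "No"],
})


def collect_codes_before_fixes(frame):
    """Hypothetical helper: map "Fixed_Before_<index>" to the codes seen since the previous fix."""
    results = {}
    pending = []  # codes collected since the last "Yes" row
    for index, code, fixed in zip(frame.index, frame["Code"], frame["Fixed"]):
        if fixed == "Yes":
            # Store the collected codes under a name built from the row index,
            # then start a fresh list for the next block.
            results[f"Fixed_Before_{index}"] = pending
            pending = []
        else:
            pending.append(code)
    return results


fixed_before = collect_codes_before_fixes(df)
for name, codes in fixed_before.items():
    print(name, codes)
```

With the sample data, the codes in the last two rows ("A3SA", "LA52") are not followed by a fix, so this sketch leaves them out of the result.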
This is one way I tried to approach the problem:
list_test = []
var_test = {}
for index in df.index:
    var_test[index] = "Fixed_Before_" + str(index)
    if df['Fixed'][index] == 'No':
        list_test.append(df['Code'])
    if df['Fixed'][index] == 'Yes':
        var_test[index] = list_test
        list_test = []
list_test
However, when I run the code, the output (https://i.sstatic.net/RM0Pj.png) is a very long list that appears to contain my entire column repeated several times, rather than the output I described above. I think my problems might be with:
- The way I iterate throughout the dataframe
- The conditional statements about dataframes
- Maybe list_test.append(df['Code']) gives me the whole column instead of the value of the column in the row in my conditional statement?
- Using a dictionary to create new variables in my loop.
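The third suspicion can be checked directly. As a small sketch on a made-up three-row frame (not the full data), `df['Code']` is the whole column as a pandas Series, while `df.loc[index, 'Code']` is the single value in one row:

```python
import pandas as pd

# Tiny illustrative frame; the values are borrowed from the sample data.
df = pd.DataFrame({
    "Code": ["SF22", "KFS3", "Replacement needed"],
    "Fixed": ["No", "No", "Yes"],
})

whole_column = df["Code"]         # a Series holding every row of the column
single_value = df.loc[0, "Code"]  # only the value in the row labelled 0

wrong = []  # what list_test.append(df['Code']) builds
right = []  # what appending the row's own value builds
for index in df.index:
    if df.loc[index, "Fixed"] == "No":
        wrong.append(df["Code"])              # appends the entire column again
        right.append(df.loc[index, "Code"])   # appends one code

print(type(whole_column).__name__)
print(single_value)
print(len(wrong), len(wrong[0]))
print(right)
```

Here `wrong` ends up holding one full copy of the column for every "No" row, which would explain the very large repeated output. Separately, the line `var_test[index] = "Fixed_Before_" + str(index)` stores that string as the value under every integer key, so the name is never actually used as a dictionary key.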
Comments (1)
The exact expected output is unclear, but here is a suggestion:
Output:
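One vectorized way to produce such a result, offered as a sketch and not necessarily the code this answer had in mind, is to number the blocks of rows with a cumulative sum over the "Yes" flags and then group the codes by block. The variable names (`is_fix`, `block`, `fix_labels`, `fixed_before`) and the dictionary output format are assumptions made for the example.

```python
import pandas as pd

# Stand-in for the question's data: only the two columns that matter here.
df = pd.DataFrame({
    "Code": ["SF22", "KFS3", "3FFS", "Replacement needed", "LA52", "K2KA",
             "Belt Broke", "QET6", "QET6", "P0SF", "Testing Broken", "DP2L",
             "SR2F", "JKO2", "DP2L", "A2BF", "KLL2", "Light Off", "A3SA", "LA52"],
    "Fixed": ["No", "No", "No", "Yes", "No", "No", "Yes", "No", "No", "No",
              "Yes", "No", "No", "No", "No", "No", "No", "Yes", "No", "No"],
})

is_fix = df["Fixed"].eq("Yes")

# Index labels of the rows where a fix happened.
fix_labels = df.index[is_fix]

# Number of fixes seen so far, kept only for the "No" rows: rows between the
# same pair of fixes share one block number.
block = is_fix.cumsum()[~is_fix]

# Collect the codes of each block into a list.
codes_by_block = df.loc[~is_fix, "Code"].groupby(block).agg(list)

# Block k is closed by the (k+1)-th fix; a trailing block with no later fix is skipped.
fixed_before = {
    f"Fixed_Before_{fix_labels[k]}": codes
    for k, codes in codes_by_block.items()
    if k < len(fix_labels)
}

for name, codes in fixed_before.items():
    print(name, codes)
```

Note that if two "Yes" rows were adjacent, this sketch would produce no entry for the second one, since there would be no "No" rows to group between them.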