如何搜索选定的列并查找匹配值

发布于 2025-02-13 01:23:06 字数 1058 浏览 1 评论 0原文

从带有一个工作表的输入XLX中，我正在收集\采矿数据，并在同一工作簿中创建\填充其他工作表。工作表包含多个列，并且每个工作表都会有所不同，因此没有固定数量的列。我想搜索带有标题=用户名的选择列，以匹配值并复制\创建一个带有常见\匹配值的新列。例如：

usere	userf	userf	userh userh	useri
group1	group2	groupa	groupc	groupc
groupa	group2	group2 group3	groupf	groupg
groupg	group3 groupc groupc group2 group2 group2	groupf	group2	group2
将	groupg	groupg groupg groupg groupg groupg groupg groupg groupg groupg groupg groupg groupg groupg groupg groupg	“	匹配”列为同一工作表，并且具有以下的匹配值，因为这些值是所有匹配值，这些值

是所有的匹配值用户有。

匹配
Group2
GroupA

我目前正在使用Collections，CSV和OpenPyXl来生成想要进一步矿山的数据。

原文

From an input xlxs with one worksheet, I am collecting\mining data and creating\populating other worksheets in the same workbook. The worksheets contain multiple columns and will vary on each worksheet, so no fixed numbers of columns. I would like to search select columns with headers=username, for matching value and copy\create a new column with the common\matching values. For example:

UserE	UserF	UserG	UserH	UserI
Group1	Group2	GroupA	Group1	GroupC
GroupA	GroupA	Group2	GroupA	GroupA
Group3	GroupC	GroupB	Group2	Group5
Group2	GroupF	GroupF	GroupG	Group2

will add a column 'match' to same worksheet and have below because those are the matching values that all the user have.

match
Group2
GroupA

I'm currently using Collections, csv,and openpyxl to produce the data that want to further mine.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

凉世弥音 2025-02-20 01:23:06

您可以使用以下代码执行此操作。请注意，该程序将读取input.xlsx文件sheet1选项卡中的数据框中，并假设第一行是标头，然后找到每列中常见的文本，并添加右侧的一列（最后一列之后的一个单元格），带有单词匹配，然后是另一个下方的公共字符串。它不会触摸/替换Excel表中可能存在的其他选项卡。如果没有通用的字符串，它将仅将匹配作为列标题，并且下面没有数据。

import pandas as pd
from openpyxl import load_workbook

# Read data
df = pd.read_excel('input.xlsx', sheet_name='Sheet1')

#For each column, find items that are common across all those columns
for i in range(df.shape[1]):
    if i == 0:
        match = set(df.iloc[:,i])
    else:
        match = match & set(df.iloc[:,i])

#Open using load_workbook so that no data is overwritten        
wb=load_workbook('input.xlsx')
ws=wb['Sheet1']
ws.cell(row=1,column=df.shape[1]+1).value = "match" #Add header row
for r_idx in range(len(match)):  #Add each item to last column below "match"
    ws.cell(row=r_idx+2, column=df.shape[1]+1).value=list(match)[r_idx]
#for c_idex, col in enumerate()
wb.save('input.xlsx')

You can do this with the below code. Note that the program will read input.xlsx file Sheet1 tab into a dataframe with assumption that the first row is the header, then find text that is common in each column and add a column to the right (one cell after the last column) with the word match and then the common strings one below the other. It will not touch/replace other tabs that might be present in the excel sheet. If there are no common strings, it will just have match as column header and no data below it.

import pandas as pd
from openpyxl import load_workbook

# Read data
df = pd.read_excel('input.xlsx', sheet_name='Sheet1')

#For each column, find items that are common across all those columns
for i in range(df.shape[1]):
    if i == 0:
        match = set(df.iloc[:,i])
    else:
        match = match & set(df.iloc[:,i])

#Open using load_workbook so that no data is overwritten        
wb=load_workbook('input.xlsx')
ws=wb['Sheet1']
ws.cell(row=1,column=df.shape[1]+1).value = "match" #Add header row
for r_idx in range(len(match)):  #Add each item to last column below "match"
    ws.cell(row=r_idx+2, column=df.shape[1]+1).value=list(match)[r_idx]
#for c_idex, col in enumerate()
wb.save('input.xlsx')

回复收藏 0 原文

~没有更多了~