如何搜索选定的列并查找匹配值

发布于 2025-02-13 01:23:06 字数 1058 浏览 1 评论 0原文

从带有一个工作表的输入XLX中,我正在收集\采矿数据,并在同一工作簿中创建\填充其他工作表。工作表包含多个列,并且每个工作表都会有所不同,因此没有固定数量的列。我想搜索带有标题=用户名的选择列,以匹配值并复制\创建一个带有常见\匹配值的新列。例如:

usereuserfuserfuserh userhuseri
group1group2groupagroupcgroupc
groupagroup2group2 group3groupfgroupg
groupggroup3 groupc groupc group2 group2 group2groupfgroup2group2
groupggroupg groupg groupg groupg groupg groupg groupg groupg groupg groupg groupg groupg groupg groupg groupg匹配”列为同一工作表,并且具有以下的匹配值,因为这些值是所有匹配值,这些值

是所有的匹配值用户有。

匹配
Group2
GroupA

我目前正在使用Collections,CSV和OpenPyXl来生成想要进一步矿山的数据。

From an input xlxs with one worksheet, I am collecting\mining data and creating\populating other worksheets in the same workbook. The worksheets contain multiple columns and will vary on each worksheet, so no fixed numbers of columns. I would like to search select columns with headers=username, for matching value and copy\create a new column with the common\matching values. For example:

UserEUserFUserGUserHUserI
Group1Group2GroupAGroup1GroupC
GroupAGroupAGroup2GroupAGroupA
Group3GroupCGroupBGroup2Group5
Group2GroupFGroupFGroupGGroup2

will add a column 'match' to same worksheet and have below because those are the matching values that all the user have.

match
Group2
GroupA

I'm currently using Collections, csv,and openpyxl to produce the data that want to further mine.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

凉世弥音 2025-02-20 01:23:06

您可以使用以下代码执行此操作。请注意,该程序将读取input.xlsx文件sheet1选项卡中的数据框中,并假设第一行是标头,然后找到每列中常见的文本,并添加右侧的一列(最后一列之后的一个单元格),带有单词匹配,然后是另一个下方的公共字符串。它不会触摸/替换Excel表中可能存在的其他选项卡。如果没有通用的字符串,它将仅将匹配作为列标题,并且下面没有数据。

import pandas as pd
from openpyxl import load_workbook

# Read data
df = pd.read_excel('input.xlsx', sheet_name='Sheet1')

#For each column, find items that are common across all those columns
for i in range(df.shape[1]):
    if i == 0:
        match = set(df.iloc[:,i])
    else:
        match = match & set(df.iloc[:,i])

#Open using load_workbook so that no data is overwritten        
wb=load_workbook('input.xlsx')
ws=wb['Sheet1']
ws.cell(row=1,column=df.shape[1]+1).value = "match" #Add header row
for r_idx in range(len(match)):  #Add each item to last column below "match"
    ws.cell(row=r_idx+2, column=df.shape[1]+1).value=list(match)[r_idx]
#for c_idex, col in enumerate()
wb.save('input.xlsx')

You can do this with the below code. Note that the program will read input.xlsx file Sheet1 tab into a dataframe with assumption that the first row is the header, then find text that is common in each column and add a column to the right (one cell after the last column) with the word match and then the common strings one below the other. It will not touch/replace other tabs that might be present in the excel sheet. If there are no common strings, it will just have match as column header and no data below it.

import pandas as pd
from openpyxl import load_workbook

# Read data
df = pd.read_excel('input.xlsx', sheet_name='Sheet1')

#For each column, find items that are common across all those columns
for i in range(df.shape[1]):
    if i == 0:
        match = set(df.iloc[:,i])
    else:
        match = match & set(df.iloc[:,i])

#Open using load_workbook so that no data is overwritten        
wb=load_workbook('input.xlsx')
ws=wb['Sheet1']
ws.cell(row=1,column=df.shape[1]+1).value = "match" #Add header row
for r_idx in range(len(match)):  #Add each item to last column below "match"
    ws.cell(row=r_idx+2, column=df.shape[1]+1).value=list(match)[r_idx]
#for c_idex, col in enumerate()
wb.save('input.xlsx')
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文