Searching and comparing values in CSV files with Python

Posted on 2024-10-24 15:40:39

I have two CSV files: a master file and an update file. I want to take specific columns from the update file and check their values against the master.

Both files will have the same columns & should look roughly like this:

Listed Company's English Name,Listed Company's Chinese Name,Stock Code,Listing Status,Director's English Name,Director's Chinese Name,Capacity,Position,Appointment Date (yyyy-mm-dd),Resignation Date (yyyy-mm-dd)  
C.P. Lotus Corporation,________,00122,Current,CHEARAVANONT Dhanin,___,Executive Director,,2009-12-31,
C.P. Lotus Corporation,________,00121,Current,CHEARAVANON Narong,___,Executive Director,,2001-02-01,  
C.P. Lotus Corporation,________,00121,Current,CHEARAVANONT Soopakij,___,Executive Director,CEO,2000-04-14,  

Basically, I want to traverse the update file, taking each stock code value from the update file & checking to see if it exists in the master file.

Then, for each matching stock code, I need to check for differences in the Director name value, keeping track of those that don't match.

I've followed this example, but it doesn't seem to do quite what I need (or I don't fully understand it...): Python: Comparing two CSV files and searching for similar items

f1 = file(csvHKX, 'rU')
f2 = file(csvWRHK, 'rU')
f3 = file('results.csv', 'w')

csv1 = csv.reader(f1)
csv2 = csv.reader(f2)
csv3 = csv.writer(f3)

scode = [row for row in csv2]

for hkx_row in csv1:
  for wrhk_row in scode:
    if hkx_row[2] != wrhk_row[2]:
      print 'HKX:', hkx_row
    continue

f1.close()
f2.close()
f3.close()

The update file contains the following stock codes: '00121' & '01003' (for testing).

It seems like the code is iterating through the lists comparing each line & printing out a line if the stock codes don't match line for line. So when the first column is reading '00121' it's printing out lines containing '01003' & vice versa.

But I am only interested in when it can't find hkx_row[2] ANYWHERE in wrhk_row[2]
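
One way to express "not found anywhere" is to collect all stock codes from one file into a set first and then test membership, instead of comparing every row against every row. The sketch below is written for Python 3 and is only an illustration: the helper name codes_missing_from and the inline sample rows are hypothetical, and it assumes the stock code sits in column index 2 as in the code above.

```python
import csv
import io


def codes_missing_from(rows_to_check, reference_rows, col=2):
    """Return the rows whose stock code appears nowhere in reference_rows."""
    # Build the set of reference codes once; a set lookup replaces the inner loop.
    reference_codes = {row[col].strip() for row in reference_rows}
    return [row for row in rows_to_check if row[col].strip() not in reference_codes]


# Inline sample data stands in for the real files (hypothetical rows).
master_text = "Name,CName,Stock Code\nA,_,00121\nB,_,00122\n"
update_text = "Name,CName,Stock Code\nA,_,00121\nC,_,01003\n"

master = list(csv.reader(io.StringIO(master_text)))[1:]  # drop the header row
update = list(csv.reader(io.StringIO(update_text)))[1:]

missing = codes_missing_from(update, master)
print(missing)  # [['C', '_', '01003']]
```

With this shape a row is reported once, and only when its code matches no row at all in the other file, rather than once per non-matching pair.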

Comments (1)

东北女汉子 2024-10-31 15:40:39

Does this help you?

file master.csv

Listed Company's English Name,Listed Company's Chinese Name,Stock Code,Listing Status,Director's English Name,Director's Chinese Name,Capacity,Position,Appointment Date (yyyy-mm-dd),Resignation Date (yyyy-mm-dd)  
C.P. Lotus Corporation,________,00122,Current,CHEARAVANONT Dhanin,___,Executive Director,,2009-12-31,
C.P. Lotus Corporation,________,00121,Current,CHEARAVANON Narong,___,Executive Director,,2001-02-01,  
C.P. Lotus Corporation,________,00121,Current,CHEARAVANONT Soopakij,___,Executive Director,CEO,2000-04-14,  
C.P. Lotus Corporation,________,00123,Current,DEANINO James,___,Pilot,,2009-06-25,
C.P. Lotus Corporation,________,00129,Current,GINGE Ivy,___,Dental Technician,,2010-07-27,
C.P. Lotus Corporation,________,00127,Current,ERATOR Jane,___,Engineer,,2005-12-04,
C.P. Lotus Corporation,________,00119,Current,FIELD Mary,___,Pastrycook,,2009-06-25,

file update.csv

Listed Company's English Name,Listed Company's Chinese Name,Stock Code,Listing Status,Director's English Name,Director's Chinese Name,Capacity,Position,Appointment Date (yyyy-mm-dd),Resignation Date (yyyy-mm-dd)  
C.P. Lotus Corporation,________,00133,Current,THOMPSON Sarah,___,Cosmonaut,,2004-01-20,
C.P. Lotus Corporation,________,00122,Current,CHEARAVANONT Dhanin,___,Executive Director,,2009-12-31,
C.P. Lotus Corporation,________,00121,Current,CHEARAVANON Narong,___,Executive Director,,2001-02-01,  
C.P. Lotus Corporation,________,00121,Current,BEARD Sophia,___,Executive Director,CEO,2010-04-26,   
C.P. Lotus Corporation,________,00127,Current,ERATOR Jane,___,Engineer,,2005-12-04,
C.P. Lotus Corporation,________,00129,Current,MISTOUKI Hassan,___,Folk Singer,,2010-07-27,

code

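import csv

# Python 2 code: 'rb' mode and the print statement are Python 2 idioms.
mas = csv.reader(open('master.csv','rb'))
upd = csv.reader(open('update.csv','rb'))

# Every (Stock Code, Director's English Name) pair in the master file.
# The header row is included as well, which is why it shows up in the set below.
set24 = set((row[2],row[4]) for row in mas)
print set24
print

# Keep the update rows whose (code, director) pair is not in the master.
updkept = [ row for row in upd if (row[2],row[4]) not in set24]
print '\n'.join(map(str,updkept))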

result

set([('00127', 'ERATOR Jane'), ('00121', 'CHEARAVANONT Soopakij'), ('00121', 'CHEARAVANON Narong'), ('00119', 'FIELD Mary'), ('00122', 'CHEARAVANONT Dhanin'), ('Stock Code', "Director's English Name"), ('00129', 'GINGE Ivy'), ('00123', 'DEANINO James')])

['C.P. Lotus Corporation', '________', '00133', 'Current', 'THOMPSON Sarah', '___', 'Cosmonaut', '', '2004-01-20', '']
['C.P. Lotus Corporation', '________', '00121', 'Current', 'BEARD Sophia', '___', 'Executive Director', 'CEO', '2010-04-26', '   ']
['C.P. Lotus Corporation', '________', '00129', 'Current', 'MISTOUKI Hassan', '___', 'Folk Singer', '', '2010-07-27', '']
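
The result above mixes two cases: stock codes that do not exist in the master at all (00133) and codes that exist but list a different director (00121, 00129). A possible way to keep them apart, sketched here for Python 3, is to group the master's directors by stock code. The function name compare, the helper read_rows, and the shortened five-column sample rows are assumptions made for this illustration and are not part of the answer above.

```python
import csv
import io
from collections import defaultdict


def compare(master_rows, update_rows, code_col=2, name_col=4):
    """Split update rows into (rows with new stock codes, rows with changed directors)."""
    # Map each stock code in the master to the set of its director names.
    directors = defaultdict(set)
    for row in master_rows:
        directors[row[code_col]].add(row[name_col])

    new_codes, changed = [], []
    for row in update_rows:
        code, name = row[code_col], row[name_col]
        if code not in directors:
            new_codes.append(row)      # stock code unknown to the master
        elif name not in directors[code]:
            changed.append(row)        # known code, director not listed in master
    return new_codes, changed


def read_rows(text):
    """Parse CSV text and drop the header row."""
    reader = csv.reader(io.StringIO(text))
    next(reader)
    return list(reader)


# Shortened sample rows (hypothetical, five columns only).
header = "Company,Chinese Name,Stock Code,Status,Director\n"
master_rows = read_rows(
    header
    + "C.P. Lotus Corporation,_,00122,Current,CHEARAVANONT Dhanin\n"
    + "C.P. Lotus Corporation,_,00121,Current,CHEARAVANON Narong\n"
    + "C.P. Lotus Corporation,_,00129,Current,GINGE Ivy\n"
)
update_rows = read_rows(
    header
    + "C.P. Lotus Corporation,_,00133,Current,THOMPSON Sarah\n"
    + "C.P. Lotus Corporation,_,00122,Current,CHEARAVANONT Dhanin\n"
    + "C.P. Lotus Corporation,_,00121,Current,BEARD Sophia\n"
    + "C.P. Lotus Corporation,_,00129,Current,MISTOUKI Hassan\n"
)

new_codes, changed = compare(master_rows, update_rows)
print([row[4] for row in new_codes])  # ['THOMPSON Sarah']
print([row[4] for row in changed])    # ['BEARD Sophia', 'MISTOUKI Hassan']
```

To read real files in Python 3, the csv documentation recommends opening them with open(path, newline='') and passing the file object to csv.reader in place of io.StringIO.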