Creating a list from a CSV file with Python

Posted 2024-10-01 08:45:03

So far I have a Python script that does what I want it to: it opens a CSV chosen by the user, splits the file into different predefined "pools", and writes each pool out to its own file with the proper headers. My only problem is that I want to change the pool list from a static list to one built from the data, and I am having some issues.

The pool list is in the CSV itself, in column 2, and values can be duplicated. Right now, with this setup, the script can create "dead" files with no data aside from the header.

A few notes: yes, I know the spelling is not perfect, and yes, I know some of my comments are a bit off.

import csv
#used to read ane make CSV's
import time
#used to timestamp files
import tkFileDialog
#used to allow user input
filename = tkFileDialog.askopenfilename(defaultextension = ".csv")
#Only user imput to locate the file it self
csvfile = [] 
#Declairs csvfile as a empty list
pools = ["1","2","4","6","9","A","B","D","E","F","I","K","L","M","N","O","P","W","Y"]
#declairs hte pools list for known pools
for i in pools:
    #uses the Pools List and makes a large number of variables
    exec("pool"+i+"=[]")
reader = csv.reader(open(filename, "rb"), delimiter = ',')
 #Opens the CSV for the reader to use
for row in reader: 
    csvfile.append(row) 
    #dumps the CSV into a varilable
    headers=[]
    #declairs headers as empty list
    headers.append(csvfile[0])
    #appends the first row to the header variable
for row in csvfile: 
    pool = str(row[1]).capitalize()
    #Checks to make sure all pools in the main data are capitalized
    if pool in pools:
        exec("pool"+pool+".append(row)")
        #finds the pool list and appends the new item into the variable list
    else: 
        pass
for i in pools:
    exec("wp=csv.writer(open('pool "+i+" "+time.strftime("%Y%m%d")+".csv','wb'),)")
    wp.writerows(headers)
    #Adds the header row
    exec("wp.writerows(pool"+i+")")
    #Created the CSV with a timestamp useing the pool list
    #-----Needs Headers writen in on each file -----

EDIT: As there have been some questions:

Reason for the code: I have daily reports being generated, and part of the manual process is splitting them into separate pool reports. I am writing this script so that I can quickly select the file and split it into its own per-pool files.

The main CSV can be from 50 to 100 items long. It has 25 columns in total, and the pool is always listed in the second column. Not all pools will be listed every time, and a pool can show up more than once.

I have tried a few different loops so far; one is as follows:

pools = []
for line in file(open(filename,'rb')):
    line = line.split()
    x = line[1]
    pools.append(x)

But I get a list error with this.

An example of the CSV:

Ticket  Pool  Date       Column 4  Column 5
1       A     11/8/2010  etc       etc
2       A     11/8/2010  etc       etc
3       1     11/8/2010  etc       etc
4       6     11/8/2010  etc       etc
5       B     11/8/2010  etc       etc
6       A     11/8/2010  etc       etc
7       1     11/8/2010  etc       etc
8       2     11/8/2010  etc       etc
9       2     11/8/2010  etc       etc
10      1     11/8/2010  etc       etc

☆獨立☆ 2024-10-08 08:45:03

If I understand correctly what you want to achieve here, this could be a solution:

import csv
import time
import tkFileDialog

# Python 2 code: let the user pick the CSV file
filename = tkFileDialog.askopenfilename(defaultextension = ".csv")

reader = csv.reader(open(filename, "rb"), delimiter = ',')

# The first row holds the column headers
headers = next(reader)

# Maps each pool name found in column 2 to the list of its rows
pool_dict = {}

for row in reader:
    if row[1] not in pool_dict:
        pool_dict[row[1]] = []
    pool_dict[row[1]].append(row)

# Write one file per pool that actually occurs, header first
for key, val in pool_dict.items():
    with open('pool ' + key + ' ' + time.strftime("%Y%m%d") + '.csv', 'wb') as f:
        wp = csv.writer(f)
        wp.writerow(headers)
        wp.writerows(val)

EDIT: I misunderstood the headers and pools at first and have corrected the code.

EDIT 2: corrected the pool to be dynamically created from values found in file.

If not, please provide more details of your Problem…
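
For readers on Python 3, the same grouping idea can be sketched with `collections.defaultdict`, which removes the explicit "is this key already there?" check. This is only an illustration under assumptions: the function name `group_rows_by_pool`, the `pool_column` parameter, and the sample data are hypothetical, and upper-casing the pool name is a stand-in for the `capitalize()` call in the question.

```python
import csv
import io
from collections import defaultdict


def group_rows_by_pool(lines, pool_column=1):
    """Return (header, {pool: [rows]}) for an iterable of CSV lines.

    pool_column=1 is the second column, as described in the question.
    """
    reader = csv.reader(lines)
    header = next(reader)              # first row holds the column names
    pools = defaultdict(list)          # missing keys start as empty lists
    for row in reader:
        if len(row) <= pool_column:    # skip blank or short rows
            continue
        pools[row[pool_column].strip().upper()].append(row)
    return header, dict(pools)


# Hypothetical sample shaped like the CSV in the question.
sample = io.StringIO(
    "Ticket,Pool,Date\n"
    "1,A,11/8/2010\n"
    "2,a,11/8/2010\n"
    "3,1,11/8/2010\n"
)
header, pools = group_rows_by_pool(sample)
print(header)
print(sorted(pools))
```

Because the dictionary keys come from the data, only pools that actually occur end up with an entry, so no header-only files would be produced later.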

屌丝范 2024-10-08 08:45:03

Can you describe your CSV file a little bit?

One suggestion is to change

for i in pools:
    #uses the Pools List and makes a large number of variables
    exec("pool"+i+"=[]")

to the more Pythonic form:

pool_dict = {}
for i in pools:
    pool_dict[i] = []

In general it is bad practice to use eval/exec, and it is much easier to loop through a dictionary. For example, you can access the lists as pool_dict['A'], pool_dict['1'], or loop through all of them like this:

for key,val in pool_dict.items():
   val.append(...)
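
To make the dictionary approach concrete, here is a small self-contained sketch (Python 3 syntax) of how one dictionary could stand in for the exec-generated variables. The shortened pool list and the sample rows are hypothetical, and the final filtering step is an added assumption about how to avoid the header-only files mentioned in the question.

```python
# Pool names as in the question's static list (shortened here).
pools = ["1", "2", "A", "B"]

# One dictionary replaces the exec-generated variables pool1, pool2, poolA, ...
pool_dict = {name: [] for name in pools}

# Hypothetical rows: ticket, pool, date.
rows = [
    ["1", "A", "11/8/2010"],
    ["2", "a", "11/8/2010"],
    ["3", "1", "11/8/2010"],
    ["4", "Z", "11/8/2010"],   # unknown pool, ignored like the original else/pass
]

for row in rows:
    pool = row[1].capitalize()
    if pool in pool_dict:
        pool_dict[pool].append(row)

# Keeping only non-empty pools avoids writing header-only files later.
non_empty = {name: found for name, found in pool_dict.items() if found}
print(sorted(non_empty))
```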

EDIT: Now seeing the CSV data, try something like this:

# start empty so that only pools found in the file get an output file
pool_dict = {}
for row in reader:
    if row[0] == 'Ticket':
        header = row
    else:
        cur_pool = row[1].capitalize()
        if cur_pool not in pool_dict:
            pool_dict[cur_pool] = [row]
        else:
            pool_dict[cur_pool].append(row)

for p, pool_vals in pool_dict.items():
    # 'wb' is the Python 2 mode for csv output files
    with open('pool' + p + '_' + time.strftime("%Y%m%d") + '.csv', 'wb') as fp:
        wp = csv.writer(fp)
        wp.writerow(header)
        wp.writerows(pool_vals)
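
The writing step differs slightly on Python 3, where the csv module expects a text-mode file opened with `newline=""` instead of `'wb'`. The following is a minimal sketch under that assumption; the function `write_pool_files`, its `out_dir` and `stamp` parameters, and the sample data are hypothetical names introduced for illustration only.

```python
import csv
import os
import tempfile
import time


def write_pool_files(header, pool_dict, out_dir, stamp=None):
    """Write one CSV per pool into out_dir and return the paths written."""
    stamp = stamp or time.strftime("%Y%m%d")
    paths = []
    for pool, rows in sorted(pool_dict.items()):
        path = os.path.join(out_dir, "pool{}_{}.csv".format(pool, stamp))
        # Python 3: text mode with newline="" replaces Python 2's "wb".
        with open(path, "w", newline="") as fp:
            writer = csv.writer(fp)
            writer.writerow(header)
            writer.writerows(rows)
        paths.append(path)
    return paths


# Hypothetical data shaped like the grouped rows above.
header = ["Ticket", "Pool", "Date"]
pool_dict = {"A": [["1", "A", "11/8/2010"]], "1": [["3", "1", "11/8/2010"]]}
out_dir = tempfile.mkdtemp()
paths = write_pool_files(header, pool_dict, out_dir, stamp="20101108")
print([os.path.basename(p) for p in paths])
```

Passing the date stamp as a parameter is a design choice made here so the output names are predictable; leaving it out falls back to today's date as in the original code.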

巨坚强 2024-10-08 08:45:03

Your code would be a lot easier to read without all those execs. It seems like you used them to declare all of your variables, when in fact you could declare a list of pools like this:

pool_lists = [[] for p in pools]

This is my best guess for what you mean by "I want to change the Pool list from a static to a variable." When you do this, you will have a list of lists, of the same length as pools.
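
As a rough illustration of how such a list of lists could be filled and read back, here is a short sketch in Python 3 syntax. The shortened `pools` list and the sample rows are hypothetical, and pairing the two lists with `pools.index` and `zip` is one possible way to do it, not something stated in the question.

```python
pools = ["1", "2", "A"]

# One independent empty list per pool, in the same order as pools.
pool_lists = [[] for p in pools]

# Caution: [[]] * len(pools) would repeat one shared list instead.
shared = [[]] * len(pools)
shared[0].append("x")          # shows up in every slot of shared

rows = [["1", "A", "11/8/2010"], ["3", "1", "11/8/2010"]]  # hypothetical rows
for row in rows:
    if row[1] in pools:
        # The position in pools selects the matching inner list.
        pool_lists[pools.index(row[1])].append(row)

for name, rows_for_pool in zip(pools, pool_lists):
    print(name, len(rows_for_pool))
```

A dictionary keyed by pool name avoids the index lookup, which is why the other answers lean that way.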
