Python: Write a CSV by column rather than by row

Posted on 2024-10-02 01:14:49, 538 words, 3 views, 0 comments

I have a Python script that generates a bunch of data in a while loop. I need to write this data to a CSV file so that it is written by column rather than by row.

For example in loop 1 of my script I generate:

(1, 2, 3, 4)

I need this to be reflected in my CSV file like so:

Result_1    1
Result_2    2
Result_3    3
Result_4    4

On my second loop I generate:

(5, 6, 7, 8)

I need my CSV file to then look like so:

Result_1    1    5
Result_2    2    6
Result_3    3    7
Result_4    4    8

And so forth until the while loop finishes. Can anybody help me?


EDIT

The while loop can run for over 100,000 iterations.

Comments (9)

∞梦里开花 2024-10-09 01:14:49

The csv module can't add a column to an existing file because a file is just a sequence of bytes: appending a value to the end of every existing line means rewriting everything after the first change. What you should do instead is collect all the data in lists, then call zip() on them afterwards to transpose them.

>>> l = [('Result_1', 'Result_2', 'Result_3', 'Result_4'), (1, 2, 3, 4), (5, 6, 7, 8)]
>>> list(zip(*l))  # list() is needed on Python 3, where zip() returns an iterator
[('Result_1', 1, 5), ('Result_2', 2, 6), ('Result_3', 3, 7), ('Result_4', 4, 8)]
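
To connect the transposition to the CSV file itself, here is a minimal sketch. It assumes the whole data set fits in memory; generate_columns and the file name results.csv are hypothetical stand-ins for the question's while loop and output file.

```python
import csv

def generate_columns():
    """Hypothetical stand-in for the question's while loop: one tuple per pass."""
    yield (1, 2, 3, 4)
    yield (5, 6, 7, 8)

labels = ('Result_1', 'Result_2', 'Result_3', 'Result_4')

# Collect the labels and every generated tuple as "columns".
columns = [labels] + list(generate_columns())

# zip(*columns) transposes the columns into rows.
rows = list(zip(*columns))

# newline='' lets the csv module control line endings itself.
with open('results.csv', 'w', newline='') as f:
    csv.writer(f).writerows(rows)
```

The file is only written once, after the loop has finished, so nothing is lost by building the rows at the end.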
客…行舟 2024-10-09 01:14:49
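wr.writerow(item)    # writes item as a single row, one field per element
wr.writerows(items)  # writes one row per element of items

This is quite simple if your goal is just to write the output as a single column, one value per row.

If your item is a list:

import csv

yourList = []

# newline='' stops the csv module's line endings from being doubled on Windows
with open('yourNewFileName.csv', 'w', newline='') as myfile:
    wr = csv.writer(myfile, quoting=csv.QUOTE_ALL)
    for word in yourList:
        wr.writerow([word])  # a one-element row per value, so the values run down one column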
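
The difference between the two writer methods can be checked with a small sketch. It writes to in-memory io.StringIO buffers instead of real files; the sample values are made up.

```python
import csv
import io

values = [1, 2, 3]

# writerow: the whole list becomes one row with three fields.
as_row = io.StringIO()
csv.writer(as_row).writerow(values)

# writerows: each inner list becomes its own row, so the values form one column.
as_column = io.StringIO()
csv.writer(as_column).writerows([[v] for v in values])

print(repr(as_row.getvalue()))     # '1,2,3\r\n'
print(repr(as_column.getvalue()))  # '1\r\n2\r\n3\r\n'
```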
拍不死你 2024-10-09 01:14:49

Updating lines in place in a file is not supported on most file systems (a line in a file is just some data that ends with a newline; the next line starts right after it).

As I see it you have two options:

  1. Make your data-generating loops generators, so that they won't consume a lot of memory: you'll get the data for each row "just in time"
  2. Use a database (sqlite?) and update the rows there. When you're done, export to CSV

Small example for the first method:

from itertools import islice, count
# Python 3: the built-in zip() is lazy, so itertools.izip is no longer needed
print(list(islice(zip(count(1), count(2), count(3)), 10)))

This will print

[(1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6), (5, 6, 7), (6, 7, 8), (7, 8, 9), (8, 9, 10), (9, 10, 11), (10, 11, 12)]

even though count() generates an infinite sequence of numbers.
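
The second option could look roughly like the sketch below, which uses the standard sqlite3 module. The table layout, the generate_columns helper, and the file name are all assumptions made for illustration; an in-memory database is used here, whereas a real run with 100,000+ loops would more likely pass a file path to sqlite3.connect.

```python
import csv
import sqlite3

def generate_columns():
    """Hypothetical stand-in for the data-generating while loop."""
    yield (1, 2, 3, 4)
    yield (5, 6, 7, 8)

N_ROWS = 4  # number of Result_* rows, assumed known up front

conn = sqlite3.connect(':memory:')  # assumption: swap in a file path for large runs
conn.execute('CREATE TABLE results (loop_no INTEGER, row_no INTEGER, value INTEGER)')

# Store every generated value together with its loop number and row number.
for loop_no, column in enumerate(generate_columns()):
    conn.executemany(
        'INSERT INTO results VALUES (?, ?, ?)',
        [(loop_no, row_no, value) for row_no, value in enumerate(column)],
    )
conn.commit()

# Export: one CSV row per row_no, with values ordered by the loop that produced them.
with open('results_from_db.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    for row_no in range(N_ROWS):
        cursor = conn.execute(
            'SELECT value FROM results WHERE row_no = ? ORDER BY loop_no',
            (row_no,),
        )
        writer.writerow(['Result_%d' % (row_no + 1)] + [v for (v,) in cursor])
conn.close()
```

For a large table, an index on (row_no, loop_no) would probably make the export queries much faster.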

若有似无的小暗淡 2024-10-09 01:14:49

Let's assume that (1) you don't have much memory, (2) you have row headings in a list, and (3) all the data values are floats; if they're all integers that fit in 32 or 64 bits, that's even better.

On a 32-bit Python, storing a float in a list takes 16 bytes for the float object and 4 bytes for a pointer in the list; total 20. Storing a float in an array.array('d') takes only 8 bytes. Increasingly spectacular savings are available if all your data are int (any negatives?) that will fit in 8, 4, 2 or 1 byte(s) -- especially on a recent Python where all ints are longs.
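
The size difference can be checked on your own interpreter with a rough sketch like the one below. The exact byte counts depend on the Python build, so no specific numbers are claimed here; the 1000 sample values are arbitrary.

```python
import array
import sys

values = [float(i) for i in range(1000)]
packed = array.array('d', values)

# A list holds pointers; each float is a separate object on top of that.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# An array stores the raw C doubles contiguously.
array_bytes = sys.getsizeof(packed)

print('bytes per array element:', packed.itemsize)
print('list + float objects:', list_bytes)
print('array.array("d"):', array_bytes)
```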

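The following sketch assumes floats stored in array.array('d'); the sizes, the headings, and calculated_data_value are placeholders for your own. In case you don't really have a memory problem, you can still use this method; I've put in comments to indicate the changes needed if you want to use a list.

# Preliminary:
import array  # list: delete
import csv

N_ROWS, N_COLS = 4, 2  # placeholder sizes; use your own

def calculated_data_value(row_index, col_index):
    # placeholder; substitute your own calculation
    return float(col_index * N_ROWS + row_index + 1)

hlist = []
dlist = []
for row_index in range(N_ROWS):
    hlist.append('Result_%d' % (row_index + 1))  # some heading string
    dlist.append(array.array('d'))  # list: dlist.append([])
# generate data, one column at a time
for col_index in range(N_COLS):
    for row_index in range(len(hlist)):
        v = calculated_data_value(row_index, col_index)
        dlist[row_index].append(v)
# write to csv file, one row per heading
with open('output.csv', 'w', newline='') as f:
    csv_writer = csv.writer(f)
    for row_index in range(len(hlist)):
        row = [hlist[row_index]]
        row.extend(dlist[row_index])
        csv_writer.writerow(row)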
烟柳画桥 2024-10-09 01:14:49

What about the Result_* labels? Are they also generated in the loop? (I ask because I don't think it's possible to add a column to an existing csv file.)

I would go about it like this: generate all the data at once, rotate the matrix, then write it to the file:

A = []

A.append(range(1, 5))  # an example of your first loop

A.append(range(5, 9))  # an example of your second loop

data_to_write = zip(*A)

# now you can write it out row by row
你怎么敢 2024-10-09 01:14:49

Read it in by row and then transpose it in the command line. If you're using Unix, install csvtool and follow the directions in: https://unix.stackexchange.com/a/314482/186237
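
If csvtool isn't available, the same write-by-row-then-transpose idea can be sketched in Python instead of on the command line. This is a substitute for the linked approach, not a reproduction of it; the file names and sample data are made up, and the whole file is read into memory.

```python
import csv

# Sample row-wise file: one line per loop pass (made-up data).
with open('by_row.csv', 'w', newline='') as f:
    csv.writer(f).writerows([[1, 2, 3, 4], [5, 6, 7, 8]])

# Read every row back in.
with open('by_row.csv', newline='') as src:
    rows = list(csv.reader(src))

# zip(*rows) swaps rows and columns; write the result to a second file.
with open('by_column.csv', 'w', newline='') as dst:
    csv.writer(dst).writerows(zip(*rows))
```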

偷得浮生 2024-10-09 01:14:49

zip will only take as many elements as the shortest list has. If your columns are of unequal length, you need to use zip_longest.

import csv
from itertools import zip_longest

data = [[1,2,3,4],[5,6]]
columns_data = zip_longest(*data)

with open("file.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(columns_data)
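
The difference is easy to see side by side. In this small sketch the fillvalue='' argument is an added assumption, so that missing cells come out empty; without it zip_longest pads with None.

```python
from itertools import zip_longest

data = [[1, 2, 3, 4], [5, 6]]

# zip stops at the shortest column, silently dropping 3 and 4.
truncated = list(zip(*data))

# zip_longest keeps going and pads the short column with fillvalue.
padded = list(zip_longest(*data, fillvalue=''))

print(truncated)  # [(1, 5), (2, 6)]
print(padded)     # [(1, 5), (2, 6), (3, ''), (4, '')]
```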
执着的年纪 2024-10-09 01:14:49

As an alternate streaming approach:

  • dump each col into a file
  • use python or unix paste command to rejoin on tab, csv, whatever.

Both steps should handle streaming just fine.

Pitfalls:

  • if you have 1000s of columns, you might run into the unix file handle limit!
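
The two steps could be sketched in Python as follows, with the rejoin done by reading all column files in lockstep rather than by calling paste. generate_columns and the file names are hypothetical, and the sketch opens every column file at once, so it is subject to the same file handle limit.

```python
import csv
import os
import tempfile
from contextlib import ExitStack

def generate_columns():
    """Hypothetical stand-in for the data-generating while loop."""
    yield (1, 2, 3, 4)
    yield (5, 6, 7, 8)

workdir = tempfile.mkdtemp()
paths = []

# Step 1: dump each column into its own file, one value per line.
for i, column in enumerate(generate_columns()):
    path = os.path.join(workdir, 'col_%d.txt' % i)
    with open(path, 'w') as f:
        f.writelines('%s\n' % value for value in column)
    paths.append(path)

# Step 2: open all column files and join them line by line into one CSV.
out_path = os.path.join(workdir, 'joined.csv')
with ExitStack() as stack, open(out_path, 'w', newline='') as out:
    files = [stack.enter_context(open(p)) for p in paths]
    writer = csv.writer(out)
    for lines in zip(*files):
        writer.writerow([line.rstrip('\n') for line in lines])
```

Only one line per column file is held in memory at a time during the join.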
少年亿悲伤 2024-10-09 01:14:49

After tinkering for a while, I was able to come up with an easier way of achieving the same goal. Assuming you have code like the below:

import csv

fruitList = ["Mango", "Apple", "Guava", "Grape", "Orange"]
vegList = ["Onion", "Garlic", "Shallot", "Pumpkin", "Potato"]
with open("NEWFILE.csv", "w", newline="") as csvfile:
    writer = csv.writer(csvfile)
    # pair up the items at each position; each pair becomes one row
    for fruit, veg in zip(fruitList, vegList):
        writer.writerow([fruit, veg])