Put each CSV row into a separate txt file?

Posted on 2024-12-09 09:47:40

Task 1: Read each row from one csv file into its own separate txt file.

Task 2: The reverse: in one folder, read the text from each txt file and put it into one row of a single csv. In other words, read all the txt files into one csv file.

How would you do this? Would Java or Python be the better choice for getting this done quickly?

Update:
For Java, there are already some quite useful libraries you can use, for example opencsv or javacsv. If you are not familiar with csv, it is worth reading the Wikipedia article on it first. And this post tells you all the possibilities in Java.


Note: Because the question is so simple, some people have assumed it is homework. I hereby declare it is not.

More background: I am running my own machine learning experiments and setting up a large-scale test set. I need crawling, scraping and file format conversion as basic utilities for the experiments. I have been building a lot of this myself so far, and some recent discoveries made me want to learn Python, because I get the feeling that Python is more concise than Java in many parsing and file handling situations. Hence this question.

I just wanted to save time for both of us by getting to the gist without spelling out the less relevant background. My question is really more about the second part, "Java vs Python": I ran into a few lines of Python code that seemed to use some csv library (not sure, which is why I asked), but I do not know how to use Python. Those are all the reasons for this question. Thanks.

Comments (3)

画骨成沙 2024-12-16 09:47:40

From what you write, there is little need to use anything specific to CSV files. Task 1 in particular is a pure data I/O operation on text files. In Python, for instance:

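the_file = 'input.csv'  # placeholder: path to the source CSV
with open(the_file) as src:
    for i, line in enumerate(src):
        # One output file per input line; "with" closes it even on errors.
        with open('new_file_%i.csv' % i, 'w') as f:
            f.write(line)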

For Task 2, if you can guarantee that each file has the same structure (same number of fields per row) it is again a pure data I/O operation:

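from glob import glob

# Collect the per-row files; sorted() keeps the order stable between runs.
files = sorted(glob('file_*.csv'))
with open('combined.csv', 'w') as target:
    for name in files:
        with open(name) as src:
            # Drop a trailing newline so rows are not separated by blank lines.
            target.write(src.read().rstrip('\n'))
        # In text mode, '\n' is translated to the platform's line separator.
        target.write('\n')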
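
If the txt files hold free text rather than ready-made CSV lines, plain concatenation can break the row structure, because a comma or line break inside a file would start a new field or row. Here is a minimal sketch of one way around that, assuming each file should become a single one-field row. The function name `merge_txt_to_csv`, the `*.txt` pattern and the UTF-8 encoding are illustrative assumptions, not part of the answer above:

```python
import csv
import glob
import os


def merge_txt_to_csv(folder, out_path, pattern="*.txt"):
    """Write the full text of each matching file as one single-field CSV row.

    Hypothetical helper: the name, the default pattern and the UTF-8
    encoding are assumptions made for this sketch.
    """
    names = sorted(glob.glob(os.path.join(folder, pattern)))
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for name in names:
            with open(name, encoding="utf-8") as src:
                # csv.writer quotes the field when it contains commas,
                # quote characters or line breaks.
                writer.writerow([src.read()])
    return len(names)
```

Reading the result back with `csv.reader` (also opened with `newline=""`) should return each file's text as one intact field.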

Whether you do this in Java or Python depends only on what is available on the target system and on your personal preference.

淡忘如思 2024-12-16 09:47:40

In that case I would use Python, since it is often more concise than Java.
Plus, CSV files are really easy to handle in Python without installing anything. I don't know about Java.
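
To illustrate what the standard-library `csv` module adds over plain string splitting, here is a small sketch. The sample line is made up for the illustration:

```python
import csv
import io

# Made-up sample line with a comma inside a quoted field.
line = 'id1,"Smith, John",42\n'

# Naive splitting cuts the quoted field in two.
naive = line.strip().split(",")

# csv.reader honours the quotes and keeps the field together.
parsed = next(csv.reader(io.StringIO(line)))

print(len(naive), naive)
print(len(parsed), parsed)
```

The naive split yields four pieces, while `csv.reader` yields the three intended fields.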

Task 1

It would roughly be this based on an example from the official documentation:

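import csv

with open('some.csv', 'r', newline='') as f:
    reader = csv.reader(f)
    for rownumber, row in enumerate(reader):
        with open("anyfile" + str(rownumber) + ".txt", "w", newline='') as g:
            # row is a list of fields, so write it back out as one CSV line
            # (g.write(row) would raise a TypeError).
            csv.writer(g).writerow(row)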

Task 2

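import os

path = '.'  # placeholder: folder that contains the txt files
with open("csvfile.csv", "w") as f:
    for fname in os.listdir(path):
        if fname.endswith(".txt"):
            # Open the file named by the variable, not the literal "fname".
            with open(os.path.join(path, fname)) as g:
                for line in g:
                    f.write(line)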

娇女薄笑 2024-12-16 09:47:40

In Python,
Task 1:

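import csv

# Python 3: open CSV files in text mode with newline='' instead of 'rb'.
with open('file.csv', newline='') as df:
    reader = csv.reader(df)
    for rownumber, row in enumerate(reader):
        with open(str(rownumber) + '.txt', 'w', newline='') as f:
            # row is a list of fields; write it as one CSV line.
            csv.writer(f).writerow(row)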

Task 2:

from glob import glob

with open('output.csv', 'w') as output:
    for f in glob('*.txt'):
        with open(f) as myFile:
            rows = myFile.readlines()
            # writelines accepts the list returned by readlines;
            # write expects a single string.
            output.writelines(rows)

You will need to adjust these for your use cases.
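
One adjustment worth considering: `glob` returns names in arbitrary order, and sorting them as strings puts `10.txt` before `2.txt`. Here is a sketch of restoring row order, assuming the files are named by row number as in Task 1 above. The helper `numeric_key` is a hypothetical name introduced for this example:

```python
import os


def numeric_key(path):
    """Sort key (hypothetical helper): the integer in a name such as '12.txt'."""
    stem = os.path.splitext(os.path.basename(path))[0]
    return int(stem)


names = ["10.txt", "2.txt", "0.txt", "1.txt"]
by_string = sorted(names)                   # lexicographic order
by_number = sorted(names, key=numeric_key)  # original row order

print(by_string)
print(by_number)
```

In Task 2, `sorted(glob('*.txt'), key=numeric_key)` could then replace the bare `glob('*.txt')` call, provided every matching file name is purely numeric.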
