Is it possible to update part of a CSV file without knowing about the rest of it?
In a project I'm working on, I need to read from a CSV file, update a field on each row, and then save the results back to the CSV file. I'm looking for a library that will help me with this.
My first attempt was to use ADO. This worked like a charm for reading, but when I attempted to update the file I received the error "Updating data in a linked table is not supported by this ISAM."
So now I'm looking for a replacement (or workaround). These are my requirements:
I would rather not define every column in the file. I only need two columns, and am concerned that additional columns may be added at a future date.
I need to be able to preserve (or, at the very least, replicate) the column heading information.
I would prefer to have as little knowledge of the underlying format/file as possible (i.e. I don't want to write a CSV writer from scratch).
I've run across a number of alternative readers and a couple of writers, but the writers all involve reading the CSV file into a predefined set of fields and then writing only those fields back to the new file. I want to minimize the amount of information about the column structure that is hard-coded into my program.
Comments (4)
Because of the way modern file systems work, you can only update a file in place if the new data is exactly the same size as the original. Otherwise you must rewrite the entire file from scratch. If you can meet this constraint, you can do it with low-level file streams. I don't know of a CSV package that supports this off the top of my head, but the reason is that CSV is simple enough that you can do it on your own.
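As a rough illustration of the same-size constraint, here is a minimal sketch that overwrites bytes in place with a `FileStream`. The class name `InPlacePatch`, the method name and its parameters are made up for this example, and it assumes you already know the byte offset and byte length of the field (and that the file is UTF-8):

```csharp
using System;
using System.IO;
using System.Text;

public static class InPlacePatch
{
    // Overwrites originalByteLength bytes at byteOffset with newValue.
    // Refuses to run unless the replacement has exactly the same byte length,
    // because anything else would require shifting the rest of the file.
    public static void Overwrite(string path, long byteOffset, int originalByteLength, string newValue)
    {
        byte[] newBytes = Encoding.UTF8.GetBytes(newValue);
        if (newBytes.Length != originalByteLength)
            throw new ArgumentException("Replacement must be exactly as long as the original data.");

        using (var stream = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
        {
            stream.Seek(byteOffset, SeekOrigin.Begin);   // jump to the field
            stream.Write(newBytes, 0, newBytes.Length);  // overwrite it in place
        }
    }
}
```

For example, `InPlacePatch.Overwrite(path, 6, 3, "new")` would replace a three-byte value starting at byte 6. Padding values to a fixed width is one way to meet the constraint, but then every reader of the file has to tolerate the padding.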
That said, if you are updating every row anyway, then rewriting the file probably isn't that big of a deal. Writing a CSV record is dead simple. Observe the following C# code:
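A minimal sketch of what such a record writer might look like (not necessarily the code this answer originally showed; `CsvRecordWriter` and its method names are invented for illustration). It assumes the usual CSV quoting convention: a field is wrapped in double quotes if it contains a comma, a quote or a line break, and embedded quotes are doubled.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class CsvRecordWriter
{
    // Quotes a field only when it contains a character that would
    // otherwise break the record structure.
    public static string Escape(string field)
    {
        if (field == null) return "";
        bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return needsQuotes ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
    }

    // Writes one record (one line) to the given writer.
    public static void WriteRecord(TextWriter writer, IEnumerable<string> fields)
    {
        writer.WriteLine(string.Join(",", fields.Select(Escape)));
    }
}
```

Because it takes any sequence of strings, the writer doesn't need to know how many columns there are or what they mean.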
Of course, if you have complex types that you want to be more picky about, that's fine, but since you don't want to constrain yourself to specific future column arrangements this code should be just fine. Additionally, it will complement my own CSV parser listed here on Stack Overflow, which does not require advance knowledge of the columns in the file. You could do something like this:
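The parser referred to above isn't shown here, so the following is only a sketch with a small stand-in parser of its own. `CsvColumnUpdater`, `ParseLine` and `UpdateColumn` are hypothetical names, and the stand-in parser assumes no line breaks inside quoted fields. The point is that the column is located by its header name, and every other column is passed through untouched:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;

public static class CsvColumnUpdater
{
    // Stand-in parser: splits one line into fields, handling quoted fields
    // and doubled quotes, but NOT line breaks inside quoted fields.
    public static List<string> ParseLine(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;
        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c == '"' && i + 1 < line.Length && line[i + 1] == '"') { current.Append('"'); i++; }
                else if (c == '"') inQuotes = false;
                else current.Append(c);
            }
            else if (c == '"') inQuotes = true;
            else if (c == ',') { fields.Add(current.ToString()); current.Clear(); }
            else current.Append(c);
        }
        fields.Add(current.ToString());
        return fields;
    }

    static string Escape(string field)
    {
        bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return needsQuotes ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
    }

    // Copies input to output, applying update to the named column only.
    // The header row is written back unchanged.
    public static void UpdateColumn(TextReader input, TextWriter output,
                                    string columnName, Func<string, string> update)
    {
        string headerLine = input.ReadLine();
        if (headerLine == null) return;               // empty file
        List<string> header = ParseLine(headerLine);
        int index = header.IndexOf(columnName);
        if (index < 0) throw new ArgumentException("Column not found: " + columnName);
        output.WriteLine(string.Join(",", header.Select(Escape)));

        string line;
        while ((line = input.ReadLine()) != null)
        {
            List<string> fields = ParseLine(line);
            if (index < fields.Count) fields[index] = update(fields[index]);
            output.WriteLine(string.Join(",", fields.Select(Escape)));
        }
    }
}
```

With this shape, the only column knowledge hard-coded in the program is the name of the column being updated; columns added later are simply carried along.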
A .csv file is a flat file, and as far as I know you cannot update a file in place unless you have something way out of my past, like an indexed file system.
Suggest reading your .csv file into your program.
Store it in a database such as SQLite, or in enough heap memory to hold the whole file plus some additional space for the changes.
Make changes.
Write file out.
I've left out lots of details as to how you would do the update. Hopefully someone with lots of file experience can correct my notion of what you should do. It may be that some Microsoft database object libraries can do this, but I am not familiar with them.
If you are using C# 4.0 (.NET Framework 4), there are some nice additions to the File class that can help you rewrite your CSV file even though you can't lock a single row. You should take a look at File.ReadLines and File.WriteAllLines. File.ReadLines returns an IEnumerable<string> and File.WriteAllLines accepts one, so you can perform your conversion on a per-line basis. This doesn't mean that you are not locking your file, but it is certainly less memory-intensive than holding the whole file in memory.
EDIT: if you are looking for a quick way of parsing a CSV line, there is this regular expression that can do the trick for you.
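A minimal sketch of that per-line approach. `LineRewriter`, `RewriteRows` and the `.tmp` suffix are assumptions made for this example. Since File.ReadLines reads lazily, the source file is still open while the output is being written, so the sketch writes to a temporary file and swaps it in afterwards:

```csharp
using System;
using System.IO;
using System.Linq;

public static class LineRewriter
{
    // Streams the file line by line, leaves the header (first line) alone,
    // and applies transformRow to every other line.
    public static void RewriteRows(string path, Func<string, string> transformRow)
    {
        string tempPath = path + ".tmp";   // hypothetical naming convention

        var rewritten = File.ReadLines(path)
            .Select((line, i) => i == 0 ? line : transformRow(line));

        // Only one line at a time needs to be held in memory here.
        File.WriteAllLines(tempPath, rewritten);

        // The source has been fully read and closed by now.
        File.Delete(path);
        File.Move(tempPath, path);
    }
}
```

Note that `transformRow` receives the raw text of the line, so it still has to split and re-join the fields itself.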
CSV files are untyped streams of characters so it is possible to replace individual characters, but you cannot add or remove characters without rewriting the entire file.
From personal experience, I strongly suggest that you instead create a simple CSV parser, as @Joel suggests, and recreate the entire file for each update. If something goes wrong, an in-place update can easily corrupt the entire file beyond recovery.
Follow this procedure:
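One way such a procedure could look, as a sketch only: write the updated data to a new file, and replace the original (keeping a backup) only once the new file is complete. `SafeCsvRewrite` and the `.new` and `.bak` suffixes are assumptions for this example; `File.Replace` is the standard .NET call that swaps the files and keeps the backup.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class SafeCsvRewrite
{
    // transform receives all lines of the original file (header included)
    // and returns the lines of the updated file.
    public static void Rewrite(string path,
                               Func<IEnumerable<string>, IEnumerable<string>> transform)
    {
        string tempPath = path + ".new";     // hypothetical suffixes
        string backupPath = path + ".bak";
        try
        {
            // 1. Write the complete updated file next to the original.
            File.WriteAllLines(tempPath, transform(File.ReadLines(path)));

            // 2. Swap it in; the previous contents end up in the backup file.
            File.Replace(tempPath, path, backupPath);
        }
        catch
        {
            // 3. On any error the original is untouched; discard the partial output.
            if (File.Exists(tempPath)) File.Delete(tempPath);
            throw;
        }
    }
}
```

If the transform throws halfway through, the original file is still intact, which is the property an in-place update can't give you.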