Excel adding extra quotes on CSV export
I've recently created an application that adds items to a database from a CSV file. After adding items, I realized that many of my values had extra quotes (") that weren't needed, and this was messing up my ordering.
The problem is that when exporting to a CSV from Excel, Excel adds extra quotes to all of my values that already have a quote in them. I've shown the difference below:
Original Item: Drill Electric Reversible 1/2" 6.3A
Exported Item: "Drill Electric Reversible 1/2"" 6.3A"
Note: the CSV export is adding three (3) extra quotes ("). Two on the ends, and one after the original intended quote.
Is there a setting I can change, or a formatting property I can set on the Excel File/Column? Or do I have to live with it and remove these quotes in my back-end code before adding them to the Database?
Comments (6)
This is entirely normal. The outer quotes are added because the field is written as a quoted string. The inner quote is doubled to escape it, the same kind of thing you'd see in a SQL query, for example. Use the TextFieldParser class to have tried-and-true framework code take care of the parsing for you automatically.
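To illustrate the point about letting a real parser do the work, here is a minimal sketch in Python. It uses Python's standard `csv` module as a stand-in for TextFieldParser (which is a .NET class), and the extra columns in the sample line are made up for illustration.

```python
import csv
import io

# One exported line as Excel would write it. The second and third
# columns are hypothetical, added only to show a multi-field row.
exported = '"Drill Electric Reversible 1/2"" 6.3A",12,Tools\r\n'

# csv.reader removes the enclosing quotes and collapses each doubled
# quote back into a single quote character.
rows = list(csv.reader(io.StringIO(exported, newline="")))
item = rows[0][0]

print(item)
```

The value that reaches the database is then the original text with its single inch mark, so no manual quote stripping is needed in the back-end code.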
That's standard.
The values within a CSV file should have quotes around them (otherwise commas and linebreaks inside a field may be misinterpreted).
The way to escape a quote within a field is to double it, just as you are seeing.
I suggest you read about the basic rules of CSV:
(emphasis mine)
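The quoting rule described in this answer can be reproduced with a short Python sketch. Python's `csv.writer` is used here only as an example of a writer that follows the same convention; the second field, `plain value`, is a made-up placeholder.

```python
import csv
import io

original = 'Drill Electric Reversible 1/2" 6.3A'

# Write one row. Only the field that contains a quote gets wrapped in
# quotes, and the embedded quote is doubled.
buffer = io.StringIO(newline="")
csv.writer(buffer).writerow([original, "plain value"])
line = buffer.getvalue()
print(repr(line))

# Reading the line back reverses the escaping.
back = next(csv.reader([line]))
print(back[0] == original)
```

The written line contains the same three extra quotes as the Excel export in the question, and reading it back yields the original value unchanged.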
Another approach would be to use the Unicode character "DOUBLE PRIME" (U+2033, ″) in your Excel data. To export from Excel into a UTF-8 or UTF-16 .csv you'll have to provide a schema.ini with an appropriate CharacterSet property. Obviously, the tool you use to import the .csv into your database has to be Unicode aware too.
Depending on the DBMS a more direct way of data transfer (SELECT/INSERT ... INTO ... IN ) can be used, thereby eliminating the .csv entirely.
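As a rough sketch of the DOUBLE PRIME idea, the Python snippet below swaps the straight quote for U+2033 before writing and swaps it back afterwards. The helper names `to_double_prime` and `from_double_prime` are hypothetical, and Python's `csv.writer` stands in for Excel, on the assumption that Excel likewise only quotes fields containing the straight quote, the delimiter, or a line break.

```python
import csv
import io

DOUBLE_PRIME = "\u2033"  # Unicode DOUBLE PRIME, looks like an inch mark


def to_double_prime(text: str) -> str:
    """Replace straight double quotes so the CSV writer sees nothing to escape."""
    return text.replace('"', DOUBLE_PRIME)


def from_double_prime(text: str) -> str:
    """Restore straight double quotes after import, if the database should hold them."""
    return text.replace(DOUBLE_PRIME, '"')


original = 'Drill Electric Reversible 1/2" 6.3A'
safe = to_double_prime(original)

buffer = io.StringIO(newline="")
csv.writer(buffer).writerow([safe])
line = buffer.getvalue()
print(repr(line))

restored = from_double_prime(line.rstrip("\r\n"))
```

The trade-off is that the file must be written and read as Unicode, and any later search or sort on the inch mark has to account for the substituted character.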
You could try exporting from Excel as TAB delimited files. I find it easier to parse.
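A tab-delimited export can be read with the same kind of parser by changing the delimiter. The sketch below uses Python's `csv` module with made-up sample data, and it assumes the export may still wrap fields containing a quote in quotes, so it parses rather than splitting on tabs by hand.

```python
import csv
import io

# Hypothetical tab-delimited export: a header row and two data rows.
exported = (
    'Name\tAmps\r\n'
    '"Drill Electric Reversible 1/2"" 6.3A"\t6.3\r\n'
    'Hammer\t\r\n'
)

# delimiter="\t" switches the reader to tab-separated fields while
# keeping the usual handling of quoted fields and doubled quotes.
reader = csv.reader(io.StringIO(exported, newline=""), delimiter="\t")
rows = list(reader)

for row in rows:
    print(row)
```

Commas inside values no longer need any special treatment in this format, which is the main reason it can feel easier to work with.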
Replace every Right Double Quotation Mark character with a Left Double Quotation Mark character. They look similar, and Excel will be confused and leave the text unchanged.
This solution will only help if your end output is HTML. This is the JavaScript solution, so obviously you'll need to redo it in C# or whichever language you're working in:
Apply this before you parse the CSV.