How do I force one field in Ruby's CSV output to be wrapped with double-quotes?
I'm generating some CSV output using Ruby's built-in CSV. Everything works fine, but the customer wants the name field in the output to have wrapping double-quotes so the output looks like the input file. For instance, the input looks something like this:
1,1.1.1.1,"Firstname Lastname",more,fields
2,2.2.2.2,"Firstname Lastname, Jr.",more,fields
CSV's output, which is correct, looks like:
1,1.1.1.1,Firstname Lastname,more,fields
2,2.2.2.2,"Firstname Lastname, Jr.",more,fields
I know CSV is doing the right thing by not double-quoting the third field just because it has embedded blanks, and wrapping the field with double-quotes when it has the embedded comma. What I'd like to do, to help the customer feel warm and fuzzy, is tell CSV to always double-quote the third field.
I tried wrapping the field in double-quotes in my to_a method, which creates a "Firstname Lastname" field being passed to CSV, but CSV laughed at my puny-human attempt and output """Firstname Lastname""". That is the correct thing to do because it's escaping the double-quotes, so that didn't work.
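For reference, a minimal sketch that reproduces the escaping described above. The variable names are illustrative only and the row is the sample data from the question; the point is simply that CSV treats quote characters inside a field as data and doubles them.

```ruby
require 'csv'

# A field that already contains literal double-quote characters.
prequoted = '"Firstname Lastname"'

# CSV sees the embedded quotes as data, so it wraps the field in quotes
# and doubles each embedded quote (standard CSV escaping).
escaped_line = CSV.generate_line([1, '1.1.1.1', prequoted, 'more', 'fields'])

puts escaped_line
# 1,1.1.1.1,"""Firstname Lastname""",more,fields
```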
Then I tried setting CSV's :force_quotes => true in the open method, which output double-quotes wrapping all fields as expected, but the customer didn't like that, which I also expected. So, that didn't work either.
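A small sketch of what that option does, using CSV.generate_line instead of the open method for brevity (the row is the sample data from above, and the variable names are made up for illustration):

```ruby
require 'csv'

row = [1, '1.1.1.1', 'Firstname Lastname', 'more', 'fields']

# force_quotes: true wraps every field in double-quotes,
# including the numeric id and the IP address.
all_quoted = CSV.generate_line(row, force_quotes: true)

puts all_quoted
# "1","1.1.1.1","Firstname Lastname","more","fields"
```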
I've looked through the Table and Row docs and nothing appeared to give me access to the "generate a String field" method, or a way to set a "for field n always use quoting" flag.
I'm about to dive into the source to see if there's some super-secret tweaks, or if there's a way to monkey-patch CSV and bend it to do my will, but wondered if anyone had some special knowledge or had run into this before.
And, yes, I know I could roll my own CSV output, but I prefer to not reinvent well-tested wheels. And, I'm also aware of FasterCSV; that's now part of Ruby 1.9.2, which I'm using, so explicitly using FasterCSV buys me nothing special. Also, I'm not using Rails and have no intention of rewriting it in Rails, so unless you have a cute way of implementing it using a small subset of Rails, don't bother. I'll downvote any recommendations to use any of those ways just because you didn't bother to read this far.
Well, there's a way to do it but it wasn't as clean as I'd hoped the CSV code could allow.
I had to subclass CSV, then override the CSV::Row.<<= method and add another method, forced_quote_fields=, to make it possible to define the fields I want to force-quoting on, plus pull two lambdas from other methods. At least it works for what I want:
That's the code. Calling it:
results in:
This post is old, but I can't believe no one thought of this.
Why not do:
where \0 is a null character, then just add quotes to each field where they are needed:
Then at the end you can do a
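A minimal sketch of the null-character idea described in this answer, under some assumptions: the rows are the sample data from the question, name_index and the other variable names are hypothetical, and String#delete is used here for the final clean-up step. It is not necessarily the exact code the answer had in mind.

```ruby
require 'csv'

rows = [
  [1, '1.1.1.1', 'Firstname Lastname', 'more', 'fields'],
  [2, '2.2.2.2', 'Firstname Lastname, Jr.', 'more', 'fields']
]
name_index = 2 # hypothetical: position of the field that must be quoted

# Use "\0" as the quote character so that a literal '"' inside a field
# is no longer special and is written out untouched.
raw = CSV.generate(quote_char: "\0") do |csv|
  rows.each do |row|
    fields = row.dup
    # Add the double-quotes by hand where they are wanted.
    fields[name_index] = %("#{fields[name_index]}")
    csv << fields
  end
end

# Fields containing a comma were wrapped in "\0" by CSV; strip those out.
null_trick_output = raw.delete("\0")

puts null_trick_output
```

Note that this bypasses CSV's own escaping for the hand-quoted field, so a name that itself contains a double-quote would have to be escaped manually.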
I doubt if this will help the customer feeling warm and fuzzy after all this time, but this seems to work:
CSV has a force_quotes option that will force it to quote all fields (it may not have been there when you posted this originally). I realize this isn't exactly what you were proposing, but it's less monkey patching. The drawback is that the first integer value ends up listed as a string, which changes things when you import into Excel.
It's been a long time, but since the CSV library has been patched, this might help someone if they're now facing this issue:
the output would be:
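One way this can look with a newer csv gem is sketched below. It rests on an assumption that should be checked against the installed version: that force_quotes accepts an Array of field indexes (or header names when headers are given) rather than only true or false. On older releases any truthy value quotes every field, so the result would differ. The rows are the sample data from the question.

```ruby
require 'csv'

rows = [
  [1, '1.1.1.1', 'Firstname Lastname', 'more', 'fields'],
  [2, '2.2.2.2', 'Firstname Lastname, Jr.', 'more', 'fields']
]

# Assumption: this csv version accepts a list of indexes for force_quotes.
# Index 2 is the name field; other fields are quoted only when necessary.
selective_output = CSV.generate(force_quotes: [2]) do |csv|
  rows.each { |row| csv << row }
end

puts selective_output
```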
It doesn't look like there's any way to do this with the existing CSV implementation short of monkey-patching/rewriting it.
However, assuming you have full control over the source data, you could do this:
csv.gsub!(/FORCE_COMMAS,/, "")
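A rough sketch of how the marker approach could fit together, assuming the marker text FORCE_COMMAS never occurs in the real data. The rows are the sample data from the question and the variable names are hypothetical. Prefixing the field with a marker that contains a comma makes CSV quote it, and the gsub! then removes the marker from the finished text.

```ruby
require 'csv'

rows = [
  [1, '1.1.1.1', 'Firstname Lastname', 'more', 'fields'],
  [2, '2.2.2.2', 'Firstname Lastname, Jr.', 'more', 'fields']
]
marker = 'FORCE_COMMAS,' # contains a comma, so CSV must quote the field

csv_text = CSV.generate do |csv|
  rows.each do |row|
    fields = row.dup
    fields[2] = marker + fields[2] # prefix the name field with the marker
    csv << fields
  end
end

# Strip the marker, leaving only the quotes that CSV added around the field.
csv_text.gsub!(/FORCE_COMMAS,/, "")

puts csv_text
```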
The modified code