How do I change the headers in a CSV file with FasterCSV and then save the new headers?

Posted on 2024-07-23 13:57:04

I'm having trouble understanding the :header_converters and :converters in FasterCSV. Basically, all I want to do is change column headers to their appropriate column names.

Something like:

FasterCSV.foreach(csv_file, {:headers => true, :return_headers => false, :header_converters => :symbol, :converters => :all} ) do |row|
    puts row[:some_column_header] # Would be "Some Column Header" in the csv file.
end

Except I don't understand :symbol and :all in the converter parameters.

把时间冻结 2024-07-30 13:57:04

The :all converter means that it tries all of the built-in converters, specifically:

:integer:   Converts any field Integer() accepts.
:float:     Converts any field Float() accepts.
:date:      Converts any field Date::parse() accepts.
:date_time: Converts any field DateTime::parse() accepts.

Essentially, it means that it will attempt to convert any field into those values (if possible) instead of leaving them as a string. So if you do row[i] and it would have returned the String value '9', it will instead return an Integer value 9.
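
The effect of :converters => :all can be seen in a minimal sketch like the one below. It assumes Ruby's standard csv library (FasterCSV became the standard CSV library in Ruby 1.9, so the options are the same), and the sample data and column names are made up for illustration.

```ruby
require 'csv'

# Hypothetical sample data; the column names are invented for this example.
data = "Name,Count,Price\nwidget,9,1.5\n"

# Without converters every field stays a String.
plain = CSV.parse(data, headers: true)[0]

# With converters: :all, fields that look like numbers (or date-times) are converted.
converted = CSV.parse(data, headers: true, converters: :all)[0]

puts plain['Count'].class      # String
puts converted['Count'].class  # Integer
puts converted['Price'].class  # Float
puts converted['Name'].class   # String (no converter accepts "widget")
```

Fields that no converter accepts are left untouched, so mixing text and numeric columns is safe.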

Header converters change the way the headers are used to index a row. For example, if doing something like this:

FasterCSV.foreach(some_file, :headers => true, :header_converters => :downcase) do |row|

You would index a column with the header "Some Header" as row['some header'].

If you used :symbol instead, you would index it with row[:some_header]. Symbol downcases the header name, replaces spaces with underscores, and removes characters other than a-z, 0-9, and _. It's useful because comparison of symbols is far faster than comparison of strings.
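
As a rough illustration of the difference between the header converters, here is a small sketch using Ruby's standard csv library (the successor of FasterCSV). The sample data is hypothetical.

```ruby
require 'csv'

# Hypothetical sample data with a header that contains a space and capitals.
data = "Some Header,Other Column\n9,x\n"

default_row  = CSV.parse(data, headers: true)[0]
downcase_row = CSV.parse(data, headers: true, header_converters: :downcase)[0]
symbol_row   = CSV.parse(data, headers: true, header_converters: :symbol)[0]

# The same field, indexed three different ways depending on the converter.
puts default_row['Some Header']   # 9
puts downcase_row['some header']  # 9
puts symbol_row[:some_header]     # 9

# The converted names are what the row reports as its headers.
p symbol_row.headers              # [:some_header, :other_column]
```

Note that the converter only changes the keys used to look up fields; the field values themselves are still the strings read from the file.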

If you want to index a column with row['Some Header'], then just don't provide any :header_converters option.


EDIT:

In response to your comment, :header_converters won't do what you want, I'm afraid. It doesn't change the values of the header row, just how they are used as an index. Instead, you'll have to use the :return_headers option, detect the header row, and make your changes. To change the file and write it out again, you can use something like this:

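require 'fastercsv'

# change_headers is a method you write yourself; it should modify the
# fields of the header row in place, because filter writes the row after the block.
input = File.open 'original.csv', 'r'
output = File.open 'modified.csv', 'w'
FasterCSV.filter input, output, :headers => true, :write_headers => true, :return_headers => true do |row|
  change_headers(row) if row.header_row?
end
input.close
output.close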

If you need to completely replace the original file, add this line after doing the above (FileUtils needs require 'fileutils' at the top of the script):

FileUtils.mv 'modified.csv', 'original.csv', :force => true
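
For readers on a current Ruby, the same idea (leave the data rows alone, rewrite only the header row) can be sketched with the standard csv library. This is an illustrative alternative rather than the filter-based code above; HEADER_RENAMES and rename_headers are hypothetical names introduced for this example, and it works on strings so it can be tried without any files.

```ruby
require 'csv'

# Hypothetical mapping from the old header text to the new header text.
HEADER_RENAMES = { 'Some Column Header' => 'some_column' }.freeze

# Returns CSV text whose header row has been renamed according to +renames+.
# Headers without an entry in +renames+ are kept; data rows are copied unchanged.
def rename_headers(csv_text, renames)
  table = CSV.parse(csv_text, headers: true)
  new_headers = table.headers.map { |name| renames.fetch(name, name) }

  CSV.generate do |out|
    out << new_headers                      # write the renamed header row first
    table.each { |row| out << row.fields }  # then every data row as-is
  end
end

result = rename_headers("Some Column Header,Other\n9,x\n", HEADER_RENAMES)
puts result
```

To apply it to a file, something like File.write('modified.csv', rename_headers(File.read('original.csv'), HEADER_RENAMES)) should work, assuming the file fits comfortably in memory.
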
长途伴 2024-07-30 13:57:04

I've found a simple approach for solving this problem. The FasterCSV library works just fine. I'm sure the ~7 years between when the question was posted and now have something to do with it, but I thought it was worth noting here.

When reading CSV files, the FasterCSV :header_converters option isn't well documented, in my opinion. But, instead of assigning a symbol (header_converters: :symbol) one can assign a lambda (header_converters: lambda {...}). When the CSV library reads the file it transforms the headers using the lambda. Then, one can save a new CSV file that reflects the transformed headers.

For example:

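require 'csv'

# HEADER_MAP, FILE_TO_PROCESS and PROCESSED_FILE are constants you define yourself;
# HEADER_MAP maps old header names (as symbols) to the new header names.
options = {
  headers: true,
  # Replace a header when it has an entry in HEADER_MAP, otherwise keep it.
  header_converters: lambda { |h| HEADER_MAP.fetch(h.to_sym, h) }
}

table = CSV.read(FILE_TO_PROCESS, **options)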

File.open(PROCESSED_FILE, "w") do |file|
  file.write(table.to_csv)
end
心不设防 2024-07-30 13:57:04

Rewriting CSV file headers is a common requirement for anyone converting exported CSV files into imports.

I found the following approach gave me what I needed:

require 'csv'

# The desired header swaps. String keys ("old" => ...) are needed so that
# lookup_headers[h] matches the header strings read from the file.
lookup_headers = { "old" => "new", "cat" => "dog" }

CSV($>, headers: true, write_headers: true) do |csv_out|
  CSV.foreach( ARGV[0],
               headers: true,
               # the following lambda replaces the header if it is found, leaving it if not...
               header_converters: lambda{ |h| lookup_headers[h] || h},
               return_headers: true) do |master_row|

    if master_row.header_row?
      # The headers are now correctly replaced by calling the updated headers
      csv_out << master_row.headers
    else
      csv_out << master_row
    end
  end
end

Hope this helps!
