FasterCSV: checking whether a file is invalid before accepting it - is there a simpler way?

Posted on 2024-11-05 21:46:25

I'm using FasterCSV in a Ruby on Rails application, and currently it throws an exception if the file is invalid.

I've looked over the FasterCSV docs, and it seems that if I use FasterCSV::parse with a block, it reads the file one line at a time without allocating too much memory. It throws a FasterCSV::MalformedCSVError exception if the file contains any kind of formatting error.

I've implemented a custom solution, but I'm not sure it's the best possible one (see my answer below). I'd be interested in hearing about alternatives.


Comments (3)

暮年 2024-11-12 21:46:25

This is my current solution. I'd really like to hear about improvements or alternatives.

# /lib/fastercsv_is_valid.rb

class FasterCSV

  # Returns true if the whole input parses, false if it is malformed.
  def self.is_valid?(file, options = {})
    FasterCSV.parse(file, options) { |row| } # rows are discarded; only errors matter here
    true
  rescue FasterCSV::MalformedCSVError
    false
  end

end

I use that method like this:

# /models/csv_importer.rb

class CsvImporter
  include ActiveRecord::Validations

  validates_presence_of :file
  validate :check_file_format

...

  private

  def check_file_format
    errors.add :file, "Malformed CSV! Please check syntax" unless FasterCSV.is_valid?(file)
  end
end
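
One thing worth checking with this approach: if `file` is an IO object rather than a String, the validation pass reads it to the end, so a later parse of the same object starts at EOF and finds no rows. The sketch below is an assumption-laden illustration, not the author's code: it uses Ruby's standard `csv` library (FasterCSV became the standard CSV library in Ruby 1.9), and `csv_valid?` is a hypothetical helper name. It rewinds the source after validating so the data can be read again.

```ruby
require "csv"
require "stringio"

# Hypothetical helper: validates CSV held in a String or an IO.
def csv_valid?(source, **options)
  CSV.parse(source, **options) { |_row| } # rows are discarded as they are read
  true
rescue CSV::MalformedCSVError
  false
ensure
  # An IO is left at EOF after the validation pass; rewind it so the
  # caller can read the same data again. Strings need no rewinding.
  source.rewind if source.respond_to?(:rewind)
end

io = StringIO.new("a,b\n1,2\n")
csv_valid?(io)        # validation pass
rows = CSV.parse(io)  # second pass still sees the data because of the rewind
```

Whether this matches what happens inside a given Rails upload object is an assumption; uploaded files usually behave like IOs, but it is worth confirming in your own setup.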
乄_柒ぐ汐 2024-11-12 21:46:25

I ran some tests yesterday, and it turns out my solution didn't quite work: after implementing the first is_valid? I kept getting empty arrays when parsing valid CSVs. I'm not sure whether it's a FasterCSV caching issue, something in my code, or something related to my test setup, but I decided to implement a safe_parse instead:

#/lib/faster_csv_safe_parse.rb
class FasterCSV

  def self.safe_parse(file, options = {})
    begin
      FasterCSV.parse(file, options)
    rescue FasterCSV::MalformedCSVError
      nil
    end
  end

end

This will return a parsed array if the file is valid, or nil otherwise. I could then implement my validations as follows:

# /models/csv_importer.rb

class CsvImporter
  include ActiveRecord::Validations

  validates_presence_of :file
  validate :check_file_format
  attr_writer :csv_data

  def csv_data
    @csv_data ||= FasterCSV.safe_parse(file)
  end

...

  private

  def check_file_format
    errors.add :file, "Malformed CSV! Please check syntax" if csv_data.nil?
  end
end

I guess it would be possible to implement a safe_parse that accepts a block and parses the file line by line, but for my purposes this simple implementation was enough, and it works in all cases.
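
A minimal sketch of what such a block-accepting variant might look like, written against Ruby's standard `csv` library (the successor to FasterCSV) rather than FasterCSV itself. The name `safe_parse_each` is hypothetical. One design caveat: rows that come before a malformed line may already have been passed to the block by the time the error is raised, so the caller needs to be able to discard partial work.

```ruby
require "csv"

# Hypothetical block-accepting variant of safe_parse.
# Returns true when every row parsed, nil when the input is malformed.
def safe_parse_each(source, **options, &block)
  CSV.parse(source, **options, &block) # yields each row to the block as it is read
  true
rescue CSV::MalformedCSVError
  nil
end

rows = []
ok = safe_parse_each("a,b\n1,2\n") { |row| rows << row }
```

In a validation, `ok.nil?` would play the same role as `csv_data.nil?` above, with `rows` only trusted when `ok` is true.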

笑咖 2024-11-12 21:46:25

I assume you want to parse the CSV and then do something with the parsed rows. With a separate validity check, the worst case is a valid CSV, which ends up being parsed twice. I would write something like this to stash the parsed result, so the CSV only has to be parsed once:

class FasterCSV

  # Parses once and stashes the outcome in class-level state.
  def self.parse_and_validate(file, options = {})
    @invalid = false
    @parsed_result = FasterCSV.parse(file, options) # no block, so the rows are returned
  rescue FasterCSV::MalformedCSVError
    @parsed_result = nil
    @invalid = true
  end

  def self.is_valid?
    !@invalid
  end

  def self.parsed_result
    @parsed_result if is_valid?
  end

end

And then:

class CsvImporter
  include ActiveRecord::Validations

  validates_presence_of :file
  validate :check_file_format

  # I assume you use the parsed result after the validations, e.g. in a before_save callback
  def do_your_parse_stuff
    # here you would use FasterCSV.parsed_result
  end
...

  private

  def check_file_format
    FasterCSV::parse_and_validate(file)
    errors.add :file, "Malformed CSV! Please check syntax" unless FasterCSV::is_valid?
  end
end

In the above case, you might want to move stuff into a different class that takes care of communicating with FasterCSV and stashing away the parsed result, because I don't think my example is thread safe :)
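
As a rough sketch of that idea, assuming Ruby's standard `csv` library in place of FasterCSV, a small wrapper can keep the parsed rows on an instance instead of on the class. `CsvDocument` and its methods are hypothetical names for illustration. Because each instance holds its own result, nothing is shared between requests or threads.

```ruby
require "csv"

# Hypothetical wrapper: parses once in the constructor and keeps the
# outcome on the instance rather than in class-level variables.
class CsvDocument
  attr_reader :rows, :error

  def initialize(source, **options)
    @rows = CSV.parse(source, **options)
  rescue CSV::MalformedCSVError => e
    @rows = nil          # no usable rows for malformed input
    @error = e.message   # kept so a validation can report the cause
  end

  def valid?
    !@rows.nil?
  end
end

doc = CsvDocument.new("name,qty\nwidget,3\n")
doc.valid? # true for this well-formed input
doc.rows   # the parsed rows, available without parsing again
```

A model could build one `CsvDocument` per import, call `valid?` in its validation, and read `rows` afterwards.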
