Using FileHelpers; how to parse this CSV type

Posted 2024-07-22 08:15:18

I'm having a few problems trying to parse a CSV in the following format using the FileHelpers library. It's confusing me slightly because the field delimiter appears to be a space, but the fields themselves are sometimes enclosed in quotation marks and other times in square brackets. I'm trying to produce a RecordClass capable of parsing this.

Here's a sample from the CSV:

xxx.xxx.xxx.xxx - - [14/Jun/2008:18:04:17 +0000] "GET http://www.some_url.com HTTP/1.1" 200 73662339 "-" "iTunes/7.6.2 (Macintosh; N; Intel)"

It's an extract from an HTTP log we receive from one of our bandwidth providers.

Comments (3)

一袭水袖舞倾城 2024-07-29 08:15:18

While I thank Marc Gravell and Jon Skeet for their input, my question was how to parse a file containing lines in the format described using the FileHelpers library (although I worded it badly to begin with, describing it as 'CSV' when in fact it isn't).

I have now found a way to do just this. It's not the most elegant method, but it gets the job done. In an ideal world, I wouldn't be using FileHelpers in this particular implementation ;)

For those who are interested, the solution is to create a FileRecord class as follows:

using System;
using FileHelpers;

[DelimitedRecord(" ")]
public sealed class HTTPRecord
{
    public String IP;

    // Fields with prefix 'x' are useless to me... we omit those in processing later
    public String x1;

    [FieldDelimiter("[")]
    public String x2;

    [FieldDelimiter("]")]
    public String Timestamp;

    [FieldDelimiter("\"")]
    public String x3;

    public String Method;
    public String URL;

    [FieldDelimiter("\"")]
    public String Type;

    [FieldIgnored()]
    public String x4;

    [FieldDelimiter(" ")]
    public String x5;

    public int HTTPStatusCode;
    public long Bytes;

    [FieldQuoted()]
    public String Referer;

    [FieldQuoted()]
    public String UserAgent;
}

撩起发的微风 2024-07-29 08:15:18

The obvious statement is "then it isn't CSV"...

I'd be tempted to use a quick regex to munge the date into the same escaping as everything else... on a line-by-line basis, something like:

string t = Regex.Replace(s, @"\[([^\]]*)\]", @"""$1""");

Then you should be able to use a standard parser using space as a delimiter (respecting quotes).
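
To make that suggestion concrete, here is a minimal sketch of the two steps: rewrite the bracketed timestamp as a quoted field, then split on spaces while respecting quotes. The class name LogLineSplitter and the helper SplitRespectingQuotes are hypothetical, written only for illustration; they are not part of FileHelpers or any other library, and the splitter assumes quoted fields never contain escaped quotes.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

public static class LogLineSplitter
{
    // Rewrites [ ... ] as " ... " so the timestamp is quoted like the other fields.
    public static string NormalizeBrackets(string line)
    {
        return Regex.Replace(line, @"\[([^\]]*)\]", @"""$1""");
    }

    // Hypothetical helper: splits on single spaces, keeping double-quoted runs
    // together as one field. Assumes no escaped quotes inside a quoted field.
    public static List<string> SplitRespectingQuotes(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        foreach (char c in line)
        {
            if (c == '"')
            {
                inQuotes = !inQuotes;          // toggle state, drop the quote itself
            }
            else if (c == ' ' && !inQuotes)
            {
                fields.Add(current.ToString()); // a space outside quotes ends the field
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }
        fields.Add(current.ToString());         // last field has no trailing space
        return fields;
    }

    public static void Main()
    {
        string s = "xxx.xxx.xxx.xxx - - [14/Jun/2008:18:04:17 +0000] " +
                   "\"GET http://www.some_url.com HTTP/1.1\" 200 73662339 " +
                   "\"-\" \"iTunes/7.6.2 (Macintosh; N; Intel)\"";

        foreach (string field in SplitRespectingQuotes(NormalizeBrackets(s)))
        {
            Console.WriteLine(field);
        }
    }
}
```

On the sample line this yields nine fields, with the whole request ("GET http://www.some_url.com HTTP/1.1") as a single field, so the method, URL and protocol would still need a second split on spaces.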

守护在此方 2024-07-29 08:15:18

In what way is that CSV? It looks like it's just a particular log file format which should be fairly easily parsed, but not by a CSV parser. In particular, you may well find that a regex works perfectly well. (You'd need to check what would happen to quotes in the user agent etc.)
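
As a rough illustration of the regex route, the sketch below matches a whole line with named groups. The pattern and the group names (ip, timestamp, method, and so on) are assumptions inferred from the single sample line in the question, not a definitive grammar for the provider's log format. In particular, it will fail to match if a referer or user agent contains a double quote, which is exactly the case worth checking.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LogLineRegex
{
    // Assumed layout: ip, two unused fields, [timestamp], "method url protocol",
    // status, bytes, "referer", "user agent". Inferred from one sample line only.
    private static readonly Regex Pattern = new Regex(
        @"^(?<ip>\S+) (?<x1>\S+) (?<x2>\S+) \[(?<timestamp>[^\]]*)\] " +
        @"""(?<method>\S+) (?<url>\S+) (?<protocol>[^""]*)"" " +
        @"(?<status>\d{3}) (?<bytes>\d+) ""(?<referer>[^""]*)"" ""(?<agent>[^""]*)""$");

    // Returns the Match; callers should check Success before reading groups.
    public static Match Parse(string line)
    {
        return Pattern.Match(line);
    }

    public static void Main()
    {
        string s = "xxx.xxx.xxx.xxx - - [14/Jun/2008:18:04:17 +0000] " +
                   "\"GET http://www.some_url.com HTTP/1.1\" 200 73662339 " +
                   "\"-\" \"iTunes/7.6.2 (Macintosh; N; Intel)\"";

        Match m = Parse(s);
        if (m.Success)
        {
            Console.WriteLine(m.Groups["timestamp"].Value);
            Console.WriteLine(m.Groups["url"].Value);
            Console.WriteLine(long.Parse(m.Groups["bytes"].Value));
            Console.WriteLine(m.Groups["agent"].Value);
        }
        else
        {
            Console.WriteLine("Line did not match the assumed format.");
        }
    }
}
```

Lines that do not match are reported rather than silently skipped, which makes it easier to spot variations in the real log files (for example a non-numeric byte count) that the one sample does not show.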
