Read fixed width record from text file

Posted on 2024-07-06 09:31:06

I've got a text file full of records where each field in each record is a fixed width. My first approach would be to parse each record simply using string.Substring(). Is there a better way?

For example, the format could be described as:

<Field1(8)><Field2(16)><Field3(12)>

And an example file with two records could look like:

SomeData0000000000123456SomeMoreData
Data2   0000000000555555MoreData    

I just want to make sure I'm not overlooking a more elegant way than Substring().


Update: I ultimately went with a regex like Killersponge suggested:

private readonly Regex reLot = new Regex(REGEX_LOT, RegexOptions.Compiled);
const string REGEX_LOT = "^(?<Field1>.{8})" +
                        "(?<Field2>.{16})" +
                        "(?<Field3>.{12})";

I then use the following to access the fields:

Match match = reLot.Match(record);
string field1 = match.Groups["Field1"].Value;
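
A slightly fuller sketch of the regex approach, assuming the 8/16/12 layout described above. The class name LotRecordParser and the method TryParse are hypothetical and not part of the original post. It checks Match.Success before reading the groups, because a line shorter than 36 characters will not match at all.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LotRecordParser
{
    // Widths follow the <Field1(8)><Field2(16)><Field3(12)> layout from the question.
    private const string Pattern =
        "^(?<Field1>.{8})" +
        "(?<Field2>.{16})" +
        "(?<Field3>.{12})";

    private static readonly Regex ReLot = new Regex(Pattern, RegexOptions.Compiled);

    // Returns false (and empty fields) when the record is too short to match.
    public static bool TryParse(string record, out string field1, out string field2, out string field3)
    {
        Match match = ReLot.Match(record);
        field1 = match.Success ? match.Groups["Field1"].Value : string.Empty;
        field2 = match.Success ? match.Groups["Field2"].Value : string.Empty;
        field3 = match.Success ? match.Groups["Field3"].Value : string.Empty;
        return match.Success;
    }
}
```

The captured values keep their padding, so callers would still need Trim() or int.Parse() where appropriate.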

Comments (7)

泪冰清 2024-07-13 09:31:06

Use FileHelpers.

Example:

[FixedLengthRecord()] 
public class MyData
{ 
  [FieldFixedLength(8)] 
  public string someData; 

  [FieldFixedLength(16)] 
  public int SomeNumber; 

  [FieldFixedLength(12)] 
  [FieldTrim(TrimMode.Right)]
  public string someMoreData;
}

Then, it's as simple as this:

var engine = new FileHelperEngine<MyData>(); 

// To Read Use: 
var res = engine.ReadFile("FileIn.txt"); 

// To Write Use: 
engine.WriteFile("FileOut.txt", res); 

如梦初醒的夏天 2024-07-13 09:31:06

Substring sounds good to me. The only downside I can immediately think of is that it means copying the data each time, but I wouldn't worry about that until you prove it's a bottleneck. Substring is simple :)

You could use a regex to match a whole record at a time and capture the fields, but I think that would be overkill.
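
For comparison, a minimal sketch of the plain Substring approach, driven by a list of field widths. The names SubstringRecordParser and Split are made up for illustration. It assumes every record is at least as long as the sum of the widths.

```csharp
using System;

public static class SubstringRecordParser
{
    // Cuts a record into consecutive fields of the given widths.
    public static string[] Split(string record, params int[] widths)
    {
        var fields = new string[widths.Length];
        int offset = 0;
        for (int i = 0; i < widths.Length; i++)
        {
            // Substring throws ArgumentOutOfRangeException if the record is too short.
            fields[i] = record.Substring(offset, widths[i]);
            offset += widths[i];
        }
        return fields;
    }
}
```

For the format in the question this would be called as SubstringRecordParser.Split(line, 8, 16, 12).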

只怪假的太真实 2024-07-13 09:31:06

Why reinvent the wheel? Use .NET's TextFieldParser class (in the Microsoft.VisualBasic.FileIO namespace), as described in the Visual Basic how-to "How to read from fixed-width text files".
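
TextFieldParser can be used from C# as well. The following is a rough sketch, assuming the 8/16/12 layout from the question; the wrapper class FixedWidthReader and its ReadAll method are hypothetical names. It requires a reference to the Microsoft.VisualBasic assembly.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class FixedWidthReader
{
    // Reads every record from the reader and returns its fields as string arrays.
    public static List<string[]> ReadAll(TextReader reader)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.FixedWidth;
            parser.SetFieldWidths(8, 16, 12);  // widths from the question's format
            parser.TrimWhiteSpace = true;      // strip the padding from each field
            while (!parser.EndOfData)
            {
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }
}
```

To read from disk, pass a StreamReader, for example FixedWidthReader.ReadAll(new StreamReader("FileIn.txt")). Disposing the parser also closes the underlying reader.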

我不会写诗 2024-07-13 09:31:06

You may have to watch out: if the ends of the lines aren't padded with spaces to fill the last field, Substring won't work without a bit of fiddling to work out how much of the line is left to read. This of course only applies to the last field :)
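
One way to do that fiddling is a small helper that clamps the requested length to what is actually left on the line. This is only a sketch; SafeSlice and Field are illustrative names.

```csharp
using System;

public static class SafeSlice
{
    // Returns up to `length` characters starting at `start`;
    // fewer (possibly none) if the line ends early.
    public static string Field(string line, int start, int length)
    {
        if (start >= line.Length)
        {
            return string.Empty;
        }
        return line.Substring(start, Math.Min(length, line.Length - start));
    }
}
```

An alternative under the same assumption is to call line.PadRight(36) first and then use plain Substring on the padded line.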

浮光之海 2024-07-13 09:31:06

Unfortunately out of the box the CLR only provides Substring for this.

Someone over at CodeProject made a custom parser that uses attributes to define fields; you might want to look at that.

指尖微凉心微凉 2024-07-13 09:31:06

Nope, Substring is fine. That's what it's for.

一袭白衣梦中忆 2024-07-13 09:31:06

You could set up an ODBC data source for the fixed format file, and then access it as any other database table.
This has the added advantage that specific knowledge of the file format is not compiled into your code for that fateful day that someone decides to stick an extra field in the middle.
