How can I read a text file using a specific line delimiter?

Posted on 2024-11-19 09:40:24

I am reading a text file using StreamReader:

using (StreamReader sr = new StreamReader(FileName, Encoding.Default))
{
     string line = sr.ReadLine();
}

I want to force the line delimiter to be \n, not \r. How can I do that?

Comments (9)

厌味 2024-11-26 09:40:24

I would implement something like George's answer, but as an extension method that avoids loading the whole file at once (not tested, but something like this):

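using System.Collections.Generic;
using System.IO;

static class ExtensionsForTextReader
{
     // Lazily yields the text between occurrences of the delimiter, one line at a time.
     public static IEnumerable<string> ReadLines (this TextReader reader, char delimiter)
     {
            List<char> chars = new List<char> ();
            while (reader.Peek() >= 0)
            {
                char c = (char)reader.Read ();

                if (c == delimiter) {
                    yield return new string(chars.ToArray());
                    chars.Clear ();
                    continue;
                }

                chars.Add(c);
            }

            // Return the final line when the input does not end with the delimiter.
            if (chars.Count > 0)
                yield return new string(chars.ToArray());
     }
}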

Which could then be used like:

using (StreamReader sr = new StreamReader(FileName, Encoding.Default))
{
     foreach (var line in sr.ReadLines ('\n'))
           Console.WriteLine (line);
}
洒一地阳光 2024-11-26 09:40:24
string text = sr.ReadToEnd();
string[] lines = text.Split('\r');
foreach(string s in lines)
{
   // Consume
}
千里故人稀 2024-11-26 09:40:24

I loved the answer @Pete gave. I would just like to submit a slight modification. This will allow you to pass a string delimiter instead of just a single character:

using System;
using System.IO;
using System.Collections.Generic;
internal static class StreamReaderExtensions
{
    public static IEnumerable<string> ReadUntil(this StreamReader reader, string delimiter)
    {
        List<char> buffer = new List<char>();
        CircularBuffer<char> delim_buffer = new CircularBuffer<char>(delimiter.Length);
        while (reader.Peek() >= 0)
        {
            char c = (char)reader.Read();
            delim_buffer.Enqueue(c);
            if (delim_buffer.ToString() == delimiter || reader.EndOfStream)
            {
                if (buffer.Count > 0)
                {
                    if (!reader.EndOfStream)
                    {
                        yield return new String(buffer.ToArray()).Replace(delimiter.Substring(0, delimiter.Length - 1), string.Empty);
                    }
                    else
                    {
                        buffer.Add(c);
                        yield return new String(buffer.ToArray());
                    }
                    buffer.Clear();
                }
                continue;
            }
            buffer.Add(c);
        }
    }

    private class CircularBuffer<T> : Queue<T>
    {
        private int _capacity;

        public CircularBuffer(int capacity)
            : base(capacity)
        {
            _capacity = capacity;
        }

        new public void Enqueue(T item)
        {
            if (base.Count == _capacity)
            {
                base.Dequeue();
            }
            base.Enqueue(item);
        }

        public override string ToString()
        {
            List<String> items = new List<string>();
            foreach (var x in this)
            {
                items.Add(x.ToString());
            };
            return String.Join("", items);
        }
    }
}
烟织青萝梦 2024-11-26 09:40:24

I needed a solution that reads until "\r\n", and does not stop at "\n". jp1980's solution worked, but was extremely slow on a large file. So, I converted Mike Sackton's solution to read until a specified string is found.

using System.IO;
using System.Text;

// The class name is arbitrary; extension methods just need to live in a static class.
public static class StreamReaderLineExtensions
{
    // Reads up to (but not including) the next occurrence of lineDelimiter.
    // Returns null once the stream is exhausted.
    public static string ReadLine(this StreamReader sr, string lineDelimiter)
    {
        StringBuilder line = new StringBuilder();
        var matchIndex = 0;

        // Peek() returns -1 at the end of the stream, so compare with >= 0.
        while (sr.Peek() >= 0)
        {
            var nextChar = (char)sr.Read();
            line.Append(nextChar);

            if (nextChar != lineDelimiter[matchIndex])
            {
                // Did we mistake earlier characters for the delimiter? If so, restart the search with this character.
                matchIndex = 0;
            }

            if (nextChar == lineDelimiter[matchIndex])
            {
                if (matchIndex == lineDelimiter.Length - 1)
                {
                    return line.ToString(0, line.Length - lineDelimiter.Length);
                }
                matchIndex++;
            }
        }

        return line.Length == 0
            ? null
            : line.ToString();
    }
}

And it is called like this...

using (StreamReader reader = new StreamReader(file))
{
    string line;
    while((line = reader.ReadLine("\r\n")) != null)
    {
        Console.WriteLine(line);
    }
}
﹏半生如梦愿梦如真 2024-11-26 09:40:24

According to the documentation:

http://msdn.microsoft.com/en-us/library/system.io.streamreader.readline.aspx

A line is defined as a sequence of characters followed by a line feed
("\n"), a carriage return ("\r"), or a carriage return immediately
followed by a line feed ("\r\n").

By default, the StreamReader ReadLine method treats \n, \r, and \r\n as line terminators.
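The documented behavior can be checked with a small in-memory sketch. The class and method names (`ReadLineDemo`, `ReadAllLines`) are made up for illustration, and a `MemoryStream` stands in for the file:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class ReadLineDemo
{
    // Collects everything StreamReader.ReadLine returns for the given text.
    // The MemoryStream is only a stand-in for a real file.
    public static List<string> ReadAllLines(string text)
    {
        var lines = new List<string>();
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(text)))
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                lines.Add(line);
            }
        }
        return lines;
    }
}
```

Calling `ReadLineDemo.ReadAllLines("a\rb\nc\r\nd")` should give four entries ("a", "b", "c", "d"), because the lone \r after "a" already ends a line. That is why ReadLine alone cannot be restricted to \n.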

辞别 2024-11-26 09:40:24

This is an improvement on sovemp's answer. Sorry, I would have liked to post this as a comment, but my reputation doesn't allow it. This improvement addresses two issues:

  1. With the delimiter "\r\n", the example sequence "text\rtest\r\n" would also
    lose its first "\r", which is not intended.
  2. When the last characters in the stream equal the delimiter, the function would
    wrongly return a string that includes the delimiter.

    using System;
    using System.IO;
    using System.Collections.Generic;
    internal static class StreamReaderExtensions
    {
        public static IEnumerable<string> ReadUntil(this StreamReader reader, string delimiter)
        {
            List<char> buffer = new List<char>();
            // Holds the last delimiter.Length characters read.
            CircularBuffer<char> delim_buffer = new CircularBuffer<char>(delimiter.Length);
            while (reader.Peek() >= 0)
            {
                char c = (char)reader.Read();
                delim_buffer.Enqueue(c);
                buffer.Add(c);
                if (delim_buffer.ToString() == delimiter)
                {
                    // Cut the delimiter off the end only; a stray "\r" inside the line is kept.
                    yield return new String(buffer.ToArray(), 0, buffer.Count - delimiter.Length);
                    buffer.Clear();
                    delim_buffer.Clear();
                }
            }

            // Remaining text that was not terminated by the delimiter.
            if (buffer.Count > 0)
            {
                yield return new String(buffer.ToArray());
            }
        }

        private class CircularBuffer<T> : Queue<T>
        {
            private int _capacity;

            public CircularBuffer(int capacity)
                : base(capacity)
            {
                _capacity = capacity;
            }

            // Drops the oldest item once the buffer is full.
            new public void Enqueue(T item)
            {
                if (base.Count == _capacity)
                {
                    base.Dequeue();
                }
                base.Enqueue(item);
            }

            public override string ToString()
            {
                List<String> items = new List<string>();
                foreach (var x in this)
                {
                    items.Add(x.ToString());
                }
                return String.Join("", items);
            }
        }
    }
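The first issue can be reproduced on plain strings, without any file I/O. The sketch below is only an illustration: `DelimiterTrimDemo`, `StripWithReplace`, and `StripSuffix` are hypothetical names, and it assumes a delimiter longer than one character such as "\r\n".

```csharp
using System;

static class DelimiterTrimDemo
{
    // Mirrors the Replace call from the earlier answer: every occurrence of the
    // delimiter's prefix is removed, not just the one at the end of the line.
    // Assumes delimiter.Length > 1; string.Replace rejects an empty search string.
    public static string StripWithReplace(string lineWithPartialDelimiter, string delimiter)
    {
        string prefix = delimiter.Substring(0, delimiter.Length - 1);
        return lineWithPartialDelimiter.Replace(prefix, string.Empty);
    }

    // Mirrors the improved approach: cut exactly delimiter.Length characters
    // off the end and leave the rest of the line untouched.
    public static string StripSuffix(string lineWithDelimiter, string delimiter)
    {
        return lineWithDelimiter.Substring(0, lineWithDelimiter.Length - delimiter.Length);
    }
}
```

With the delimiter "\r\n", `StripWithReplace("text\rtest\r", "\r\n")` yields "texttest", losing the "\r" in the middle of the line, while `StripSuffix("text\rtest\r\n", "\r\n")` yields "text\rtest".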
    
无风消散 2024-11-26 09:40:24

You either have to parse the stream character by character yourself and handle the split, or use the default ReadLine behavior, which splits on \r, \n, or \r\n.

If you want to parse the stream character by character, I'd use something like the following extension method:

public static string ReadToChar(this StreamReader sr, char splitCharacter)
{
    StringBuilder line = new StringBuilder();
    // Peek() returns -1 at the end of the stream, so compare with >= 0.
    while (sr.Peek() >= 0)
    {
        char nextChar = (char)sr.Read();
        if (nextChar == splitCharacter) return line.ToString();
        line.Append(nextChar);
    }

    // Null signals that nothing is left to read.
    return line.Length == 0 ? null : line.ToString();
}
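A possible way to call this method is sketched below. `ReadToCharDemo` and `SplitOn` are hypothetical names, a `MemoryStream` stands in for the file, and the extension method is repeated (with the end-of-stream check written as `>= 0`) so the block compiles on its own:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class ReadToCharDemo
{
    // Reads up to (not including) splitCharacter; returns null when nothing is left.
    public static string ReadToChar(this StreamReader sr, char splitCharacter)
    {
        var line = new StringBuilder();
        while (sr.Peek() >= 0)
        {
            char nextChar = (char)sr.Read();
            if (nextChar == splitCharacter) return line.ToString();
            line.Append(nextChar);
        }
        return line.Length == 0 ? null : line.ToString();
    }

    // Helper for the example: splits in-memory text using ReadToChar.
    public static List<string> SplitOn(string text, char splitCharacter)
    {
        var result = new List<string>();
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(text)))
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            string line;
            while ((line = reader.ReadToChar(splitCharacter)) != null)
            {
                result.Add(line);
            }
        }
        return result;
    }
}
```

`ReadToCharDemo.SplitOn("a\rb\nc\n", '\n')` should return two lines, "a\rb" and "c": the \r stays inside the first line because only \n is treated as the delimiter.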
别再吹冷风 2024-11-26 09:40:24

Even though you said "Using StreamReader", since you also said "In my case, file can have tons of records...", I would recommend trying SSIS (SQL Server Integration Services). It's well suited to what you're trying to do: you can process very large files and specify the line and column delimiters easily.

一萌ing 2024-11-26 09:40:24

This code snippet reads a file character by character and treats only "\n" as the end of a line.

using (StreamReader sr = new StreamReader(path)) 
{
     string line = string.Empty;
     while (sr.Peek() >= 0) 
     {
          char c = (char)sr.Read();
          if (c == '\n')
          {
              //end of line encountered
              Console.WriteLine(line);
              //create new line
              line = string.Empty;
          }
          else
          {
               //append the character that was just read
               line += c;
          }
     }
     //print the last line if the file does not end with '\n'
     if (line.Length > 0)
     {
          Console.WriteLine(line);
     }
}

Because this code reads character by character, it works with a file of any length; only the current line is held in memory.
