How to fill a buffer only up to the end of a line

Posted on 2024-12-26 07:00:19

    int j = (1024 * 1024); // = 1 megabyte
    char[] buffer = new char[j];
    int charsRead = 0;
    while ((charsRead = sr.Read(buffer, 0, buffer.Length)) > 0)
    {
        string john = new string(buffer, 0, charsRead);
        sw.WriteLine(john);
    }

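This is my first experience with using a buffer, and the above code does what I want, except that the end of the buffer does not coincide with the end of a line in the text file being read. This results in what you see below. Keep in mind that because each line in the source file may have a different length, the break does not always occur at the same position in the line: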

john likes to farm cattle
john likes to farm beetles
john likes to farm rabbits
john likes to farm carrots
john likes to farm b      <---1MB buffer ends here
ears                      <---new 1MB buffer begins here
john likes to farm antelope
john likes to farm rabies
john likes to farm lions
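
A likely cause of the stray break is not the buffer size as such, but that `sw.WriteLine` appends a line terminator after every chunk it writes. The sketch below only illustrates that effect under stated assumptions: `ChunkCopyDemo` and `CopyChunks` are hypothetical names, a `StringReader`/`StringWriter` pair stands in for `sr` and `sw`, and a tiny buffer stands in for the 1 MB one.

```csharp
using System;
using System.IO;

static class ChunkCopyDemo
{
    // Hypothetical helper: copies 'source' in fixed-size chunks.
    // appendNewLine = true mimics sw.WriteLine(...); false mimics sw.Write(...).
    public static string CopyChunks(string source, int bufferSize, bool appendNewLine)
    {
        using (var reader = new StringReader(source))
        using (var writer = new StringWriter())
        {
            writer.NewLine = "\n"; // fixed terminator so the result is platform independent
            char[] buffer = new char[bufferSize];
            int charsRead;
            while ((charsRead = reader.Read(buffer, 0, buffer.Length)) > 0)
            {
                if (appendNewLine)
                    writer.WriteLine(buffer, 0, charsRead); // chunk plus an extra terminator
                else
                    writer.Write(buffer, 0, charsRead);     // chunk only
            }
            return writer.ToString();
        }
    }
}
```

With the line `john likes to farm bears` followed by `\n` and a 20-character buffer, the `WriteLine` variant yields `john likes to farm b`, then `ears`, then an empty line, while the `Write` variant returns the source unchanged. If the goal is simply to copy the text, switching to `Write` may be all that is needed.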

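So is there a way to have a buffer of a specified size (1 MB in this example) that is filled only up to the end of the last complete line before the 1 MB limit is reached (so the buffer would usually hold slightly less than 1 MB)? I'm guessing part of that process involves defining what exactly a line is (luckily I know how to do that now), but after that I don't know what I would need to do.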

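The only solution I can think of is to go through the output after the buffer contents have been written to the file, search for incomplete lines, and rejoin them with the lines they were broken from. That seems really inefficient, though.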
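
One way to avoid a separate rejoining pass, sketched here under assumptions, is to cut each chunk at the last `'\n'` in the buffer and carry the trailing partial line over to the front of the buffer before the next read. `LineAlignedCopy`, `Copy`, and the `onChunk` callback are hypothetical names, and lines are assumed to end in `'\n'` (which also covers `"\r\n"`). A single line longer than the buffer is flushed as-is so the loop always makes progress.

```csharp
using System;
using System.IO;

static class LineAlignedCopy
{
    // Hypothetical helper: every chunk passed to onChunk ends on '\n', except
    // a final unterminated line or a single line longer than the buffer.
    public static void Copy(TextReader reader, int bufferSize, Action<string> onChunk)
    {
        if (bufferSize <= 0) throw new ArgumentOutOfRangeException(nameof(bufferSize));

        char[] buffer = new char[bufferSize];
        int carried = 0; // chars of an incomplete line kept from the previous pass
        int charsRead;
        while ((charsRead = reader.Read(buffer, carried, buffer.Length - carried)) > 0)
        {
            int filled = carried + charsRead;
            int lastNewLine = Array.LastIndexOf(buffer, '\n', filled - 1, filled);

            int end; // number of chars to emit in this pass
            if (lastNewLine >= 0)
                end = lastNewLine + 1;   // cut right after the last complete line
            else if (filled == buffer.Length)
                end = filled;            // line longer than the buffer: flush it all
            else
                end = 0;                 // no complete line yet: keep reading

            if (end > 0)
                onChunk(new string(buffer, 0, end));

            // Move the incomplete tail to the front for the next read.
            carried = filled - end;
            if (carried > 0 && end > 0)
                Array.Copy(buffer, end, buffer, 0, carried);
        }

        if (carried > 0)
            onChunk(new string(buffer, 0, carried)); // final line without a terminator
    }
}
```

Assuming `sr` and `sw` are the reader and writer from the question, a call could look like `LineAlignedCopy.Copy(sr, 1024 * 1024, chunk => sw.Write(chunk));`. Note the use of `Write` rather than `WriteLine`, since each chunk already ends with its own line terminator.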

Edit: I forgot to include the format of the source file being read from:

john likes to farm cattle
john likes to farm beetles
john likes to farm rabbits
john likes to farm carrots
john likes to farm bears
john likes to farm antelope
john likes to farm rabies
john likes to farm lions
