Reading multiple files in a Stream

Posted 2025-01-03 19:37:03

Hi!

How can I read multiple text files at once?
What I want to do is read a series of files and append all of them to one big file. Currently I am doing this:

  1. take each file and open it with a StreamReader
  2. read the StreamReader completely into a StringBuilder and append it to the current StringBuilder
  3. check whether the memory limit is exceeded and, if so, write the StringBuilder to the end of the output file and empty the StringBuilder

Unfortunately, I observed that the average reading speed is only 4 MB/sec. When I move files around the disk I get a speed of 40 MB/sec.
I am thinking of buffering the files in a Stream and reading them all at once, as I do with the writing. Any idea how I can achieve this?

Update:

foreach (string file in System.IO.Directory.GetFiles(InputPath))
{
    using (StreamReader sr = new StreamReader(file))
    {
        try
        {
            txt = txt + (file + "|" + sr.ReadToEnd());
        }
        catch // out of memory exception
        {
            WriteString(outputPath + "\\" + textBox3.Text, ref txt);
            //sb = new StringBuilder(file + "|" + sr.ReadToEnd());
            txt = file + "|" + sr.ReadToEnd();
        }
    }

    Application.DoEvents();
}

This is how I'm doing it now.

Comments (3)

转角预定愛 2025-01-10 19:37:03

For one thing, you need to differentiate between streams (binary data) and StreamReaders or more generally TextReaders (text data).

It sounds like you want to create a subclass of TextReader which will accept (in its constructor) a bunch of TextReader parameters. You don't need to eagerly read anything here... but in the Read methods that you override, you should read from "the current" reader until that's exhausted, then start on the next one. Bear in mind that Read doesn't have to fill the buffer it's been given - so you could do something like:

while (true)
{
    int charsRead = currentReader.Read(buffer, index, size);
    if (charsRead != 0)
    {
        return charsRead;
    }
    // Adjust this based on how you store the readers...
    if (readerQueue.Count == 0)
    {
        return 0;
    }
    currentReader = readerQueue.Dequeue();
}

I strongly suspect there are already third party libraries to do this sort of demuxing, mind you...
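
To make the loop above concrete, here is a minimal sketch of such a subclass. The class name ConcatenatedTextReader and the private Advance helper are made up for illustration; only the TextReader members being overridden come from the framework. It assumes each inner reader can be disposed as soon as it is exhausted.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical class: presents several TextReaders as one continuous reader.
public sealed class ConcatenatedTextReader : TextReader
{
    private readonly Queue<TextReader> readerQueue;
    private TextReader currentReader;

    public ConcatenatedTextReader(IEnumerable<TextReader> readers)
    {
        readerQueue = new Queue<TextReader>(readers);
        currentReader = readerQueue.Count > 0 ? readerQueue.Dequeue() : null;
    }

    // Block read: returns whatever the current reader provides, and moves on
    // to the next reader only when the current one reports 0 characters.
    public override int Read(char[] buffer, int index, int count)
    {
        if (count == 0)
        {
            return 0;
        }
        while (currentReader != null)
        {
            int charsRead = currentReader.Read(buffer, index, count);
            if (charsRead != 0)
            {
                return charsRead;
            }
            Advance();
        }
        return 0;
    }

    // Single-character read with the same "skip exhausted readers" rule.
    public override int Read()
    {
        while (currentReader != null)
        {
            int c = currentReader.Read();
            if (c != -1)
            {
                return c;
            }
            Advance();
        }
        return -1;
    }

    // Peek only looks at the current reader, so right at a boundary between
    // two readers it can return -1 even though a later reader has data.
    public override int Peek()
    {
        return currentReader != null ? currentReader.Peek() : -1;
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing)
        {
            if (currentReader != null)
            {
                currentReader.Dispose();
                currentReader = null;
            }
            foreach (TextReader reader in readerQueue)
            {
                reader.Dispose();
            }
            readerQueue.Clear();
        }
        base.Dispose(disposing);
    }

    // Hypothetical helper: dispose the exhausted reader and take the next one.
    private void Advance()
    {
        currentReader.Dispose();
        currentReader = readerQueue.Count > 0 ? readerQueue.Dequeue() : null;
    }
}
```

With this in place, something like new ConcatenatedTextReader(files.Select(f => new StreamReader(f))) could be handed to any code that expects a single TextReader. Note that passing already-constructed StreamReaders opens the files before they are needed; a variant that takes file paths and opens each file lazily may suit large directories better.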

折戟 2025-01-10 19:37:03

If all you're doing is reading files and then concatenating them together to a new file on disk, you might not need to write code at all. Use the Windows copy command:

C:\> copy a.txt+b.txt+c.txt+d.txt output.txt

You can call this via Process.Start if you want.

This, of course, assumes that you're not doing any custom logic on the files or their content.
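
As a rough sketch of the Process.Start route: copy is a built-in of cmd.exe rather than a standalone executable, so the example below launches cmd.exe /c. The CopyConcat class and its method names are made up for illustration. The /b switch asks for binary mode and /y suppresses the overwrite prompt, which would otherwise block a hidden process. It assumes Windows and paths that contain no double quotes.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Hypothetical helper around the Windows "copy a+b+c output" command.
public static class CopyConcat
{
    // Builds the cmd.exe argument string, for example:
    //   /c copy /y /b "a.txt"+"b.txt" "output.txt"
    public static string BuildArguments(IEnumerable<string> inputs, string output)
    {
        string sources = string.Join("+", inputs.Select(p => "\"" + p + "\""));
        return "/c copy /y /b " + sources + " \"" + output + "\"";
    }

    // Runs the command and returns the exit code of cmd.exe (Windows only).
    public static int Run(IEnumerable<string> inputs, string output)
    {
        var startInfo = new ProcessStartInfo("cmd.exe", BuildArguments(inputs, output))
        {
            UseShellExecute = false,
            CreateNoWindow = true
        };
        using (Process process = Process.Start(startInfo))
        {
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

Because the whole file list travels on one command line, and command-line length is limited, a directory with many files would probably need to be processed in batches.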

囚我心虐我身 2025-01-10 19:37:03

This should be fast (but it loads each file entirely into memory, so it might not fit every need):

string[] files = { @"c:\a.txt", @"c:\b.txt", @"c:\c.txt" };

FileStream outputFile = new FileStream(@"C:\d.txt", FileMode.Create);

using (BinaryWriter ws = new BinaryWriter(outputFile))
{
    foreach (string file in files)
    {
        ws.Write(System.IO.File.ReadAllBytes(file));
    }
}
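
If the input files are too large to hold in memory, a streaming variant is one option. The sketch below is an assumption-laden alternative, not the code above: the FileConcat class name and the 1 MB default buffer size are invented for illustration, and it relies on Stream.CopyTo, which copies in buffer-sized chunks.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: appends the raw bytes of each input file to one output file.
public static class FileConcat
{
    public static void Concatenate(IEnumerable<string> inputs, string outputPath,
                                   int bufferSize = 1 << 20)
    {
        using (var output = new FileStream(outputPath, FileMode.Create,
                                           FileAccess.Write, FileShare.None, bufferSize))
        {
            foreach (string path in inputs)
            {
                // SequentialScan hints to the OS that the file is read front to back.
                using (var input = new FileStream(path, FileMode.Open, FileAccess.Read,
                                                  FileShare.Read, bufferSize,
                                                  FileOptions.SequentialScan))
                {
                    // Copies chunk by chunk, so memory use stays near bufferSize.
                    input.CopyTo(output, bufferSize);
                }
            }
        }
    }
}
```

This copies bytes verbatim. If each file should be preceded by its name and a "|" separator, as in the question's code, that prefix would have to be encoded to bytes and written to the output before each CopyTo call.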