How to free memory after Base64 conversion
I am trying to stream the contents of a file.
The code works for smaller files, but with larger files, I get an Out of Memory error.
public void StreamEncode(FileStream inputStream, TextWriter tw)
{
byte[] base64Block = new byte[BLOCK_SIZE];
int bytesRead = 0;
try
{
do
{
// read one block from the input stream
bytesRead = inputStream.Read(base64Block, 0, base64Block.Length);
// encode the base64 string
string base64String = Convert.ToBase64String(base64Block, 0, bytesRead);
// write the string
tw.Write(base64String);
} while (bytesRead == base64Block.Length);
}
catch (OutOfMemoryException)
{
MessageBox.Show("Error -- Memory used: " + GC.GetTotalMemory(false) + " bytes");
}
}
I can isolate the problem and watch the memory used grow as it loops.
The problem seems to be the call to Convert.ToBase64String().
How can I free the memory for the converted string?
Edited from here down ... Here is an update.
I also created a new thread about this -- sorry I guess that was not the right thing to do.
Thanks for your great suggestions. Following them, I shrank the buffer size used to read from the file, and memory consumption looks better, but I'm still seeing an OOM problem, and I'm seeing it with file sizes as small as 5 MB. I potentially want to deal with files ten times larger.
My problem now seems to be with the use of TextWriter.
I create a request as follows [with a few edits to shrink the code]:
HttpWebRequest oRequest = (HttpWebRequest)WebRequest.Create(new Uri(strURL));
oRequest.Method = httpMethod;
oRequest.ContentType = "application/atom+xml";
oRequest.Headers["Authorization"] = getAuthHeader();
oRequest.ContentLength = strHead.Length + strTail.Length + longContentSize;
oRequest.SendChunked = true;
using (TextWriter tw = new StreamWriter(oRequest.GetRequestStream()))
{
tw.Write(strHead);
using (FileStream fileStream = new FileStream(strPath, FileMode.Open,
FileAccess.Read, System.IO.FileShare.ReadWrite))
{
StreamEncode(fileStream, tw);
}
tw.Write(strTail);
}
.....
Which calls into the routine:
public void StreamEncode(FileStream inputStream, TextWriter tw)
{
// For Base64 there are 4 bytes output for every 3 bytes of input
byte[] base64Block = new byte[9000];
int bytesRead = 0;
string base64String = null;
do
{
// read one block from the input stream
bytesRead = inputStream.Read(base64Block, 0, base64Block.Length);
// encode the base64 string
base64String = Convert.ToBase64String(base64Block, 0, bytesRead);
// write the string
tw.Write(base64String);
} while (bytesRead != 0);
}
Should I use something other than TextWriter because of the potentially large content? It seems very convenient for creating the whole payload of the request.
Is this totally the wrong approach? I want to be able to support very large files.
Answers (7)
If you use a BLOCK_SIZE of 32 kB or more, you will be creating strings of 85 kB or more, which are allocated on the large object heap. Short-lived objects should live in the regular heaps, not the large object heap, so that may be the reason for the memory problems. Also, I see two potential problems with the code:

Base64 encoding uses padding at the end of the string, so if you chop a stream into pieces, convert each piece to a base64 string, and then write those strings out, you don't end up with a single valid base64 stream.

Checking whether the number of bytes read by the Read method equals the number of bytes requested is not the proper way to check for the end of the stream. Read may return fewer bytes than requested at any time; the correct way to check for the end of the stream is when the method returns zero.
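Putting both fixes together, a minimal sketch of a corrected loop might look like this (FillBuffer is a hypothetical helper, not from the original code; it fills the buffer completely, so every block except the last is a multiple of 3 bytes and only the final block produces '=' padding):

public void StreamEncode(FileStream inputStream, TextWriter tw)
{
    const int BLOCK_SIZE = 3 * 1024; // a multiple of 3: no padding except on the last block
    byte[] buffer = new byte[BLOCK_SIZE];
    int bytesRead;
    while ((bytesRead = FillBuffer(inputStream, buffer)) > 0)
    {
        // Only the final, short block carries padding, so the
        // concatenated output is one valid base64 stream.
        tw.Write(Convert.ToBase64String(buffer, 0, bytesRead));
    }
}

// Reads until the buffer is full or the stream ends; a single Read may
// legally return fewer bytes than requested, so we loop.
private static int FillBuffer(Stream stream, byte[] buffer)
{
    int total = 0;
    int read;
    while (total < buffer.Length &&
           (read = stream.Read(buffer, total, buffer.Length - total)) > 0)
    {
        total += read;
    }
    return total;
}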
Keep in mind that when converting data to base64, the resulting string will be 33% longer (assuming the input size is a multiple of 3, which is probably a good idea in your case). If BLOCK_SIZE is too large there might not be enough contiguous memory to hold the resulting base-64 string.
Try reducing BLOCK_SIZE, so that each piece of the base-64 is smaller, making it easier to allocate the memory for it.
However, if you're using an in-memory TextWriter like a StringWriter, you may run into the same problem, because it would fail to find a block of memory large enough to hold the internal buffer. If you're writing to something like a file, this should not be a problem, though.
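As a rough worked example of picking a safe size (using the ~85,000-byte large-object-heap threshold mentioned in the first answer; the exact value is an implementation detail):

// For n input bytes (n a multiple of 3), base64 produces 4n/3 characters.
// .NET strings are UTF-16, so the string body is about 2 * 4n/3 = 8n/3 bytes.
// To keep each encoded string under the ~85,000-byte LOH threshold:
//   8n/3 < 85000  =>  n < ~31,875 bytes
// So a BLOCK_SIZE around 30 kB (and a multiple of 3) keeps every
// encoded block on the regular heap.
const int BLOCK_SIZE = 30 * 1024; // 30720, a multiple of 3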
Wild guess... HttpWebRequest.AllowWriteStreamBuffering is true by default, and according to MSDN, "setting AllowWriteStreamBuffering to true might cause performance problems when uploading large datasets because the data buffer could use all available memory". Try setting oRequest.AllowWriteStreamBuffering = false and see what happens.
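Applied to the request setup from the question, the change is a single property (a sketch; the remaining properties from the original setup are unchanged):

HttpWebRequest oRequest = (HttpWebRequest)WebRequest.Create(new Uri(strURL));
oRequest.Method = httpMethod;
oRequest.SendChunked = true;
// Without this, HttpWebRequest buffers the whole request body in memory
// before sending it, which defeats the point of streaming the file.
oRequest.AllowWriteStreamBuffering = false;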
Try pulling your base64String declaration out of the loop. If that still doesn't help, try calling the garbage collector after so many iterations.
GC.Collect();
GC.WaitForPendingFinalizers();
Try reducing the block size, or avoid assigning the result of the Convert call to a variable:
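The snippet that presumably followed this sentence appears to have been lost; the suggestion amounts to writing the encoded block directly, along these lines:

// Write the encoded text straight to the writer instead of keeping a
// reference in a local variable, so the string becomes garbage sooner.
// (The string still exists until collected, so this alone may not fix the OOM.)
tw.Write(Convert.ToBase64String(base64Block, 0, bytesRead));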
The code looks OK from a memory-usage point of view, but I think you are passing in a writer over a memory-based stream (like a MemoryStream), and accumulating the data there causes the OOM exception.

If BLOCK_SIZE is above 86 KB, allocations will happen on the Large Object Heap (LOH); that changes allocation behavior, but should not cause OOM by itself.

Note: your end condition is not correct; it should be bytesRead != 0. In general, Read can return fewer bytes than requested even if there is more data left, although to my knowledge FileStream never does this.
I would write the result to a temp file first.
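A sketch of that idea (hypothetical code reusing names from the question: encode to a temp file first, then copy the finished file into the request stream in small chunks, so the exact content length is known up front):

// Encode the source file to a temporary file on disk.
string tempPath = Path.GetTempFileName();
using (FileStream input = File.OpenRead(strPath))
using (TextWriter encoded = new StreamWriter(tempPath))
{
    StreamEncode(input, encoded); // reuse the block-wise encoder
}

// The encoded length is now known exactly (base64 output is pure ASCII,
// so bytes on disk equal characters written).
oRequest.ContentLength = strHead.Length + strTail.Length + new FileInfo(tempPath).Length;
using (TextWriter tw = new StreamWriter(oRequest.GetRequestStream()))
{
    tw.Write(strHead);
    // Copy the encoded text in small chunks; nothing large is held in memory.
    using (StreamReader reader = new StreamReader(tempPath))
    {
        char[] chunk = new char[8192];
        int n;
        while ((n = reader.Read(chunk, 0, chunk.Length)) > 0)
            tw.Write(chunk, 0, n);
    }
    tw.Write(strTail);
}
File.Delete(tempPath);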