How to free memory after Base64 conversion
I am trying to stream the contents of a file.
The code works for smaller files, but with larger files, I get an Out of Memory error.
public void StreamEncode(FileStream inputStream, TextWriter tw)
{
byte[] base64Block = new byte[BLOCK_SIZE];
int bytesRead = 0;
try
{
do
{
// read one block from the input stream
bytesRead = inputStream.Read(base64Block, 0, base64Block.Length);
// encode the base64 string
string base64String = Convert.ToBase64String(base64Block, 0, bytesRead);
// write the string
tw.Write(base64String);
} while (bytesRead == base64Block.Length);
}
catch (OutOfMemoryException)
{
MessageBox.Show("Error -- Memory used: " + GC.GetTotalMemory(false) + " bytes");
}
}
I can isolate the problem and watch the memory used grow as it loops.
The problem seems to be the call to Convert.ToBase64String().
How can I free the memory for the converted string?
Edited from here down ... Here is an update.
I also created a new thread about this -- sorry I guess that was not the right thing to do.
Thanks for your great suggestions. Following them, I shrank the buffer size used to read from the file, and memory consumption looks better, but I'm still seeing an OOM problem, and I'm seeing it with file sizes as small as 5 MB. I potentially want to deal with files ten times larger.
My problem now seems to be with the use of TextWriter.
I create a request as follows [with a few edits to shrink the code]:
HttpWebRequest oRequest = (HttpWebRequest)WebRequest.Create(new Uri(strURL));
oRequest.Method = httpMethod;
oRequest.ContentType = "application/atom+xml";
oRequest.Headers["Authorization"] = getAuthHeader();
oRequest.ContentLength = strHead.Length + strTail.Length + longContentSize;
oRequest.SendChunked = true;
using (TextWriter tw = new StreamWriter(oRequest.GetRequestStream()))
{
tw.Write(strHead);
using (FileStream fileStream = new FileStream(strPath, FileMode.Open,
FileAccess.Read, System.IO.FileShare.ReadWrite))
{
StreamEncode(fileStream, tw);
}
tw.Write(strTail);
}
.....
Which calls into the routine:
public void StreamEncode(FileStream inputStream, TextWriter tw)
{
// For Base64 there are 4 bytes output for every 3 bytes of input
byte[] base64Block = new byte[9000];
int bytesRead = 0;
string base64String = null;
do
{
// read one block from the input stream
bytesRead = inputStream.Read(base64Block, 0, base64Block.Length);
// encode the base64 string
base64String = Convert.ToBase64String(base64Block, 0, bytesRead);
// write the string
tw.Write(base64String);
} while (bytesRead != 0);
}
Should I use something other than TextWriter because of the potentially large content? It seems very convenient for creating the whole payload of the request.
Is this totally the wrong approach? I want to be able to support very large files.
Answers (7)
If you use a BLOCK_SIZE of 32 kB or more, you will be creating strings of 85 kB or more, which are allocated on the large object heap. Short-lived objects should live in the regular heaps, not the large object heap, so that may be the reason for the memory problems. Also, I see two potential problems with the code:

Base64 encoding uses padding at the end of the string, so if you chop a stream into pieces, convert each piece to a base64 string, and then write those strings out, you don't end up with a single valid base64 stream.

Checking whether the number of bytes read by the Read method equals the number of bytes requested is not the proper way to check for the end of the stream. Read may return fewer bytes than requested at any time; the correct way to check for the end of the stream is when the method returns zero.
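Putting both fixes together, a minimal sketch of a corrected loop might look like this (FillBuffer is a hypothetical helper, not from the original code; it fills the buffer completely, so every block except the last is a multiple of 3 bytes and only the final block produces '=' padding):

public void StreamEncode(FileStream inputStream, TextWriter tw)
{
    const int BLOCK_SIZE = 3 * 1024; // a multiple of 3: no padding except on the last block
    byte[] buffer = new byte[BLOCK_SIZE];
    int bytesRead;
    while ((bytesRead = FillBuffer(inputStream, buffer)) > 0)
    {
        // Only the final, short block carries padding, so the
        // concatenated output is one valid base64 stream.
        tw.Write(Convert.ToBase64String(buffer, 0, bytesRead));
    }
}

// Reads until the buffer is full or the stream ends; a single Read may
// legally return fewer bytes than requested, so we loop.
private static int FillBuffer(Stream stream, byte[] buffer)
{
    int total = 0;
    int read;
    while (total < buffer.Length &&
           (read = stream.Read(buffer, total, buffer.Length - total)) > 0)
    {
        total += read;
    }
    return total;
}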
Keep in mind that when converting data to base64, the resulting string will be 33% longer (assuming the input size is a multiple of 3, which is probably a good idea in your case). If BLOCK_SIZE is too large there might not be enough contiguous memory to hold the resulting base-64 string.
Try reducing BLOCK_SIZE, so that each piece of the base-64 is smaller, making it easier to allocate the memory for it.
However, if you're using an in-memory TextWriter like a StringWriter, you may run into the same problem, because it would fail to find a block of memory large enough to hold the internal buffer. If you're writing to something like a file, this should not be a problem, though.
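As a rough worked example of picking a safe size (using the ~85,000-byte large-object-heap threshold mentioned in the first answer; the exact value is an implementation detail):

// For n input bytes (n a multiple of 3), base64 produces 4n/3 characters.
// .NET strings are UTF-16, so the string body is about 2 * 4n/3 = 8n/3 bytes.
// To keep each encoded string under the ~85,000-byte LOH threshold:
//   8n/3 < 85000  =>  n < ~31,875 bytes
// So a BLOCK_SIZE around 30 kB (and a multiple of 3) keeps every
// encoded block on the regular heap.
const int BLOCK_SIZE = 30 * 1024; // 30720, a multiple of 3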
Wild guess... HttpWebRequest.AllowWriteStreamBuffering is true by default, and according to MSDN, "setting AllowWriteStreamBuffering to true might cause performance problems when uploading large datasets because the data buffer could use all available memory". Try setting oRequest.AllowWriteStreamBuffering = false and see what happens.
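Applied to the request setup from the question, the change is a single property (a sketch; the remaining properties from the original setup are unchanged):

HttpWebRequest oRequest = (HttpWebRequest)WebRequest.Create(new Uri(strURL));
oRequest.Method = httpMethod;
oRequest.SendChunked = true;
// Without this, HttpWebRequest buffers the whole request body in memory
// before sending it, which defeats the point of streaming the file.
oRequest.AllowWriteStreamBuffering = false;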
Try pulling your base64String declaration out of the loop. If that still doesn't help, try calling the garbage collector after so many iterations.
GC.Collect();
GC.WaitForPendingFinalizers();
Try reducing the block size, or avoid assigning the result of the Convert call to a variable:
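The snippet that presumably followed this sentence appears to have been lost; the suggestion amounts to writing the encoded block directly, along these lines:

// Write the encoded text straight to the writer instead of keeping a
// reference in a local variable, so the string becomes garbage sooner.
// (The string still exists until collected, so this alone may not fix the OOM.)
tw.Write(Convert.ToBase64String(base64Block, 0, bytesRead));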
The code looks OK from a memory-usage point of view, but I think you are passing in a writer over a memory-based stream (like a MemoryStream), and accumulating the data there causes the OOM exception.

If BLOCK_SIZE is above 86 KB, allocations will happen on the Large Object Heap (LOH); that changes allocation behavior, but should not cause OOM by itself.

Note: your end condition is not correct; it should be bytesRead != 0. In general, Read can return fewer bytes than requested even if there is more data left, although to my knowledge FileStream never does this.
I would write the result to a temp file first.
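A sketch of that idea (hypothetical code reusing names from the question: encode to a temp file first, then copy the finished file into the request stream in small chunks, so the exact content length is known up front):

// Encode the source file to a temporary file on disk.
string tempPath = Path.GetTempFileName();
using (FileStream input = File.OpenRead(strPath))
using (TextWriter encoded = new StreamWriter(tempPath))
{
    StreamEncode(input, encoded); // reuse the block-wise encoder
}

// The encoded length is now known exactly (base64 output is pure ASCII,
// so bytes on disk equal characters written).
oRequest.ContentLength = strHead.Length + strTail.Length + new FileInfo(tempPath).Length;
using (TextWriter tw = new StreamWriter(oRequest.GetRequestStream()))
{
    tw.Write(strHead);
    // Copy the encoded text in small chunks; nothing large is held in memory.
    using (StreamReader reader = new StreamReader(tempPath))
    {
        char[] chunk = new char[8192];
        int n;
        while ((n = reader.Read(chunk, 0, chunk.Length)) > 0)
            tw.Write(chunk, 0, n);
    }
    tw.Write(strTail);
}
File.Delete(tempPath);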