MemoryStream and the Large Object Heap
I have to transfer large files between computers over unreliable connections using WCF.
Because I want to be able to resume the transfer and I don't want my file size to be limited by WCF, I am splitting the files into 1 MB chunks. These chunks are transported as streams, which works quite nicely so far.
My steps are:
1. Open a FileStream.
2. Read a chunk from the file into a byte[] and create a MemoryStream from it.
3. Transfer the chunk.
4. Go back to step 2 until the whole file is sent.
My problem is in step 2. I assume that when I create a memory stream from a byte array, it will end up on the LOH and ultimately cause an OutOfMemoryException. I could not actually reproduce this error, so maybe my assumption is wrong.
Now, I don't want to send the byte[] in the message, as WCF will tell me the array size is too big. I can change the maximum allowed array size and/or the size of my chunks, but I hope there is another solution.
My actual question(s):
- Will my current solution create objects on the LOH, and will that cause me problems?
- Is there a better way to solve this?
By the way: on the receiving side I simply read smaller chunks from the arriving stream and write them directly into the file, so no large byte arrays are involved.
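The receiving side described above could look roughly like the following. This is only a minimal sketch: the class name `ChunkReceiver`, the method `WriteChunk`, its parameters, and the 16 KB buffer size are all assumptions made for illustration and are not taken from the question.

```csharp
using System;
using System.IO;

public static class ChunkReceiver
{
    // Hypothetical helper: copies an incoming chunk stream into the target file,
    // starting at the given offset, through a small reusable buffer.
    public static long WriteChunk(Stream incoming, string targetPath, long offset)
    {
        // 16 KB (an assumed size) is far below the ~85,000-byte LOH threshold.
        byte[] buffer = new byte[16 * 1024];
        long written = 0;

        using (FileStream file = new FileStream(targetPath, FileMode.OpenOrCreate, FileAccess.Write))
        {
            file.Position = offset;
            int read;
            // Read returns 0 once the incoming stream is exhausted.
            while ((read = incoming.Read(buffer, 0, buffer.Length)) > 0)
            {
                file.Write(buffer, 0, read);
                written += read;
            }
        }
        return written;
    }
}
```

Because the buffer is small and reused for the whole chunk, this side of the transfer never needs a chunk-sized array.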
Edit:
Current solution:
for (int i = resumeChunk; i < chunks; i++)
{
    // One new chunk-sized array per iteration (1 MB, so above the LOH threshold).
    byte[] buffer = new byte[chunkSize];
    fileStream.Position = i * chunkSize;
    int actualLength = fileStream.Read(buffer, 0, (int)chunkSize);
    // Shrink the buffer when the last chunk is shorter than chunkSize.
    Array.Resize(ref buffer, actualLength);
    using (MemoryStream stream = new MemoryStream(buffer))
    {
        UploadFile(stream);
    }
}
Answers (4)
I hope this is okay. It's my first answer on StackOverflow.
Yes, absolutely: if your chunk size is over 85,000 bytes, the array will be allocated on the large object heap. You will probably not run out of memory very quickly, because you are allocating and deallocating contiguous areas of memory that are all the same size, so when memory fills up the runtime can fit a new chunk into an old, reclaimed memory area.
I would be a little worried about the Array.Resize call, as that will create another array (see http://msdn.microsoft.com/en-us/library/1ffy6686(VS.80).aspx). This is an unnecessary step if actualLength == chunkSize, as it will be for all but the last chunk. So as a minimum I would suggest resizing only when the two lengths differ:
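The answer's own snippet did not survive on this page, so the following is only a sketch of what a guarded resize might look like. The class `ChunkReader`, the method `ReadChunk`, and its parameters are hypothetical names introduced for illustration; only the variable names `buffer`, `actualLength`, and `chunkSize` come from the question.

```csharp
using System;
using System.IO;

public static class ChunkReader
{
    // Hypothetical helper: reads one chunk into a fresh buffer and calls
    // Array.Resize only when the read came up short (normally the last chunk).
    public static byte[] ReadChunk(Stream source, long chunkIndex, int chunkSize)
    {
        byte[] buffer = new byte[chunkSize];
        source.Position = chunkIndex * chunkSize; // long arithmetic, safe for files over 2 GB
        int actualLength = source.Read(buffer, 0, chunkSize);

        if (actualLength != chunkSize)
        {
            Array.Resize(ref buffer, actualLength);
        }
        return buffer;
    }
}
```

The returned array can then be wrapped in a MemoryStream exactly as in the question's loop.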
This should remove a lot of allocations. If actualLength is not the same as chunkSize but is still > 85,000, the new array will also be allocated on the large object heap, potentially causing it to fragment and possibly causing apparent memory leaks. I believe it would still take a long time to actually run out of memory, as the leak would be quite slow.
I think a better implementation would be to use some kind of buffer pool to provide the arrays. You could roll your own (though that would be complicated), but WCF does provide one for you. I have rewritten your code slightly to take advantage of that:
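The rewritten code is missing from this page. As a rough sketch of the idea, and assuming the pool meant here is WCF's `BufferManager` from `System.ServiceModel.Channels`, the loop could be restructured as below. The class `PooledChunkSender`, the method `SendChunks`, the `uploadFile` delegate (a stand-in for an `UploadFile` overload that also takes a byte count), and the pool size of ten chunks are all assumptions, not the answer's actual code.

```csharp
using System;
using System.IO;
using System.ServiceModel.Channels; // BufferManager ships with WCF

public static class PooledChunkSender
{
    // uploadFile is a hypothetical stand-in for UploadFile(stream, byteCount).
    public static void SendChunks(Stream fileStream, int resumeChunk, int chunks,
                                  int chunkSize, Action<Stream, int> uploadFile)
    {
        // Pool capped at ten chunk-sized buffers; the factor 10 is an arbitrary assumption.
        BufferManager pool = BufferManager.CreateBufferManager(chunkSize * 10L, chunkSize);

        for (int i = resumeChunk; i < chunks; i++)
        {
            // TakeBuffer may hand back an array larger than requested,
            // so chunkSize (not buffer.Length) bounds the read.
            byte[] buffer = pool.TakeBuffer(chunkSize);
            try
            {
                fileStream.Position = (long)i * chunkSize;
                int actualLength = fileStream.Read(buffer, 0, chunkSize);
                if (actualLength == 0) break;

                // Expose only the filled part of the pooled array; no Array.Resize needed.
                using (MemoryStream stream = new MemoryStream(buffer, 0, actualLength, false))
                {
                    uploadFile(stream, actualLength);
                }
            }
            finally
            {
                // Always hand the array back so the next iteration can reuse it.
                pool.ReturnBuffer(buffer);
            }
        }
    }
}
```

In this sketch the MemoryStream is already limited to `actualLength` bytes, so the extra byte count is redundant; it is passed along only to mirror the signature change the answer describes.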
This assumes that the implementation of UploadFile can be rewritten to take an int for the number of bytes to write.
I hope this helps
joe
See also RecyclableMemoryStream.
From this article:
Microsoft.IO.RecyclableMemoryStream is a MemoryStream replacement that offers superior behavior for performance-critical systems. In particular it is optimized to do the following:
I'm not so sure about the first part of your question, but as for a better way: have you considered BITS? It allows files to be downloaded in the background over HTTP. You can provide it an http:// or file:// URI. It can resume from the point at which it was interrupted, and it downloads in chunks of bytes using the HTTP Range header. It is used by Windows Update. You can subscribe to events that give information on progress and completion.
I have come up with another solution for this, let me know what you think!
Since I don't want to hold large amounts of data in memory, I was looking for an elegant way to temporarily store byte arrays or a stream.
The idea is to create a temp file (you don't need specific rights to do this) and then use it much like a memory stream. Making the class disposable will clean up the temp file after it has been used.
...
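The answer's class is elided above, so the following is not the author's code. It is a minimal sketch of the same idea under a different mechanism: instead of deleting the file in a custom Dispose, it relies on `FileOptions.DeleteOnClose`. The class name `TempFileStream` and the 4096-byte buffer size are assumptions.

```csharp
using System;
using System.IO;

// Hypothetical class: a read/write stream backed by a temp file
// that is removed automatically when the stream is closed or disposed.
public sealed class TempFileStream : FileStream
{
    public TempFileStream()
        : base(Path.GetTempFileName(),   // creates an empty file in the temp folder
               FileMode.Open,
               FileAccess.ReadWrite,
               FileShare.None,
               4096,                     // assumed internal buffer size
               FileOptions.DeleteOnClose)
    {
    }
}
```

It can be used inside a `using` block wherever a seekable Stream is expected, for example by writing a chunk to it, resetting `Position` to 0, and passing it to the upload call; the data then lives on disk rather than in a large byte array.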