Creating a random file in C#
I am creating a file of a specified size - I don't care what data is in it, although random would be nice. Currently I am doing this:
var sizeInMB = 3; // Up to many Gb
using (FileStream stream = new FileStream(fileName, FileMode.Create))
{
    using (BinaryWriter writer = new BinaryWriter(stream))
    {
        while (writer.BaseStream.Length <= sizeInMB * 1000000)
        {
            writer.Write("a"); //This could be random. Also, larger strings improve performance obviously
        }
        writer.Close();
    }
}
This isn't efficient or even the right way to go about it. Any higher performance solutions?
Thanks for all the answers.
Edit
Ran some tests on the following methods for a 2Gb File (time in ms):
Method 1: Jon Skeet
byte[] data = new byte[sizeInMB * 1024 * 1024];
Random rng = new Random();
rng.NextBytes(data);
File.WriteAllBytes(fileName, data);
N/A - Out of Memory Exception for 2Gb File
Method 2: Jon Skeet
byte[] data = new byte[8192];
Random rng = new Random();
using (FileStream stream = File.OpenWrite(fileName))
{
    for (int i = 0; i < sizeInMB * 128; i++)
    {
        rng.NextBytes(data);
        stream.Write(data, 0, data.Length);
    }
}
@1K - 45,868, 23,283, 23,346
@128K - 24,877, 20,585, 20,716
@8Kb - 30,426, 22,936, 22,936
Method 3 - Hans Passant (Super Fast but data isn't random)
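using (var fs = new FileStream(fileName, FileMode.Create, FileAccess.Write, FileShare.None))
{
    // 1024L forces 64-bit arithmetic; with an int sizeInMB, sizeInMB * 1024 * 1024 overflows at 2048 MB.
    fs.SetLength(sizeInMB * 1024L * 1024L);
}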
257, 287, 3, 3, 2, 3 etc.
Comments (5)
Well, a very simple solution:
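The snippet that followed this sentence did not survive on this page. Going by Method 1 in the question's edit, which is attributed to this answer, it was probably along these lines. The wrapper class and method names are hypothetical, added only to make the sketch self-contained.

```csharp
using System;
using System.IO;

// Hypothetical wrapper; only the four statements in the method body come from the question's Method 1.
public static class SimpleRandomFile
{
    // Fills one in-memory buffer with pseudo-random bytes and writes it in a single call.
    // The whole file has to fit in one byte[], so this only suits files well under 2 GB.
    public static void Write(string fileName, int sizeInMB)
    {
        byte[] data = new byte[sizeInMB * 1024 * 1024];
        Random rng = new Random();
        rng.NextBytes(data);
        File.WriteAllBytes(fileName, data);
    }
}
```

As the question's test notes, this approach ran out of memory for a 2 GB file.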
A slightly more memory efficient version :)
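The code for this version is also missing here. A minimal sketch in the spirit of Method 2 from the question (a small buffer refilled and written in a loop) might look like the following. The class name, the `sizeInBytes` parameter, and the default `bufferSize` are assumptions for illustration. Unlike the question's `File.OpenWrite`, this sketch uses `FileMode.Create` so an existing, longer file is truncated first.

```csharp
using System;
using System.IO;

// Hypothetical helper illustrating the chunked approach.
public static class ChunkedRandomFile
{
    public static void Write(string fileName, long sizeInBytes, int bufferSize = 8192)
    {
        byte[] buffer = new byte[bufferSize];
        Random rng = new Random();

        using (FileStream stream = new FileStream(fileName, FileMode.Create, FileAccess.Write))
        {
            long remaining = sizeInBytes;
            while (remaining > 0)
            {
                // The last chunk may be shorter than the buffer.
                int count = (int)Math.Min(buffer.Length, remaining);
                rng.NextBytes(buffer);
                stream.Write(buffer, 0, count);
                remaining -= count;
            }
        }
    }
}
```

Memory use stays at one buffer regardless of file size, and using a `long` byte count avoids the `int` overflow that `sizeInMB * 1024 * 1024` hits at 2048 MB.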
However, if you do this several times in very quick succession, creating a new instance of Random each time, you may get duplicate data. See my article on randomness for more information. You could avoid this by using System.Security.Cryptography.RandomNumberGenerator, or by reusing the same instance of Random multiple times, with the caveat that it's not thread-safe.
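As an illustration of the `RandomNumberGenerator` suggestion in Jon Skeet's answer, here is a hedged sketch of the same chunked loop using the cryptographic generator instead of `Random`. The class and method names are hypothetical and are not part of the original answer.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper; swaps System.Random for RandomNumberGenerator.
public static class CryptoRandomFile
{
    public static void Write(string fileName, long sizeInBytes)
    {
        byte[] buffer = new byte[8192];

        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        using (FileStream stream = new FileStream(fileName, FileMode.Create, FileAccess.Write))
        {
            long remaining = sizeInBytes;
            while (remaining > 0)
            {
                int count = (int)Math.Min(buffer.Length, remaining);
                rng.GetBytes(buffer); // does not depend on a time-based seed
                stream.Write(buffer, 0, count);
                remaining -= count;
            }
        }
    }
}
```

This avoids the duplicate-data problem of creating several `Random` instances in quick succession, typically at the cost of slower byte generation.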
There's no faster way than taking advantage of the sparse file support built into NTFS, the file system for Windows used on hard disks. This code creates a one gigabyte file in a fraction of a second:
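The answer's snippet is missing on this page. Based on Method 3 in the question, which is attributed to this answer, it presumably set the stream length without writing any data. A minimal sketch, with a hypothetical wrapper class and a `long` size parameter added for illustration:

```csharp
using System.IO;

// Hypothetical wrapper around the FileStream.SetLength call shown in the question's Method 3.
public static class PreallocatedFile
{
    public static void Create(string fileName, long sizeInBytes)
    {
        using (var fs = new FileStream(fileName, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            // Extends the file to the requested length without writing any bytes.
            fs.SetLength(sizeInBytes);
        }
    }
}
```

For a one gigabyte file the call would be `PreallocatedFile.Create(fileName, 1024L * 1024 * 1024)`.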
When read, the file contains only zeros.
You can use the following class, which I created, to generate random strings:
To use it:
The efficient way to create a large file:
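The original code for this answer is not preserved here. From the description (an empty file apart from "test" at the end), it most likely seeked past the start of a new file and wrote a short marker. The sketch below is a reconstruction under that assumption; the class name and parameters are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical reconstruction: seek close to the target size, then write a few bytes.
public static class SeekAndWriteFile
{
    public static void Create(string fileName, long sizeInBytes)
    {
        byte[] marker = Encoding.UTF8.GetBytes("test");
        if (sizeInBytes < marker.Length)
            throw new ArgumentOutOfRangeException(nameof(sizeInBytes));

        using (var fs = new FileStream(fileName, FileMode.Create, FileAccess.Write))
        {
            // Seeking beyond the current end is allowed; the write then extends the file.
            fs.Seek(sizeInBytes - marker.Length, SeekOrigin.Begin);
            fs.Write(marker, 0, marker.Length);
        }
    }
}
```

Everything before the marker reads back as zeros, so only the last four bytes carry data.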
However, this file will be empty (except for the "test" at the end). It's not clear exactly what you are trying to do: a large file with data, or just a large file. You can also modify this to write some data sparsely in the file without filling it up completely.
If you do want the entire file filled with random data, then the only way I can think of is to use the random bytes from Jon's answer above.
An improvement would be to fill a buffer of the desired size with the data and flush it all at once.