Creating a random file in C#

Posted 2024-10-08 01:50:22

I am creating a file of a specified size - I don't care what data is in it, although random would be nice. Currently I am doing this:

        var sizeInMB = 3; // Up to many Gb
        using (FileStream stream = new FileStream(fileName, FileMode.Create))
        {
            using (BinaryWriter writer = new BinaryWriter(stream))
            {
                while (writer.BaseStream.Length <= sizeInMB * 1000000)
                {
                    writer.Write("a"); //This could be random. Also, larger strings improve performance obviously
                }
                writer.Close();
            }
        }

This isn't efficient or even the right way to go about it. Any higher performance solutions?

Thanks for all the answers.

Edit

I ran some tests on the following methods for a 2 GB file (times in ms):

Method 1: Jon Skeet

byte[] data = new byte[sizeInMB * 1024 * 1024];
Random rng = new Random();
rng.NextBytes(data);
File.WriteAllBytes(fileName, data);

N/A - OutOfMemoryException for a 2 GB file

Method 2: Jon Skeet

byte[] data = new byte[8192];
Random rng = new Random();
using (FileStream stream = File.OpenWrite(fileName))
{
    for (int i = 0; i < sizeInMB * 128; i++)
    {
         rng.NextBytes(data);
         stream.Write(data, 0, data.Length);
    }
}

@1K - 45,868, 23,283, 23,346

@128K - 24,877, 20,585, 20,716

@8K - 30,426, 22,936, 22,936

Method 3: Hans Passant (super fast, but the data isn't random)

using (var fs = new FileStream(fileName, FileMode.Create, FileAccess.Write, FileShare.None))
{
    fs.SetLength(sizeInMB * 1024L * 1024L); // 1024L keeps the product in 64-bit arithmetic
}

257, 287, 3, 3, 2, 3 etc.
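
For anyone who wants to repeat this kind of comparison, here is a minimal sketch of a timing helper built on Stopwatch. The class name FileCreationTimer and its Measure method are made up for illustration and are not the code that produced the numbers above. It records every run separately, since the first run in each series above is noticeably slower than the later ones.

```csharp
using System;
using System.Diagnostics;
using System.IO;

// Hypothetical helper, not part of the original tests.
public static class FileCreationTimer
{
    // Runs the given file-creating action `runs` times and returns
    // the elapsed milliseconds of each individual run.
    public static long[] Measure(Action<string> createFile, string fileName, int runs)
    {
        var timings = new long[runs];
        for (int i = 0; i < runs; i++)
        {
            if (File.Exists(fileName))
            {
                File.Delete(fileName); // start every run without a leftover file
            }

            Stopwatch stopwatch = Stopwatch.StartNew();
            createFile(fileName);
            stopwatch.Stop();

            timings[i] = stopwatch.ElapsedMilliseconds;
        }
        return timings;
    }
}
```

A call such as FileCreationTimer.Measure(name => File.WriteAllBytes(name, new byte[1024]), fileName, 3) would return three timings, one per run.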


Comments (5)

恋你朝朝暮暮 2024-10-15 01:50:22

Well, a very simple solution:

byte[] data = new byte[sizeInMb * 1024 * 1024];
Random rng = new Random();
rng.NextBytes(data);
File.WriteAllBytes(fileName, data);

A slightly more memory efficient version :)

// Note: block size must be a factor of 1MB to avoid rounding errors :)
const int blockSize = 1024 * 8;
const int blocksPerMb = (1024 * 1024) / blockSize;
byte[] data = new byte[blockSize];
Random rng = new Random();
using (FileStream stream = File.OpenWrite(fileName))
{
    // There 
    for (int i = 0; i < sizeInMb * blocksPerMb; i++)
    {
        rng.NextBytes(data);
        stream.Write(data, 0, data.Length);
    }
}

However, if you do this several times in very quick succession, creating a new instance of Random each time, you may get duplicate data. See my article on randomness for more information. You could avoid this by using System.Security.Cryptography.RandomNumberGenerator, or by reusing the same instance of Random multiple times, with the caveat that it's not thread-safe.
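
As a rough illustration of the RandomNumberGenerator option, here is a minimal sketch of the same block-wise loop using it in place of Random. The class name CryptoRandomFile and the byte-count parameter are assumptions made for this example. It also opens the file with FileMode.Create so that an existing, longer file is truncated first.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper name, used for illustration only.
public static class CryptoRandomFile
{
    // Writes totalBytes of cryptographically strong random data to fileName.
    public static void Write(string fileName, long totalBytes)
    {
        const int blockSize = 8 * 1024;
        byte[] data = new byte[blockSize];

        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        using (FileStream stream = new FileStream(fileName, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            long remaining = totalBytes;
            while (remaining > 0)
            {
                rng.GetBytes(data); // refill the whole block with fresh random bytes

                // The last block may be shorter than blockSize.
                int count = (int)Math.Min(remaining, blockSize);
                stream.Write(data, 0, count);
                remaining -= count;
            }
        }
    }
}
```

Because the size is passed as a long number of bytes, the block size no longer has to divide 1 MB evenly. This generator is likely to be slower than Random, so it is probably only worth using when the quality of the randomness matters.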

著墨染雨君画夕 2024-10-15 01:50:22

There's no faster way than taking advantage of the sparse file support built into NTFS, the Windows file system used on hard disks. This code creates a one-gigabyte file in a fraction of a second:

using System;
using System.IO;

class Program {
    static void Main(string[] args) {
        using (var fs = new FileStream(@"c:\temp\onegigabyte.bin", FileMode.Create, FileAccess.Write, FileShare.None)) {
            fs.SetLength(1024 * 1024 * 1024);
        }
    }
}

When read, the file contains only zeros.
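
If the size comes from a variable, as it does in the question, the multiplication needs some care. Below is a minimal sketch, assuming an int parameter named sizeInMb and a hypothetical class name PreallocatedFile, that does the arithmetic in 64 bits so that sizes of 2048 MB and above don't overflow int.

```csharp
using System.IO;

// Hypothetical helper name, used for illustration only.
public static class PreallocatedFile
{
    // Creates (or overwrites) fileName and sets its length to sizeInMb megabytes.
    public static void Create(string fileName, int sizeInMb)
    {
        // 1024L forces 64-bit arithmetic; with plain int operands the
        // product would overflow for sizes of 2048 MB or more.
        long sizeInBytes = sizeInMb * 1024L * 1024L;

        using (var fs = new FileStream(fileName, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            fs.SetLength(sizeInBytes);
        }
    }
}
```

For example, PreallocatedFile.Create(@"c:\temp\onegigabyte.bin", 1024) should produce the same one-gigabyte file as the program above.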

淡淡的优雅 2024-10-15 01:50:22

You can use the following class, which I wrote, to generate random strings:

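using System;
using System.Text;

public class RandomStringGenerator
{
    // A single Random instance is reused for every call to Generate.
    readonly Random random;

    public RandomStringGenerator()
    {
        random = new Random();
    }

    // Returns a string of the given length made of random characters
    // with code points 0 through 255.
    public string Generate(int length)
    {
        if (length < 0)
        {
            throw new ArgumentOutOfRangeException("length");
        }
        var stringBuilder = new StringBuilder(length);

        for (int i = 0; i < length; i++)
        {
            // The upper bound of Random.Next is exclusive, so 256 covers 0..255.
            char ch = (char)random.Next(0, 256);
            stringBuilder.Append(ch);
        }

        return stringBuilder.ToString();
    }
}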

Usage:

        var randomStringGenerator = new RandomStringGenerator();
        int length = 10;
        string randomString = randomStringGenerator.Generate(length);

纵性 2024-10-15 01:50:22

An efficient way to create a large file:

    // Seeking past the end and then writing extends the file;
    // the skipped bytes read back as zeros.
    using (FileStream fs = new FileStream(@"C:\temp\out.dat", FileMode.Create))
    {
        fs.Seek(1024 * 6, SeekOrigin.Begin);
        byte[] bytes = System.Text.Encoding.UTF8.GetBytes("test");
        fs.Write(bytes, 0, bytes.Length);
    }

However, this file will be empty (except for the "test" at the end). It's not clear exactly what you are trying to do: create a large file with data, or just a large file. You can also modify this to write some data sparsely throughout the file without filling it up completely.
If you do want the entire file filled with random data, then the only way I can think of is to use the random bytes from Jon's answer above.
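
To make the "sparsely write some data" idea concrete, here is a minimal sketch that sets the file length once and then writes a short marker at regular offsets. The class name SparselyFilledFile and the stride and marker parameters are invented for this example.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper name, used for illustration only.
public static class SparselyFilledFile
{
    // Sets the file to totalBytes and writes `marker` at every multiple of
    // `stride` where it still fits; everything in between reads back as zeros.
    public static void Create(string fileName, long totalBytes, long stride, string marker)
    {
        if (stride <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(stride));
        }

        byte[] bytes = Encoding.UTF8.GetBytes(marker);

        using (var fs = new FileStream(fileName, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            fs.SetLength(totalBytes);
            for (long offset = 0; offset + bytes.Length <= totalBytes; offset += stride)
            {
                fs.Seek(offset, SeekOrigin.Begin);
                fs.Write(bytes, 0, bytes.Length);
            }
        }
    }
}
```

Only the marker bytes are actually written, so the cost grows with the number of markers, not with the file size.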

只是我以为 2024-10-15 01:50:22

An improvement would be to fill a buffer of the desired size with the data and flush it all at once.
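
A buffer the size of the whole file runs out of memory for a 2 GB file, as the tests in the question show, so a middle ground is one large buffer that is refilled and written repeatedly. The sketch below assumes a hypothetical BufferedRandomFile class with a caller-chosen buffer size, which makes it possible to try sizes such as 8 KB or 128 KB.

```csharp
using System;
using System.IO;

// Hypothetical helper name, used for illustration only.
public static class BufferedRandomFile
{
    // Writes totalBytes of pseudo-random data using one reusable buffer.
    // The Random instance is passed in so that callers can share a single one.
    public static void Write(string fileName, long totalBytes, int bufferSize, Random rng)
    {
        if (bufferSize <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(bufferSize));
        }

        byte[] buffer = new byte[bufferSize];

        using (var stream = new FileStream(fileName, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            long remaining = totalBytes;
            while (remaining > 0)
            {
                rng.NextBytes(buffer);

                // Write a full buffer, or only the tail for the last chunk.
                int count = (int)Math.Min(remaining, buffer.Length);
                stream.Write(buffer, 0, count);
                remaining -= count;
            }
        }
    }
}
```

For example, BufferedRandomFile.Write(fileName, 2L * 1024 * 1024 * 1024, 128 * 1024, new Random()) would write a 2 GB file in 128 KB chunks.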
