Improving StreamWriter performance in C#

Posted 2024-11-02 07:29:43 · 1962 characters · 0 views · 0 comments

In my program I need to write large text files (~300 MB). The files contain numbers separated by spaces. I'm using this code:

    TextWriter guessesWriter = TextWriter.Synchronized(new StreamWriter("guesses.txt"));

    private void QueueStart()
    {
        while (true)
        {
            if (writeQueue.Count > 0)
            {
                guessesWriter.WriteLine(writeQueue[0]);
                writeQueue.Remove(writeQueue[0]);
            }
        }
    }

    private static void Check()
    {
        TextReader tr = new StreamReader("data.txt");

        string guess = tr.ReadLine();
        b = 0;
        List<Thread> threads = new List<Thread>();
        while (guess != null) // Reading each row and analyze it
        {
            string[] guessNumbers = guess.Split(' ');
            List<int> numbers = new List<int>();
            foreach (string s in guessNumbers) // Converting each guess to a list of numbers
                numbers.Add(int.Parse(s));

            threads.Add(new Thread(GuessCheck));
            threads[b].Start(numbers);
            b++;

            guess = tr.ReadLine();
        }
    }

    private static void GuessCheck(object listNums)
    {
        List<int> numbers = (List<int>) listNums;

        if (!CloseNumbersCheck(numbers))
        {
            writeQueue.Add(numbers[0] + " " + numbers[1] + " " + numbers[2] + " " + numbers[3] + " " + numbers[4] + " " + numbers[5] + " " + numbers[6]);
        }
    }

    private static bool CloseNumbersCheck(List<int> numbers)
    {
        int divideResult = numbers[0]/10;
        for (int i = 1; i < 6; i++)
        {
            if (numbers[i]/10 != divideResult)
                return false;
        }
        return true;
    }

The file data.txt contains data in this format (dots mean more rows following the same pattern):

1 2 3 4 5 6 1
1 2 3 4 5 6 2
1 2 3 4 5 6 3
.
.
.
1 2 3 4 5 6 8
1 2 3 4 5 7 1
.
.
.

I know this is not very efficient, and I'm looking for advice on how to make it quicker.
If you know a more efficient way to save a LARGE amount of numbers than a .txt file, I would appreciate it.

Comments (4)

酒与心事 2024-11-09 07:29:43

One way to improve efficiency is to use a larger buffer on your output stream. You are using the defaults, which probably give you a 1 KB buffer, but you won't see maximum performance with a buffer smaller than 64 KB. Open your file like this:

    // Arguments: path, append = false, encoding (UTF-8 without BOM), buffer size.
    new StreamWriter("guesses.txt", false, new UTF8Encoding(false, true), 65536)
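
As a minimal sketch of the suggestion above, the writer can be wrapped in a small helper that opens the file with a 64 KB buffer and writes one row per line. The class name `BufferedGuessWriter` and the method `WriteRows` are hypothetical names used only for illustration; the constructor overload is the standard `StreamWriter(string, bool, Encoding, int)`.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class BufferedGuessWriter
{
    // Hypothetical helper: writes each row on its own line through a
    // StreamWriter that buffers 64 KB (65536) before touching the file.
    public static void WriteRows(string path, IEnumerable<string> rows)
    {
        // Arguments: path, append = false, UTF-8 without BOM, buffer size.
        using (var writer = new StreamWriter(path, false, new UTF8Encoding(false, true), 65536))
        {
            foreach (string row in rows)
                writer.WriteLine(row);
        } // Dispose flushes any remaining buffered text and closes the file.
    }
}
```

Calling `BufferedGuessWriter.WriteRows("guesses.txt", rows)` with any sequence of row strings is enough. Whether 64 KB is the best size depends on the machine, so it is worth measuring a few values.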

韵柒 2024-11-09 07:29:43

Instead of reading and writing line by line (ReadLine and WriteLine), you should read and write big blocks of data (ReadBlock and Write). This way you will access the disk a lot less and get a big performance boost. But you will need to manage the end of each line yourself (look at Environment.NewLine).
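
A rough sketch of the write side of this idea, under the assumption that rows arrive as strings: collect them in a `StringBuilder`, append `Environment.NewLine` by hand, and hand the accumulated chunk to `Write` once it is large enough. The names `ChunkedWriter`, `WriteInChunks` and the `ChunkChars` threshold are hypothetical and chosen only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class ChunkedWriter
{
    // Hypothetical threshold: flush the in-memory chunk once it holds
    // roughly this many characters.
    private const int ChunkChars = 64 * 1024;

    public static void WriteInChunks(string path, IEnumerable<string> rows)
    {
        var chunk = new StringBuilder(ChunkChars + 256);
        using (var writer = new StreamWriter(path))
        {
            foreach (string row in rows)
            {
                // Write (not WriteLine) is used below, so the line ending
                // has to be appended explicitly.
                chunk.Append(row).Append(Environment.NewLine);
                if (chunk.Length >= ChunkChars)
                {
                    writer.Write(chunk.ToString());
                    chunk.Clear();
                }
            }
            // Write whatever is left in the final, partially filled chunk.
            writer.Write(chunk.ToString());
        }
    }
}
```

Since `StreamWriter` already buffers internally, the gain from manual chunking may be modest, so it is worth comparing against a plain `WriteLine` loop with a large buffer.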

往日 2024-11-09 07:29:43

The efficiency could be improved by using BinaryWriter. Then you could just write out the integers directly. This would allow you to skip the parsing step on the read and the ToString conversion on the write.

It also looks like you are creating a new thread for every row. The additional threads will slow you down rather than speed you up. You should do all of the work on a single thread, since threads are very heavyweight objects.
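
To illustrate the single-thread advice, here is a sketch that reads, checks and writes each row in one loop, with no queue and no extra threads. It assumes every row holds seven space-separated integers, as in the question; `SingleThreadFilter`, `AllInSameTens` and `Filter` are hypothetical names, and `AllInSameTens` simply mirrors the question's `CloseNumbersCheck`.

```csharp
using System;
using System.IO;

public static class SingleThreadFilter
{
    // Mirrors CloseNumbersCheck from the question: true when the first six
    // numbers all give the same result under integer division by 10.
    public static bool AllInSameTens(int[] numbers)
    {
        int group = numbers[0] / 10;
        for (int i = 1; i < 6; i++)
            if (numbers[i] / 10 != group)
                return false;
        return true;
    }

    // Streams inputPath line by line and writes every row that fails the
    // check to outputPath. Returns the number of rows written.
    public static int Filter(string inputPath, string outputPath)
    {
        int written = 0;
        using (var writer = new StreamWriter(outputPath))
        {
            foreach (string line in File.ReadLines(inputPath))
            {
                string[] parts = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
                if (parts.Length < 7)
                    continue; // assumption: skip blank or malformed rows

                var numbers = new int[7];
                for (int i = 0; i < 7; i++)
                    numbers[i] = int.Parse(parts[i]);

                if (!AllInSameTens(numbers))
                {
                    writer.WriteLine(line);
                    written++;
                }
            }
        }
        return written;
    }
}
```

Because the check is a handful of integer divisions per row, the work is dominated by file I/O, which is why a single sequential loop is a reasonable starting point before trying any parallelism.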

Here is a more-or-less direct conversion of your code to use a BinaryWriter. (This does not address the thread problem.)

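    // BinaryWriter wraps a Stream, not a TextWriter, so open the file as a FileStream.
    BinaryWriter guessesWriter = new BinaryWriter(File.Open("guesses.dat", FileMode.Create));
    private void QueueStart()
    {
        while (true)
        {
            if (writeQueue.Count > 0)
            {
                lock (guessesWriter)
                {
                    // writeQueue still holds strings here, so this emits a length-prefixed
                    // string; queue the integers themselves and call Write(int) to store raw values.
                    guessesWriter.Write(writeQueue[0]);
                }
                writeQueue.Remove(writeQueue[0]);
            }
        }
    }
    // Each row in the question has seven numbers, and GuessCheck reads numbers[6].
    private const int numbersPerThread = 7;
    private static void Check()
    {
        // ReadInt32 expects 4-byte binary integers, so this only works once the input
        // file has been written in binary form; it cannot parse space-separated text.
        BinaryReader tr = new BinaryReader(File.OpenRead("data.txt"));
        b = 0;
        List<Thread> threads = new List<Thread>();
        while (tr.BaseStream.Position < tr.BaseStream.Length)
        {
            List<int> numbers = new List<int>(numbersPerThread);
            for (int index = 0; index < numbersPerThread; index++)
            {
                numbers.Add(tr.ReadInt32());
            }
            threads.Add(new Thread(GuessCheck));
            threads[b].Start(numbers);
            b++;
        }
    }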
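
For reference, a self-contained round trip with raw integers might look like the sketch below. It assumes seven numbers per row, as in the question's data; `BinaryRows`, `Write` and `Read` are hypothetical names, and the file holds nothing but consecutive 4-byte values with no header or separators.

```csharp
using System.Collections.Generic;
using System.IO;

public static class BinaryRows
{
    // Assumption: every row has seven numbers, as in the question's data.
    public const int NumbersPerRow = 7;

    // Writes each number as a raw 4-byte integer, row after row.
    public static void Write(string path, IEnumerable<int[]> rows)
    {
        using (var writer = new BinaryWriter(File.Open(path, FileMode.Create)))
        {
            foreach (int[] row in rows)
                foreach (int n in row)
                    writer.Write(n); // BinaryWriter.Write(int): 4 bytes, little-endian
        }
    }

    // Reads the file back into rows of NumbersPerRow integers each.
    public static List<int[]> Read(string path)
    {
        var rows = new List<int[]>();
        using (var reader = new BinaryReader(File.OpenRead(path)))
        {
            long length = reader.BaseStream.Length;
            while (reader.BaseStream.Position < length)
            {
                var row = new int[NumbersPerRow];
                for (int i = 0; i < NumbersPerRow; i++)
                    row[i] = reader.ReadInt32();
                rows.Add(row);
            }
        }
        return rows;
    }
}
```

Note that seven `Int32` values take 28 bytes, which can be more than the same row takes as short text. The benefit here comes from skipping parsing and formatting. If every value is known to fit in a byte, `Write(byte)` and `ReadByte()` would shrink the file to seven bytes per row.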

愿得七秒忆 2024-11-09 07:29:43

Try using a buffer in between. There is a BufferedStream class for this. Right now you are using very inefficient disk access patterns.
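
One possible reading of this suggestion, sketched under the assumption that rows are written as text: place a `BufferedStream` between the `FileStream` and the `StreamWriter`. The names `BufferedStreamExample` and `WriteRows` are hypothetical, and the 64 KB size is only a starting point.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class BufferedStreamExample
{
    public static void WriteRows(string path, IEnumerable<string> rows)
    {
        using (var file = new FileStream(path, FileMode.Create, FileAccess.Write))
        using (var buffered = new BufferedStream(file, 65536)) // 64 KB buffer in front of the file
        using (var writer = new StreamWriter(buffered, new UTF8Encoding(false)))
        {
            foreach (string row in rows)
                writer.WriteLine(row);
        } // Disposing the writer flushes it and the streams beneath it.
    }
}
```

`FileStream` and `StreamWriter` each keep a buffer of their own, so the extra layer may add little; passing a larger buffer size to either of them is an alternative worth measuring.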
