String concatenation and threads in .NET

Posted on 2024-12-10 15:42:35

(Out of pure curiosity) In VB.NET, I tested concatenating 100k strings and found that one thread alone did it in 23 milliseconds. Two threads (each concatenating 50k), followed by joining the two results at the end, took 30 milliseconds. Performance-wise, it didn't seem beneficial to use multiple threads when dealing with only 100k concatenations. Then I tried 3 million string concatenations, and two threads each handling 1.5 million always demolished one thread handling all 3 million. I imagine at some point using 3 threads becomes beneficial, then 4, and so on. Is there a more efficient way to concatenate millions of strings in .NET? Are threads worth using?

At around 1 million string concatenations, it appears multiple threads can improve performance.

FYI, this is the code I wrote:

Imports System.Text
Imports System.Threading
Imports System.IO
Public Class Form1
    Dim sbOne As StringBuilder
    Dim sbTwo As StringBuilder
    Dim roof As Integer
    Dim results As DataTable
    Sub clicked(s As Object, e As EventArgs) Handles Button1.Click
        results = New DataTable
        results.Columns.Add("one thread")
        results.Columns.Add("two threads")
        results.Columns.Add("roof")

        For i As Integer = 1 To 3000000 Step 100000
            roof = i
            Dim test() As Double = runTest()
            results.Rows.Add(test(0), test(1), i)
            Console.WriteLine(roof)
        Next

        Dim output As New StringBuilder
        For Each C As DataColumn In results.Columns
            output.Append(C)
            output.Append(Chr(9))
        Next
        output.Append(vbCrLf)
        For Each R As DataRow In results.Rows
            For Each C As DataColumn In results.Columns
                output.Append(R(C))
                output.Append(Chr(9))
            Next
            output.Append(vbCrLf)
        Next
        File.WriteAllText("c:\users\username\desktop\sbtest.xls", output.ToString)
        Console.WriteLine("done")

    End Sub
    Function runTest() As Double()
        Dim sb As New StringBuilder
        Dim started As DateTime = Now
        For i As Integer = 1 To roof
            sb.Append(i)
        Next
        Dim result As String = sb.ToString
        Dim test1 As Double = Now.Subtract(started).TotalMilliseconds

        sbOne = New StringBuilder
        sbTwo = New StringBuilder
        Dim one As New Thread(AddressOf tOne)
        Dim two As New Thread(AddressOf tTwo)
        started = Now
        one.Start()
        two.Start()
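        ' Block until both workers finish instead of spinning in a busy loop.
        one.Join()
        two.Join()
        ' Concatenate the builders' contents; Thread.ToString would only return the type name.
        result = String.Concat(sbOne.ToString, sbTwo.ToString)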
        Dim test2 As Double = Now.Subtract(started).TotalMilliseconds
        Return {test1, test2}
    End Function
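    Sub tOne()
        ' First half: 1 .. roof \ 2 (integer division).
        For i As Integer = 1 To roof \ 2
            sbOne.Append(i)
        Next
    End Sub
    Sub tTwo()
        ' Second half starts one past the first half so no value is appended twice.
        For i As Integer = roof \ 2 + 1 To roof
            sbTwo.Append(i)
        Next
    End Sub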
End Class


Comments (2)

冷情妓 2024-12-17 15:42:35

Threads are designed for tasks more expensive than string concatenation.

String concatenation involves allocating and copying memory; it's not a very compute-intensive task.
Multi-threading should be used when dealing with computationally intensive tasks, and to avoid blocking the UI thread.

Threading can also be useful to parallelize tasks that wait on different things (e.g., network IO to multiple slow servers, or network vs. disk IO).
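
To illustrate that last point, here is a minimal sketch, not taken from the original post. It assumes `Thread.Sleep` is a fair stand-in for a blocking network or disk call, and the names `WaitOverlapDemo` and `RunOverlapped` are made up for the example. Because the workers spend their time waiting rather than computing, the total elapsed time should stay close to a single wait instead of the sum of all waits.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Diagnostics
Imports System.Threading

Module WaitOverlapDemo
    ' Starts workerCount threads that each block for waitMs, waits for all
    ' of them, and returns the total elapsed wall-clock time in milliseconds.
    Function RunOverlapped(workerCount As Integer, waitMs As Integer) As Long
        Dim workers As New List(Of Thread)
        Dim sw As Stopwatch = Stopwatch.StartNew()
        For n As Integer = 1 To workerCount
            ' Assumption: Thread.Sleep stands in for a slow, blocking IO call.
            Dim work As ThreadStart = Sub() Thread.Sleep(waitMs)
            Dim worker As New Thread(work)
            workers.Add(worker)
            worker.Start()
        Next
        ' Join blocks the caller until each worker has finished.
        For Each worker As Thread In workers
            worker.Join()
        Next
        sw.Stop()
        Return sw.ElapsedMilliseconds
    End Function

    Sub Main()
        Console.WriteLine("3 overlapped waits of 300 ms took {0} ms", RunOverlapped(3, 300))
    End Sub
End Module
```

Run sequentially, the same three waits would take roughly three times as long. That overlap is the benefit threads give for waiting-bound work, and it is a different situation from the memory-bound appends in the question.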

软糖 2024-12-17 15:42:35

Check out the EnsureCapacity method on StringBuilder. If you are doing a lot of concatenation and know roughly how many characters you will end up with, you can allocate the StringBuilder's buffer all at once instead of letting it grow dynamically. You should see some more improvement.

http://msdn.microsoft.com/en-us/library/system.text.stringbuilder.ensurecapacity.aspx
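
A rough sketch of how that might look for the question's loop, which appends the integers 1 to `count`. The module and function names (`CapacityDemo`, `BuildDefault`, `BuildPresized`) are hypothetical, and the capacity estimate assumes a non-negative `count` small enough that `count * digits` fits in an Integer. Whether it actually helps depends on the workload, so measure before relying on it.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.Text

Module CapacityDemo
    ' Baseline: lets the StringBuilder grow its buffer as needed.
    Function BuildDefault(count As Integer) As String
        Dim sb As New StringBuilder()
        For i As Integer = 1 To count
            sb.Append(i)
        Next
        Return sb.ToString()
    End Function

    ' Reserves the buffer up front with EnsureCapacity before appending.
    Function BuildPresized(count As Integer) As String
        Dim sb As New StringBuilder()
        ' Upper bound: no number in 1..count has more digits than count itself.
        Dim digits As Integer = count.ToString().Length
        sb.EnsureCapacity(count * digits)
        For i As Integer = 1 To count
            sb.Append(i)
        Next
        Return sb.ToString()
    End Function

    Sub Main()
        Dim count As Integer = 3000000

        Dim sw As Stopwatch = Stopwatch.StartNew()
        Dim a As String = BuildDefault(count)
        sw.Stop()
        Console.WriteLine("default capacity: {0} ms", sw.ElapsedMilliseconds)

        sw.Restart()
        Dim b As String = BuildPresized(count)
        sw.Stop()
        Console.WriteLine("pre-sized buffer: {0} ms", sw.ElapsedMilliseconds)

        Console.WriteLine("same result: {0}", a = b)
    End Sub
End Module
```

Stopwatch is used here rather than subtracting DateTime values because it is intended for measuring short intervals. A single run like this is still noisy, so repeating the measurement a few times gives a fairer picture.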
