从 Guid 获取 SQL 兼容字符串的更快方法
当我分析我的应用程序时(它通过处理一些原始数据创建大量数据库插入),我注意到了这一行特定的代码:
myStringBuilder.AppendLine(
string.Join(
BULK_SEPARATOR, new string[]{myGuid.ToString() ...
请记住,生成的字符串将最终出现在通过 TSQL 命令调用的文件中>BULK INSERT,有没有办法可以更快地完成此步骤?我知道获取字节数组更快,但我不能将其插入文件中。
I noticed this particular line of code when I was profiling my application (which creates a boatload of database insertions by processing some raw data):
myStringBuilder.AppendLine(
string.Join(
BULK_SEPARATOR, new string[]{myGuid.ToString() ...
Bearing in mind that the resultant string is going to end up in a file called via the TSQL command BULK INSERT
, is there a way I can do this step faster? I know that getting a byte array is faster, but I can't just plug that into the file.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
最快和最简单的方法是根本不对原始文件使用
BULK INSERT
。请改用 SqlBulkCopy 类。通过直接通过管道发送数据而不是使用中间文件,应该可以显着加快速度。(您还可以直接使用
Guid
而不进行任何字符串转换,尽管我不能 100% 确定SqlBulkCopy
在内部使用它做什么。)The fastest and simplest way would be not to use
BULK INSERT
with a raw file at all. Instead, use the SqlBulkCopy class. That should speed this up significantly by sending the data directly over the pipe instead of using an intermediate file.(You'll also be able to use the
Guid
directly without any string conversions, although I can't be 100% sure whatSqlBulkCopy
does internally with it.)您没有表明您从哪里获取指南。另外,我不认为获取字节会更快,因为您将执行 Guid 类上的 ToString 方法已经执行的操作,迭代字节并转换为字符串值。
相反,我认为这段代码在性能方面可能需要改进的一些一般领域是(假设您在循环中执行此操作):
您是否在循环的新迭代中重用 myStringBuilder 实例?您应该将长度(不是容量)属性设置为 0,然后使用它重建字符串。这将避免必须预热新的 StringBuilder 实例,并且已经为更大的字符串分配了内存。
在 myStringBuilder 上调用 Append,而不是调用 String.Join。 String.Join 将预分配一堆内存,然后返回一个字符串实例,您将再次分配该实例(如果在第一次迭代时)或复制到已分配的空间中。没有理由这样做两次。相反,迭代您正在创建的数组(或展开循环,看起来您有一个固定大小的数组)并调用 Append,传入 guid,然后传入 BULK_SEPARATOR。从末尾删除单个字符会更容易,顺便说一句,如果您实际附加了 Guid,只需将 StringBuilder 的 Length 属性减一即可。
You aren't indicating where you are getting the guid from. Also, I don't believe that getting the bytes are going to be any faster, as you are going to do what the ToString method on the Guid class already does, iterate through the bytes and convert to a string value.
Rather, I think a few general areas that this code can possibly be improved upon in terms of performance are (and assuming you are doing this in a loop):
Are you reusing the myStringBuilder instance upon a new iteration of your loop? You should be setting the Length (not the Capacity) property to 0, and then rebuild your string using that. This will prevent having to warm up a new StringBuilder instance, and the memory allocation for a larger string will already have been made.
Use calls to Append on myStringBuilder instead of calling String.Join. String.Join is going to preallocate a bunch of memory and then return a string instance which you will just allocate again (if on the first iteration) or copy into already-allocated space. There is no reason to do this twice. Instead, iterate through the array you are creating (or expand the loop, it seems you have a fixed-size array) and call Append, passing in the guid and then then BULK_SEPARATOR. It's easier to remove the single character from the end, btw, just decrement the Length property of the StringBuilder by one if you actually appended Guids.
如果时间很关键 - 您能否提前创建足够长的 GUID 列表并将其转换为字符串,然后在代码中使用它?是在 C# 中,还是在 SQL Server 中,具体取决于您的要求?
If time is critical - could you pre-create a sufficiently long list of GUID's converted to string ahead of time, and then use it in your code? Either in C#, or possibly in SQL Server, depending on your requirements?