Splitting an array into size-limited CSV strings
I am looking for an efficient way to convert a large int[] into a string[] of CSV strings, where each CSV string is limited to a maximum of 4000 characters. The values in the array can be anything between 1 and int.MaxValue.
Here is my final code:
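// Requires: using System.Collections.Generic; using System.Text;
public static string[] GetCSVsFromArray(int[] array, int csvLimit)
{
    List<string> parts = new List<string>();
    StringBuilder sb = new StringBuilder();
    foreach (int id in array)
    {
        string intId = id.ToString();
        if (sb.Length + intId.Length >= csvLimit && sb.Length > 0)
        {
            // Current part is full: drop the trailing comma and store it.
            sb.Length--;
            parts.Add(sb.ToString());
            sb.Length = 0;
        }
        // Always append the id, so the one that triggered the flush is not lost.
        sb.Append(intId).Append(",");
    }
    if (sb.Length > 0)
    {
        sb.Length--; // drop the trailing comma of the last part
        parts.Add(sb.ToString());
    }
    return parts.ToArray();
}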
Is there a more efficient way to do this?
So here is what I am now using (I was able to change the return type to List<string> to save the ToArray() call at the end):
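// Requires: using System.Collections.Generic; using System.Text;
public static List<string> GetCSVsFromArray(int[] array, int csvLimit)
{
    List<string> parts = new List<string>();
    StringBuilder sb = new StringBuilder();
    foreach (int id in array)
    {
        string intId = id.ToString();
        if (sb.Length + intId.Length >= csvLimit && sb.Length > 0)
        {
            // Current part is full: drop the trailing comma and store it.
            sb.Length--;
            parts.Add(sb.ToString());
            sb.Length = 0;
        }
        // Always append the id, so the one that triggered the flush is not lost.
        sb.Append(intId).Append(",");
    }
    if (sb.Length > 0)
    {
        sb.Length--; // drop the trailing comma of the last part
        parts.Add(sb.ToString());
    }
    return parts;
}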
Performance results:
10,000,000 items, CSV limit of 4000 characters
- Original: 2,887.488 ms
- GetIntegerDigitCount: 3,105.355 ms
- Final: 2,883.587 ms
Whilst I only saved 4 ms by removing the ToArray() call on my developer machine, this seems to make a significant difference on a much slower machine (it saved over 200 ms on a DELL D620).
Comments (3)
You are doing a lot of heap memory allocations by creating a new string for each number just to calculate its number of digits. Use the following method to calculate the number of digits instead (see the method below).
So instead of
Just use:
Results:
EDIT: More results with a large CSV limit
Code I've used to measure time:
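The timing code is not reproduced here. As a rough sketch only, a measurement of this kind is commonly done with System.Diagnostics.Stopwatch; the class name Timing and the method MeasureMilliseconds are hypothetical names chosen for illustration, not the answerer's code:

```csharp
using System;
using System.Diagnostics;

public static class Timing
{
    // Runs the action once and returns the elapsed wall-clock time in milliseconds.
    // A single run is noisy; repeating the call and averaging gives steadier numbers.
    public static double MeasureMilliseconds(Action action)
    {
        Stopwatch sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }
}
```

It could be called as `double ms = Timing.MeasureMilliseconds(() => GetCSVsFromArray(data, 4000));`, where `data` stands for the test array.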
GetIntegerDigitCount:
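The body of the helper is not reproduced here. A minimal sketch of what such a method could look like, assuming non-negative input (the question states that the values lie between 1 and int.MaxValue). The class name DigitCounter and the method SplitToCsv are hypothetical and only illustrate how the digit count could replace the intId.Length check:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class DigitCounter
{
    // Counts decimal digits by comparison, without allocating a string.
    // Assumption: value >= 0 (the question's ids are 1..int.MaxValue).
    public static int GetIntegerDigitCount(int value)
    {
        if (value < 10) return 1;
        if (value < 100) return 2;
        if (value < 1000) return 3;
        if (value < 10000) return 4;
        if (value < 100000) return 5;
        if (value < 1000000) return 6;
        if (value < 10000000) return 7;
        if (value < 100000000) return 8;
        if (value < 1000000000) return 9;
        return 10; // up to int.MaxValue (2,147,483,647)
    }

    // Hypothetical variant of the question's loop: the length check uses the
    // digit count, and the int is appended directly to the StringBuilder.
    public static List<string> SplitToCsv(int[] array, int csvLimit)
    {
        var parts = new List<string>();
        var sb = new StringBuilder();
        foreach (int id in array)
        {
            int digits = GetIntegerDigitCount(id);
            // Flush when a comma plus this id would push the part past the limit.
            if (sb.Length > 0 && sb.Length + 1 + digits > csvLimit)
            {
                parts.Add(sb.ToString());
                sb.Length = 0;
            }
            if (sb.Length > 0) sb.Append(',');
            sb.Append(id);
        }
        if (sb.Length > 0) parts.Add(sb.ToString());
        return parts;
    }
}
```

Note that StringBuilder.Append(int) may still create a temporary string internally on some runtimes, which would limit the gain from avoiding id.ToString(); measuring on the target machine is the only reliable check.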
LINQ can speed things up a bit here. After a few modifications, your code will look something like this:
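The modified code is not reproduced here. Purely as an illustration of one LINQ-flavoured shape it might take, the sketch below projects the ids to strings with Select and builds each part with string.Join. The names LinqCsv and SplitToCsv are hypothetical, and nothing in the thread measures whether this is faster than the StringBuilder version:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinqCsv
{
    public static List<string> SplitToCsv(int[] array, int csvLimit)
    {
        var parts = new List<string>();
        var current = new List<string>(); // strings belonging to the part being built
        int length = 0;                   // length of the current part, commas included

        foreach (string s in array.Select(id => id.ToString()))
        {
            // Length the part would have if this string were added.
            int needed = current.Count == 0 ? s.Length : length + 1 + s.Length;
            if (current.Count > 0 && needed > csvLimit)
            {
                // Part is full: emit it and start a new one with this string.
                parts.Add(string.Join(",", current));
                current.Clear();
                needed = s.Length;
            }
            current.Add(s);
            length = needed;
        }
        if (current.Count > 0) parts.Add(string.Join(",", current));
        return parts;
    }
}
```

Because string.Join inserts the separators itself, this shape has no trailing comma to trim, at the cost of holding an intermediate list of strings per part.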