List<struct[]>.Add vs List<string[]>.Add or List<object[]>.Add performance

Posted on 2024-12-17 04:46:58

I have code that is adding ~100,000 items to a List.

If I add an array of strings or objects the code runs almost instantly (under 100 ms), but if I try to add an array of structs, it takes almost 1.5 seconds just for the .Add calls.

Why is there such a performance impact when using a struct[]?

Here is my struct:

public struct LiteRowInfo
{
    public long Position;
    public int Length;
    public int Field;
    public int Row;

    public LiteRowInfo(long position, int length, int field, int row)
    {
        this.Position = position;
        this.Length = length;
        this.Field = field;
        this.Row = row;
    }
}

EDIT 2: The string version runs faster than the struct version:
I appreciate the comments; it does seem like there is additional overhead in creating the struct itself. I think I will just create 2 separate lists to store the position and length to improve performance.

// Requires: using System.Collections.Generic; using System.Diagnostics;
private void Test()
{
    Stopwatch watch = new Stopwatch();

    // Time building 100,000 arrays of 20 structs each.
    watch.Start();
    List<LiteRowInfo[]> structList = new List<LiteRowInfo[]>();

    for (int i = 0; i < 100000; i++)
    {
        LiteRowInfo[] info = new LiteRowInfo[20];

        for (int x = 0; x < 20; x++)
        {
            // Use the constructor so every field is assigned; copying a
            // partially assigned struct local does not compile (CS0165).
            info[x] = new LiteRowInfo(i, x, 0, 0);
        }
        structList.Add(info);
    }
    Debug.Print(watch.ElapsedMilliseconds.ToString());

    // Time building 100,000 arrays of 20 string references each.
    watch.Reset();
    watch.Start();

    List<string[]> stringList = new List<string[]>();

    for (int i = 0; i < 100000; i++)
    {
        string[] info = new string[20];

        for (int x = 0; x < 20; x++)
        {
            info[x] = "String";
        }
        stringList.Add(info);
    }

    Debug.Print(watch.ElapsedMilliseconds.ToString());
}

EDIT: Here is all relevant code:
Note: If I comment out only the pos.Add(rowData); line, the performance is similar to that of a string[] or int[].

    private void executeSqlStream()
    {
        List<LiteRowInfo[]> pos = new List<LiteRowInfo[]>();

        long currentPos = 0;

        _stream = new MemoryStream();
        StreamWriter writer = new StreamWriter(_stream);

        using (SqlConnection cnn = new SqlConnection(_cnnString))
        {
            cnn.Open();
            SqlCommand cmd = new SqlCommand(_sqlString, cnn);

            SqlDataReader reader = cmd.ExecuteReader();

            int fieldCount = reader.FieldCount;
            int rowNum = 0;
            UnicodeEncoding encode = new UnicodeEncoding();
            List<string> fields = new List<string>();
            for (int i = 0; i < fieldCount; i++)
            {
                fields.Add(reader.GetFieldType(i).Name);
            }
            while (reader.Read())
            {
                LiteRowInfo[] rowData = new LiteRowInfo[fieldCount];
                for (int i = 0; i < fieldCount; i++)
                {
                    LiteRowInfo info;
                    if (reader[i] != DBNull.Value)
                    {
                        byte[] b;
                        switch (fields[i])
                        {
                            case "Int32":
                                b = BitConverter.GetBytes(reader.GetInt32(i));
                                break;
                            case "Int64":
                                b = BitConverter.GetBytes(reader.GetInt64(i));
                                break;
                            case "DateTime":
                                DateTime dt = reader.GetDateTime(i);
                                b = BitConverter.GetBytes(dt.ToBinary());
                                break;
                            case "Double":
                                b = BitConverter.GetBytes(reader.GetDouble(i));
                                break;
                            case "Boolean":
                                b = BitConverter.GetBytes(reader.GetBoolean(i));
                                break;
                            case "Decimal":
                                b = BitConverter.GetBytes((float)reader.GetDecimal(i));
                                break;
                            default:
                                b = encode.GetBytes(reader.GetString(i));
                                break;
                        }
                        int len = b.Length;

                        // Record where this field starts, then advance the
                        // position once (it was previously advanced twice).
                        info.Position = currentPos;
                        info.Length = len;
                        info.Field = i;
                        info.Row = rowNum;
                        currentPos += len;
                        _stream.Write(b, 0, len);
                    }
                    else
                    {
                        info.Position = currentPos;
                        info.Length = 0;
                        info.Field = i;
                        info.Row = rowNum;
                    }
                    rowData[i] = info;
                }
                rowNum++;
                pos.Add(rowData);
            }
        }
    }

Comments (2)

优雅的叶子 2024-12-24 04:46:58

Given that the array itself is a reference type, I very much doubt that you're actually seeing what you think you're seeing.

I suspect the difference isn't in adding an array reference to a list - I suspect it's creating the array in the first place. Each array element will take more space than a reference, so you're having to allocate more memory. That may well mean you're also triggering garbage collection.
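To put rough numbers on the allocation difference described above, here is a minimal sketch. The `ArrayPayload` helper and its method names are hypothetical, and `Marshal.SizeOf<T>()` reports the marshalled size, which is assumed here to match the managed layout for a simple struct of integer fields. On a 64-bit runtime the struct typically occupies 24 bytes (20 bytes of fields, padded to the 8-byte alignment of the `long`), so a 20-element array carries roughly 480 bytes of element data, against 160 bytes for 20 references.

```csharp
using System;
using System.Runtime.InteropServices;

// Same field layout as the struct in the question.
public struct LiteRowInfo
{
    public long Position;
    public int Length;
    public int Field;
    public int Row;
}

// Hypothetical helper for estimating array element data (array header excluded).
public static class ArrayPayload
{
    // Approximate bytes of element data in an array of `count` structs.
    public static long StructArrayBytes(int count)
    {
        return (long)count * Marshal.SizeOf<LiteRowInfo>();
    }

    // Approximate bytes of element data in an array of `count` references.
    public static long ReferenceArrayBytes(int count)
    {
        return (long)count * IntPtr.Size;
    }
}
```

Multiplying either figure by the 100,000 arrays in the question gives a sense of how much more memory the struct version has to allocate and zero.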

To benchmark just List<T>.Add, I suggest that you repeatedly add a reference to the same array several times.
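A minimal sketch of such a benchmark, assuming a hypothetical `AddBenchmark.TimeAdds` helper: the array is allocated once by the caller, so the timed loop contains only the `Add` calls (plus whatever internal growth the list performs).

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class AddBenchmark
{
    // Adds the same array reference `count` times, so no per-iteration
    // array allocation is included in the measurement.
    public static TimeSpan TimeAdds<T>(T[] sharedArray, int count)
    {
        var list = new List<T[]>();
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < count; i++)
        {
            list.Add(sharedArray);
        }
        watch.Stop();
        GC.KeepAlive(list); // keep the list reachable until timing is done
        return watch.Elapsed;
    }
}
```

Calling `AddBenchmark.TimeAdds(new LiteRowInfo[20], 100000)` and `AddBenchmark.TimeAdds(new string[20], 100000)` should then compare like with like, since in both cases the list only stores a reference.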

As an aside, having an array as the list element type feels like a bit of a smell to me. There are times when that's valid, but personally I would consider whether it's actually something which could be encapsulated in another type.
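One way that encapsulation might look, as a sketch only: `RowLayout` is a hypothetical name, and the right shape depends on how the rows are actually consumed.

```csharp
using System;
using System.Collections.Generic;

// Same field layout as the struct in the question.
public struct LiteRowInfo
{
    public long Position;
    public int Length;
    public int Field;
    public int Row;
}

// Hypothetical wrapper: one instance describes every field of one row,
// hiding the underlying array from callers.
public sealed class RowLayout
{
    private readonly LiteRowInfo[] _fields;

    public RowLayout(int fieldCount)
    {
        _fields = new LiteRowInfo[fieldCount];
    }

    public int FieldCount
    {
        get { return _fields.Length; }
    }

    // Indexer by field ordinal; gets and sets copy the struct by value.
    public LiteRowInfo this[int field]
    {
        get { return _fields[field]; }
        set { _fields[field] = value; }
    }
}
```

The collection then becomes a `List<RowLayout>` rather than a `List<LiteRowInfo[]>`, which leaves room to change the internal storage later without touching callers.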

EDIT: You say you've posted all the relevant code, but that really isn't benchmark code for List<T>.Add - it contains database access for one thing, which is almost certainly taking way longer than any of the in-memory manipulation!

两个我 2024-12-24 04:46:58

There could be some boxing happening elsewhere in the code that is unrelated to the List<>, since generic lists handle value types without boxing. Without seeing the code, it is hard to say more.
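For illustration, a small sketch of the difference (the `BoxingDemo` class and its method names are hypothetical). A non-generic `ArrayList` stores `object`, so each `Add` of a value type allocates a box on the heap, whereas `List<int>` stores the values directly in its backing array.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class BoxingDemo
{
    // Returns true if adding the same int twice to an ArrayList produced
    // two distinct heap objects, i.e. each Add boxed the value separately.
    public static bool ArrayListBoxes()
    {
        var list = new ArrayList();
        int value = 42;
        list.Add(value);
        list.Add(value);
        return !object.ReferenceEquals(list[0], list[1]);
    }

    // Stores and reads back an int through List<int>; the value is kept
    // inline in the list's int[] storage, so no box is allocated.
    public static int GenericListRoundTrip(int value)
    {
        var list = new List<int>();
        list.Add(value);
        return list[0];
    }
}
```

The same applies to `List<LiteRowInfo[]>`: the elements are array references, so no boxing is involved in the `Add` call itself.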
