如何优化数据表到集合转换的运行时?

发布于 2024-11-02 10:41:31 字数 3453 浏览 0 评论 0原文

我有一个必须通过 odbc 连接到的数据库。

数据获取需要应用程序。 2分钟。生成的 DataTable 有 350000 条记录。

我正在尝试将数据表转换为该对象图。结果集没有主键,主键是通过我从中获取数据的视图指定的。

public class PriceCurve
{
    public PriceCurve(DataTable dt)
    {
        this.Id = int.Parse(dt.AsEnumerable().First()["ID"].ToString());
        this.Prices = new List<Price>();
        GetPrices(dt);
    }

    public int Id { get; private set; }
    public IList<Price> Prices { get; set; }

    private void GetPrices(DataTable dt)
    {
        foreach (DataColumn column in dt.Columns)
        {
            switch (this.GetPriceProviderType(column)) // parses ColumnName to Enum
            {
                case Price.PriceProvider.A:
                    {
                        this.Prices.Add(new Price(Price.PriceProvider.A, dt.AsEnumerable()));
                    }

                    break;
                case Price.PriceProvider.B:
                    {
                        this.Prices.Add(new Price(Price.PriceProvider.B, dt.AsEnumerable()));
                    }

                    break;
            }
        }

    public class Price
    {
        public enum PriceProvider
        {
            A, B
        }

        public Price(PriceProvider type, IEnumerable<DataRow> dt)
        {
            this.Type = type;
            this.TradingDates = new List<TradingDate>();
            this.GetTradingDates(type, dt);
        }

        public IList<TradingDate> TradingDates { get; set; }
        public PriceProvider Type { get; set; }

        private void GetTradingDates(PriceProvider type, IEnumerable<DataRow> dt)
        {
            var data = dt.Select(column => column["TRADING_DATE"]).Distinct();

            foreach (var date in data)
            {
                this.TradingDates.Add(new TradingDate(date.ToString(), type, dt));
            }
        }

        public class TradingDate
        {
            public TradingDate(string id, PriceProvider type, IEnumerable<DataRow> dt)
            {
                this.Id = id;
                this.DeliveryPeriodValues = new Dictionary<int, double?>();
                this.GetDeliveryPeriodValues(type, dt);
            }

            public string Id { get; set; }
            public IDictionary<int, double?> DeliveryPeriodValues { get; set; }

            private void GetDeliveryPeriodValues(PriceProvider type, IEnumerable<DataRow> dt)
            {

                foreach (var row in dt.Where(column => column["TRADING_DATE"].ToString() == this.Name))
                {
                    try
                    {
                        this.DeliveryPeriodValues.Add(
                            int.Parse(row["DELIVERY_PERIOD"].ToString()),
                            double.Parse(row[Enum.GetName(typeof(Price.PriceProvider), type)].ToString()));
                    }
                    catch (FormatException e)
                    {
                        this.DeliveryPeriodValues.Add(
                            int.Parse(row["DELIVERY_PERIOD"].ToString()),
                            null);
                    }
                }    
            }
        }
    }

我创建一个对象,其中包含一个包含两个对象的列表。这两个对象中的每一个都包含一个包含 1000 个对象的列表。这 1000 个对象中的每一个都包含一个包含 350 对的字典。

它要么在调试期间使 Visual Studio 2010 崩溃,要么因内存不足而失败,要么需要几分钟(不可接受)才能执行。

解决这个问题的最佳方法是什么。我是 C# 新手,不知道如何优化这个巨大数据或我的对象图的循环。任何帮助表示赞赏。

I have a database to which i have to connect through odbc.

The data fetch takes app. 2 minutes. and the resulting DataTable has 350000 records.

I am trying to transform the data table into this object graph. The resultset has no primary key, the primary key is specified through the view from which i fetch data.

public class PriceCurve
{
    public PriceCurve(DataTable dt)
    {
        this.Id = int.Parse(dt.AsEnumerable().First()["ID"].ToString());
        this.Prices = new List<Price>();
        GetPrices(dt);
    }

    public int Id { get; private set; }
    public IList<Price> Prices { get; set; }

    private void GetPrices(DataTable dt)
    {
        foreach (DataColumn column in dt.Columns)
        {
            switch (this.GetPriceProviderType(column)) // parses ColumnName to Enum
            {
                case Price.PriceProvider.A:
                    {
                        this.Prices.Add(new Price(Price.PriceProvider.A, dt.AsEnumerable()));
                    }

                    break;
                case Price.PriceProvider.B:
                    {
                        this.Prices.Add(new Price(Price.PriceProvider.B, dt.AsEnumerable()));
                    }

                    break;
            }
        }

    public class Price
    {
        public enum PriceProvider
        {
            A, B
        }

        public Price(PriceProvider type, IEnumerable<DataRow> dt)
        {
            this.Type = type;
            this.TradingDates = new List<TradingDate>();
            this.GetTradingDates(type, dt);
        }

        public IList<TradingDate> TradingDates { get; set; }
        public PriceProvider Type { get; set; }

        private void GetTradingDates(PriceProvider type, IEnumerable<DataRow> dt)
        {
            var data = dt.Select(column => column["TRADING_DATE"]).Distinct();

            foreach (var date in data)
            {
                this.TradingDates.Add(new TradingDate(date.ToString(), type, dt));
            }
        }

        public class TradingDate
        {
            public TradingDate(string id, PriceProvider type, IEnumerable<DataRow> dt)
            {
                this.Id = id;
                this.DeliveryPeriodValues = new Dictionary<int, double?>();
                this.GetDeliveryPeriodValues(type, dt);
            }

            public string Id { get; set; }
            public IDictionary<int, double?> DeliveryPeriodValues { get; set; }

            private void GetDeliveryPeriodValues(PriceProvider type, IEnumerable<DataRow> dt)
            {

                foreach (var row in dt.Where(column => column["TRADING_DATE"].ToString() == this.Name))
                {
                    try
                    {
                        this.DeliveryPeriodValues.Add(
                            int.Parse(row["DELIVERY_PERIOD"].ToString()),
                            double.Parse(row[Enum.GetName(typeof(Price.PriceProvider), type)].ToString()));
                    }
                    catch (FormatException e)
                    {
                        this.DeliveryPeriodValues.Add(
                            int.Parse(row["DELIVERY_PERIOD"].ToString()),
                            null);
                    }
                }    
            }
        }
    }

i create one object, which contains a list with two objects. Each of these two objects contains a list with 1000 objects. Each of these 1000 objects contains a dictionary with 350 pairs.

It either crashes visual studio 2010 during debug, fails because of OutOfMemory or takes minutes (unacceptable) to execute.

What is the best approach to this problem. i am new to c# and do not know how to optimize the looping through this huge data or my object graph. Any help is appreciated.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

断舍离 2024-11-09 10:41:31

它要么在调试期间使 Visual Studio 2010 崩溃,要么因内存不足而失败,要么需要几分钟的时间
(不可接受)执行。

你让我笑了。真的。

  • 350.000 个节点在具有 .NET 的 32 位计算机上具有挑战性。加上一些开销,你就死定了。使用对象,而不是数据表,这会严重破坏内存。

  • 需要几分钟几乎是你的决定/编程。使用对象列表,而不是数据表。使用分析器。不要犯初学者的错误,例如:

var data = dt.Select(column =>column["TRADING_DATE"]).Distinct();

不需要这样做,稍后在代码中处理双打。独特就是昂贵。对其进行简介。

foreach (var row in dt.Where(column => column["TRADING_DATE"].ToString() == this.Name))

即按名称进行 350.000 行查找以获取列的索引,相比之下有很多串。

获取一个分析器并找出您的时间到底花在哪里。请摆脱表并使用对象 - 与对象列表相比,DataTable 占用内存并且速度慢。是的,这需要几分钟的时间。主要原因:

  • 你的编程。不是什么耻辱。现在就学习 Go 对象/结构。
  • ODBC。加载数据需要时间,特别是当您在加载时不处理数据(DataReader)而是等​​待所有数据加载完毕,并且 ODBC 速度不快。 350.000 行,良好的网络,直接 SQL Server 可能需要 30 秒 - 同一台机器更少。

It either crashes visual studio 2010 during debug, fails because of OutOfMemory or takes minutes
(unacceptable) to execute.

YOu made me laugh. Really.

  • 350.000 nodes is challenging on a 32 bit machine with .NET. Add some overhead and you are dead. Use objects, not adata table which is VERY memory destroying.

  • takes minutes is pretty much your decision / programming. Use a list of objects, not a data table. Use a profiler. DOnt make beginner mistakesl ike:

var data = dt.Select(column => column["TRADING_DATE"]).Distinct();

No need for that, deal with doubles later inthe code. Distinct is expensive. Profile it.

foreach (var row in dt.Where(column => column["TRADING_DATE"].ToString() == this.Name))

That is 350.000 row lookups by name to get the index of the column, compared by a lot of tostring.

Get a profiler and find out where you exactly spend your time. Get please rid of the table and use objects - DataTable is a memory hog and SLOW compared to a list of objects. And yes, it will take minutes. Main reasons:

  • Your programming. Not a shame. Just learn, Go objets / structs NOW.
  • ODBC. Takes time to just load the data, especially as you dont process swhile loading (DataReader) but wait for allto ahve loaded, and ODBC is NOT fast. 350.000 rows, good network, direct SQL Server is maybe 30 seconds - same machine less.
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文