How to quickly load 1,000,000 records from a database?
Now we have a Firebird database with 1,000,000 records that must be processed after ALL of them are loaded into RAM. To get them all we extract the data in pages using (select * first 1000 ...), and it takes 8 hours. What is the solution for this?
5 Answers
Does each of your "select * first 1000" (as you described it) do a full table scan? Look at those queries, and make sure they are using an index.
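If they do scan, one common fix is keyset paging: page on an indexed key instead of offsets, so each page starts where the previous one ended. (You can verify index use by running the query in isql with SET PLAN ON; a plan containing NATURAL means a full scan.) A minimal sketch, assuming the FirebirdSql.Data.FirebirdClient ADO.NET provider and a made-up MYTABLE with an indexed ID column:

    using System;
    using FirebirdSql.Data.FirebirdClient;

    class KeysetPager
    {
        static void Main()
        {
            // Placeholder connection string; adjust server, path, credentials.
            using var conn = new FbConnection(
                "database=localhost:mydb.fdb;user=SYSDBA;password=masterkey");
            conn.Open();
            long lastId = 0;
            while (true)
            {
                // An index on ID lets each page start where the previous one
                // ended, instead of re-reading the rows FIRST/SKIP would skip.
                using var cmd = new FbCommand(
                    "SELECT FIRST 1000 ID FROM MYTABLE WHERE ID > @lastId ORDER BY ID",
                    conn);
                cmd.Parameters.AddWithValue("@lastId", lastId);
                using var reader = cmd.ExecuteReader();
                bool sawRows = false;
                while (reader.Read())
                {
                    sawRows = true;
                    lastId = Convert.ToInt64(reader[0]); // high-water mark for next page
                    // ... select the columns you need and do the per-row work here ...
                }
                if (!sawRows) break; // nothing beyond lastId: finished
            }
        }
    }

With plain FIRST/SKIP paging, later pages must still walk every skipped row, which is one way a million rows can stretch into hours.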
How long does it take to construct the DTO object that you are creating with each data read?
You are creating a million of these objects. If it takes 29 milliseconds to create one DTO object, then that is going to take over 8 hours to complete.
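A rough way to confirm where the time goes is to time the per-row construction in isolation. This sketch uses a hypothetical RecordDto standing in for whatever the real loop builds:

    using System;
    using System.Diagnostics;

    // Stand-in for whatever object the real loop builds per row.
    class RecordDto
    {
        public int Id;
        public string Name;
    }

    class DtoTiming
    {
        static void Main()
        {
            const int rows = 1_000_000;
            var sw = Stopwatch.StartNew();
            for (int i = 0; i < rows; i++)
            {
                var dto = new RecordDto { Id = i, Name = "row" + i };
            }
            sw.Stop();
            // The answer's arithmetic: 29 ms x 1,000,000 rows = 29,000 s,
            // a little over 8 hours - which matches the symptom.
            Console.WriteLine($"{sw.Elapsed.TotalMilliseconds / rows:F4} ms per DTO");
        }
    }

If the bare construction is fast here but slow in the real loop, the cost is in whatever each constructor or property setter does per row (lazy lookups, extra queries), not in the allocation itself.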
Everybody's been assuming you were running a SQL query to select the records from the database, something like a plain select *.
Because that really would take a few seconds. Well, a little longer to display it on a screen, but executing the actual select should be lightning fast.
But that reference to C# makes me think you're doing something else. Perhaps what you really have is an RBAR loop instantiating one million objects. I can see how that might take a little longer. But even so, eight hours? Where does the time go?
Edit
My guess was right and you are instantiating 1000000 objects in a loop. The correct advice would be to find some other way of doing whatever it is you do once you have got all your objects in memory. Without knowing more about the details it is hard to give specifics. But it seems unlikely this is a UI thing - what user is going to peruse a million objects?
So a general observation will have to suffice: use bulk operations to implement bulk activity. SQL databases excel at handling sets. Leverage the power of SQL to process your million rows in a single set, rather than as individual rows.
If you don't find this answer helpful then you need to give us more details regarding what you're trying to achieve.
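To make the set-based style concrete, here is a sketch, again assuming the FirebirdSql.Data.FirebirdClient provider; the table and column names (MYTABLE, PRICE, CATEGORY) are invented for the example. One UPDATE does inside the engine what the RBAR loop does with a million round trips:

    using System;
    using FirebirdSql.Data.FirebirdClient;

    class SetBasedUpdate
    {
        static void Main()
        {
            using var conn = new FbConnection(
                "database=localhost:mydb.fdb;user=SYSDBA;password=masterkey");
            conn.Open();
            // One statement: the engine touches every matching row itself,
            // with no per-row round trip and no million DTOs in RAM.
            using var cmd = new FbCommand(
                "UPDATE MYTABLE SET PRICE = PRICE * 1.1 WHERE CATEGORY = @cat",
                conn);
            cmd.Parameters.AddWithValue("@cat", "WIDGET");
            int affected = cmd.ExecuteNonQuery();
            Console.WriteLine($"{affected} rows updated in one set operation");
        }
    }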
What sort of processing do you need to do that would require loading them into memory rather than just processing them via SQL statements?
There are two techniques I use that work depending on what I am trying to do.
Assuming there is some sort of artificial key (identity), work in batches, incrementing the last identity value processed (the keyset-paging sketch under the first answer shows this pattern).
BCP the data out to a text file, churn through the updates, then BCP it back in, remembering to turn off constraints and indexes before the IN step; a Firebird-flavored sketch of the index step follows below.
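BCP is SQL Server tooling, so for Firebird treat this as the general pattern rather than a recipe. The "turn off indexes" step maps to Firebird's ALTER INDEX ... INACTIVE/ACTIVE DDL; the index name below is invented, and note that indexes enforcing primary or foreign keys cannot be deactivated this way, which is why constraints are handled separately:

    using FirebirdSql.Data.FirebirdClient;

    class BulkReload
    {
        static void Exec(FbConnection conn, string sql)
        {
            using var cmd = new FbCommand(sql, conn);
            cmd.ExecuteNonQuery();
        }

        static void Main()
        {
            using var conn = new FbConnection(
                "database=localhost:mydb.fdb;user=SYSDBA;password=masterkey");
            conn.Open();
            // Stop per-row index upkeep during the reload...
            Exec(conn, "ALTER INDEX IDX_MYTABLE_NAME INACTIVE");
            // ... bulk-insert the churned rows from the text file here ...
            // ...then rebuild the index in a single pass at the end.
            Exec(conn, "ALTER INDEX IDX_MYTABLE_NAME ACTIVE");
        }
    }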
Take a look at this:
http://www.firebirdfaq.org/faq13/