Exporting Large Amounts of Data
Here is my issue.

We have two types of reports on our site: data shown in a grid, and data downloaded immediately as a report.

These reports can contain a few years of data (over 1 million rows). We have been allowing our customers to download data for a date range, but we have started limiting how long a range they can view to prevent performance issues on our site. However, even on a small date range the data is still getting pretty large now that our customers are expanding, and if they download too much, our memory usage spikes to a few gigabytes and we run out of memory.

My question is: I would rather not limit their data, so I am trying to figure out a good solution that lets them download as much as they want.

I can limit what they see by returning only one page of data at a time, so viewing causes no performance issues; downloading is always the problem.

I have looked into async but have not been able to get it to work, since memory still spikes while I am loading the data.
Ideas? Thoughts? Suggestions?
Code Example:
    // Get Data
    SqlConnection con = new SqlConnection();
    SqlCommand cmd = new SqlCommand();
    SqlDataAdapter da;
    DataSet ds = new DataSet();
    con.ConnectionString = "MyConnectionString";
    con.Open();
    cmd.Connection = con;
    cmd.CommandType = CommandType.StoredProcedure;
    cmd.CommandText = "MyStoredProc";
    da = new SqlDataAdapter(cmd);
    da.Fill(ds);                       // loads every row into the DataSet in memory
    con.Close();

    // Render the data as an HTML table and send it as an Excel attachment
    StringWriter sw = new StringWriter();
    HtmlTextWriter htw = new HtmlTextWriter(sw);
    DataGrid dg = new DataGrid();
    dg.DataSource = ds.Tables[0];
    dg.DataBind();
    dg.RenderControl(htw);             // builds the entire HTML string in memory
    Response.ClearContent();
    Response.ContentType = "application/vnd.ms-excel";
    Response.AddHeader("Content-Disposition", "attachment;filename=Report.xls");
    Response.Write(sw.ToString());
    Response.End();
When I run this with my data, which is approximately 800k rows, memory spikes and I get an out-of-memory error. To make things worse, it always hogs memory in RenderControl until it is done.
4 Answers
I'm assuming the data comes from a backend database. If so, you should not make the user wait for this operation to complete. That is bad UI design, especially when memory usage can reach 4 GB.

I agree with the other suggestions that you should look into improving your code and design to help reduce the footprint. But regardless, you should have something like a scheduled-job architecture for this.

You let the user hit download on a search/file, and the request gets added to a queue in a database table. A db/.NET process comes along, processes these jobs, and generates a file in the proper format on the server. It may be possible to reuse the file across many users if the data is the same and you use proper naming conventions. The user should then be able to go to a download-queue page and see all the downloads he has scheduled. Once a job is complete, he will be able to download the file.

If you have a requirement that will not let you do this, please leave a comment explaining it.
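A minimal sketch of that queued-export idea in C#, assuming a hypothetical ReportExportQueue table (JobId, Status, OutputPath columns) and a hypothetical output folder; the connection string and stored procedure name are reused from the question, and the CSV writing is deliberately simplified (no quoting or escaping):

    // Background worker sketch: claims pending export jobs from a queue table,
    // streams the rows to a file with a data reader, and marks the job complete.
    // "ReportExportQueue", its columns, and the output folder are hypothetical names.
    using System;
    using System.Data;
    using System.Data.SqlClient;
    using System.IO;
    using System.Threading;

    class ExportQueueWorker
    {
        const string ConnStr = "MyConnectionString"; // placeholder from the question

        public void Run()
        {
            while (true)
            {
                int? jobId = ClaimNextJob();
                if (jobId == null) { Thread.Sleep(5000); continue; } // nothing queued; poll again later

                string path = Path.Combine(@"C:\Exports", jobId + ".csv"); // hypothetical output location
                WriteReportFile(path);
                MarkComplete(jobId.Value, path);
            }
        }

        int? ClaimNextJob()
        {
            using (var con = new SqlConnection(ConnStr))
            using (var cmd = new SqlCommand(
                @"UPDATE TOP (1) ReportExportQueue SET Status = 'Processing'
                  OUTPUT inserted.JobId WHERE Status = 'Pending'", con))
            {
                con.Open();
                object id = cmd.ExecuteScalar();
                return id == null ? (int?)null : (int)id;
            }
        }

        void WriteReportFile(string path)
        {
            // Stream rows straight from a data reader to disk so memory stays flat.
            using (var con = new SqlConnection(ConnStr))
            using (var cmd = new SqlCommand("MyStoredProc", con) { CommandType = CommandType.StoredProcedure })
            using (var writer = new StreamWriter(path))
            {
                con.Open();
                using (var reader = cmd.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        for (int i = 0; i < reader.FieldCount; i++)
                        {
                            if (i > 0) writer.Write(',');
                            writer.Write(Convert.ToString(reader[i])); // simplified: no CSV escaping
                        }
                        writer.WriteLine();
                    }
                }
            }
        }

        void MarkComplete(int jobId, string path)
        {
            using (var con = new SqlConnection(ConnStr))
            using (var cmd = new SqlCommand(
                "UPDATE ReportExportQueue SET Status = 'Complete', OutputPath = @path WHERE JobId = @id", con))
            {
                cmd.Parameters.AddWithValue("@path", path);
                cmd.Parameters.AddWithValue("@id", jobId);
                con.Open();
                cmd.ExecuteNonQuery();
            }
        }
    }

The web request then only inserts a row into the queue and later offers the finished file for download, so the request thread never holds the data in memory.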
Ok, here we go: get a data reader and write the HTML as you go, so you never keep all the data in memory. Finished.

Your approach will never scale.
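A minimal sketch of that approach in the same code-behind context as the question, reusing the "MyConnectionString" connection string and the "MyStoredProc" procedure; only one row is materialized at a time and the HTML goes straight to the response:

    // Stream the rows to the response with a SqlDataReader instead of filling a
    // DataSet and rendering a DataGrid, so the full result set is never in memory.
    using (SqlConnection con = new SqlConnection("MyConnectionString"))
    using (SqlCommand cmd = new SqlCommand("MyStoredProc", con))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        con.Open();

        Response.ClearContent();
        Response.ContentType = "application/vnd.ms-excel";
        Response.AddHeader("Content-Disposition", "attachment;filename=Report.xls");
        Response.BufferOutput = false; // push bytes to the client as they are written

        Response.Write("<table>");
        using (SqlDataReader reader = cmd.ExecuteReader())
        {
            while (reader.Read())
            {
                Response.Write("<tr>");
                for (int i = 0; i < reader.FieldCount; i++)
                {
                    Response.Write("<td>");
                    Response.Write(HttpUtility.HtmlEncode(Convert.ToString(reader[i])));
                    Response.Write("</td>");
                }
                Response.Write("</tr>");
            }
        }
        Response.Write("</table>");
        Response.End();
    }

The output is still the HTML-table-served-as-.xls trick from the question; only the way it is produced changes.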
Can you rewrite the stored procedure to be paged and loop through the dataset? Then rewrite the output portion to stream the file instead of outputting it all in one go (your current method is basically just writing out an HTML table).
Paging the data will keep the download process from storing all of that data in memory.
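A sketch of that idea, assuming a hypothetical paged procedure (MyPagedStoredProc with @PageNumber/@PageSize parameters, which are not in the original post); each page is written to the response and then discarded:

    // Loop over pages of a paged stored procedure and stream each page out,
    // so at most one page of rows is held in memory at a time.
    const int pageSize = 10000;  // illustrative page size
    int pageNumber = 1;
    bool moreRows = true;

    Response.ClearContent();
    Response.ContentType = "application/vnd.ms-excel";
    Response.AddHeader("Content-Disposition", "attachment;filename=Report.xls");
    Response.Write("<table>");

    while (moreRows)
    {
        using (SqlConnection con = new SqlConnection("MyConnectionString"))
        using (SqlCommand cmd = new SqlCommand("MyPagedStoredProc", con)) // hypothetical paged proc
        using (SqlDataAdapter da = new SqlDataAdapter(cmd))
        {
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.Parameters.AddWithValue("@PageNumber", pageNumber);
            cmd.Parameters.AddWithValue("@PageSize", pageSize);

            DataTable page = new DataTable();
            da.Fill(page);
            moreRows = page.Rows.Count == pageSize; // a short page means we reached the end

            foreach (DataRow row in page.Rows)
            {
                Response.Write("<tr>");
                foreach (object value in row.ItemArray)
                    Response.Write("<td>" + HttpUtility.HtmlEncode(Convert.ToString(value)) + "</td>");
                Response.Write("</tr>");
            }
            Response.Flush(); // send this page to the client before loading the next one
        }
        pageNumber++;
    }

    Response.Write("</table>");
    Response.End();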
Solved!

I faced the same problem when I was exporting large amounts of data to Excel.

Solution: you can use the Open XML DLL to solve your problem.

Using this DLL you can export a large amount of data to Excel, and memory consumption will also be lower.

You can get more information here:
https://msdn.microsoft.com/en-us/library/office/hh180830(v=office.14).aspx
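For reference, a minimal sketch of one streaming (SAX-style) way to do this with the Open XML SDK's OpenXmlWriter, writing one row at a time; it reuses the question's connection string and stored procedure, takes a caller-supplied output path, and treats every column as text:

    // Write an .xlsx row by row with OpenXmlWriter (DocumentFormat.OpenXml),
    // so the worksheet is streamed to disk instead of being built in memory.
    using System;
    using System.Data;
    using System.Data.SqlClient;
    using DocumentFormat.OpenXml;
    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Spreadsheet;

    static void ExportToXlsx(string outputPath) // e.g. a path under the site's export folder
    {
        using (SpreadsheetDocument doc =
            SpreadsheetDocument.Create(outputPath, SpreadsheetDocumentType.Workbook))
        {
            WorkbookPart workbookPart = doc.AddWorkbookPart();
            WorksheetPart worksheetPart = workbookPart.AddNewPart<WorksheetPart>();

            // Stream the worksheet part one row at a time.
            using (OpenXmlWriter writer = OpenXmlWriter.Create(worksheetPart))
            using (SqlConnection con = new SqlConnection("MyConnectionString"))
            using (SqlCommand cmd = new SqlCommand("MyStoredProc", con))
            {
                cmd.CommandType = CommandType.StoredProcedure;
                con.Open();

                writer.WriteStartElement(new Worksheet());
                writer.WriteStartElement(new SheetData());

                using (SqlDataReader reader = cmd.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        writer.WriteStartElement(new Row());
                        for (int i = 0; i < reader.FieldCount; i++)
                        {
                            string text = Convert.ToString(reader[i]);
                            writer.WriteElement(new Cell(new InlineString(new Text(text)))
                            {
                                DataType = CellValues.InlineString
                            });
                        }
                        writer.WriteEndElement(); // Row
                    }
                }

                writer.WriteEndElement(); // SheetData
                writer.WriteEndElement(); // Worksheet
            }

            // Register the streamed worksheet in the workbook.
            using (OpenXmlWriter writer = OpenXmlWriter.Create(workbookPart))
            {
                writer.WriteStartElement(new Workbook());
                writer.WriteStartElement(new Sheets());
                writer.WriteElement(new Sheet
                {
                    Name = "Report",
                    SheetId = 1U,
                    Id = workbookPart.GetIdOfPart(worksheetPart)
                });
                writer.WriteEndElement(); // Sheets
                writer.WriteEndElement(); // Workbook
            }
        }
    }

Unlike the HTML-as-.xls trick in the question, this produces a real .xlsx file, so Excel will not warn about a mismatched file format.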