Data caching techniques/tips/AppFabric
We have millions and millions of records in a SQL table, and we run really complex analytics on that data to generate reports.
As the table is growing and additional records are being added, the computation time is increasing and the user has to wait a long time before the webpage loads.
We were thinking of using a distributed cache like AppFabric to load the data in memory when the application loads and then running our reports off that data in memory. This should improve the response time a little since now data is in memory vs disk.
Before we take the plunge and implement this, I wanted to find out what others are doing and what the best techniques and practices are for loading data into memory, caching, etc. Surely you don't just load an entire table with hundreds of millions of records into memory?
I was also looking into OLAP / data warehousing, which might give us better performance than caching.
Comments (3)
The solution to complex reporting is to pre-calculate, so you're on the right path if you're looking at OLAP.
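As a rough illustration of what pre-calculating can look like, here is a minimal sketch using Python's built-in sqlite3 module. The `orders` table, its columns, and the `daily_summary` table are hypothetical names invented for this example, not anything from the question. The idea is that the expensive aggregation runs on a schedule and the web page only reads the small summary table.

```python
import sqlite3


def build_daily_summary(conn):
    """Recompute the summary table from the raw rows (run on a schedule, not per request)."""
    conn.execute("DROP TABLE IF EXISTS daily_summary")
    conn.execute(
        """
        CREATE TABLE daily_summary AS
        SELECT day, region, COUNT(*) AS order_count, SUM(amount) AS total_amount
        FROM orders
        GROUP BY day, region
        """
    )
    conn.commit()


def report_for_day(conn, day):
    """Serve the report from the summary table instead of scanning the raw rows."""
    cur = conn.execute(
        "SELECT region, order_count, total_amount "
        "FROM daily_summary WHERE day = ? ORDER BY region",
        (day,),
    )
    return cur.fetchall()


def demo_summary():
    # Hypothetical sample data standing in for the real multi-million-row table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (day TEXT, region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            ("2024-01-01", "east", 10.0),
            ("2024-01-01", "east", 5.0),
            ("2024-01-01", "west", 7.5),
            ("2024-01-02", "east", 1.0),
        ],
    )
    build_daily_summary(conn)
    return report_for_day(conn, "2024-01-01")
```

An OLAP cube or warehouse fact table does the same thing at larger scale: the aggregation cost is paid once when data is loaded rather than on every page view.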
Have you considered partitioning your database? We do this for our largest databases.
Having said that, using the AppFabric cache correctly will greatly improve performance for most IO-heavy applications.
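One common way to use a cache "correctly" is the cache-aside pattern: cache the computed report result with an expiry, not the raw table. The sketch below is an assumption-laden illustration in Python; an in-process dictionary stands in for a distributed cache, and `ReportCache` and `get_or_compute` are hypothetical names, not the AppFabric API.

```python
import time


class ReportCache:
    """Cache-aside with a time-to-live; a dict stands in for the distributed cache."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}   # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: skip the expensive query
        value = compute()    # cache miss or expired: run the report once
        self._entries[key] = (now + self._ttl, value)
        return value
```

With this shape only the first request after expiry pays the full computation cost, and the amount of data held in memory is the size of the report, not the size of the table.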
Bad policy. Flat files are better.
In some cases, you'd be happier loading relevant subsets into SQL.
That's the consequence of using a database for too much. Use it for less.
Perhaps. Flat files, however, are faster and more scalable than an RDBMS.
Good plan. Buy Kimball's book immediately. You don't need more technology. You only need to make better use of flat files as primary and SQL as a place for ad-hoc queries (against subsets) for users.
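To make the "flat files as primary, SQL for subsets" idea concrete, here is a minimal sketch in Python. The file layout (a CSV with `day`, `region`, `amount` columns), the `sales` table, and the function names are all hypothetical, chosen only for illustration. It streams the flat file and copies just one region's rows into a SQL table for ad-hoc querying.

```python
import csv
import io
import sqlite3


def load_subset(flat_file, conn, region):
    """Stream the flat file and copy only the rows for one region into SQL."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (day TEXT, region TEXT, amount REAL)"
    )
    reader = csv.DictReader(flat_file)
    # Generator: rows are filtered and converted one at a time, never all in memory.
    rows = (
        (r["day"], r["region"], float(r["amount"]))
        for r in reader
        if r["region"] == region
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


def demo_subset():
    # An in-memory string stands in for a large file on disk.
    data = io.StringIO(
        "day,region,amount\n"
        "2024-01-01,east,10\n"
        "2024-01-01,west,7.5\n"
        "2024-01-02,east,1\n"
    )
    conn = sqlite3.connect(":memory:")
    load_subset(data, conn, "east")
    return conn.execute("SELECT COUNT(*), SUM(amount) FROM sales").fetchone()
```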