MDX performance vs. T-SQL
I have a database containing tables with more than 600 million records and a set of stored procedures that make complex search operations on the database.
The performance of the stored procedures is so slow even with suitable indexes on the tables.
The design of the database is a normal relational db design.
I want to change the database design to be multidimensional and use MDX queries instead of the traditional T-SQL queries, but the question is:
Are MDX queries better than traditional T-SQL queries with regard to performance?
And if so, to what extent will that improve the performance of the queries?
Thanks for any help.
Comments (4)
Apples and oranges: an Analysis Services OLAP cube is a fundamentally different type of storage than a SQL Server database, and the two are designed to do different things. Technically MDX is not "faster" than T-SQL, or vice versa -- they are just languages, but designed for different needs.
Having said that, a cube is usually what works best for numeric analysis of static data, such as aggregating large numbers of sales/transactions/whatever records over time. In contrast, a traditional relational database generally works just fine for search, if the schema and indexes are well constructed. A simple way to judge: if your SQL queries have to do a lot of aggregation (sum() and group by, which is what a cube is designed for), then a cube may help. OTOH if your queries do a lot of searching for and returning individual records, then a cube probably will not help, and I would focus instead on tuning the schema, the queries and indexing, and perhaps table partitioning if the data can be suitably partitioned.
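To make the two query shapes concrete, here is a minimal sketch. It uses SQLite through Python's `sqlite3` module purely as a stand-in for SQL Server, and the `sales` table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales (region, customer, amount) VALUES (?, ?, ?)",
    [
        ("north", "acme", 100.0),
        ("north", "globex", 50.0),
        ("south", "acme", 25.0),
        ("south", "initech", 75.0),
    ],
)

# Aggregate shape: reads many rows, returns a few totals. This is cube territory.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

# Search shape: filters on predicates and returns individual rows.
# Schema, query, and index tuning matter more here than a cube would.
matches = conn.execute(
    "SELECT id, region, amount FROM sales WHERE customer = ? AND amount > ? ORDER BY id",
    ("acme", 30),
).fetchall()

print(totals)   # [('north', 150.0), ('south', 100.0)]
print(matches)  # [(1, 'north', 100.0)]
```

If most of the slow stored procedures look like the second query, the cube is unlikely to be the fix.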
Do you have a clustered index and covering non-clustered indexes that match the queries?
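As a rough illustration of what "covering" means, the sketch below compares the query plan before and after adding an index that contains every column the query touches. It again uses SQLite as a stand-in: SQLite has no `INCLUDE` clause or clustered/non-clustered distinction like SQL Server, so all covered columns go into the index key here. The `orders` table and index name are hypothetical.

```python
import sqlite3

def plan(conn, sql, params=()):
    """Return the EXPLAIN QUERY PLAN detail text for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the plan detail

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, "
    "order_date TEXT, amount REAL, notes TEXT)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, order_date, amount, notes) VALUES (?, ?, ?, ?)",
    [(i % 100, "2010-01-%02d" % (i % 28 + 1), float(i), "n/a") for i in range(1000)],
)

query = "SELECT order_date, amount FROM orders WHERE customer_id = ?"

# Without a matching index the engine has to scan the whole table.
plan_before = plan(conn, query, (42,))

# The index holds the filter column plus every selected column,
# so the query can be answered from the index alone.
conn.execute(
    "CREATE INDEX ix_orders_customer ON orders (customer_id, order_date, amount)"
)
plan_after = plan(conn, query, (42,))

print(plan_before)
print(plan_after)
```

On SQL Server the equivalent check would be to look at the actual execution plan for each slow query and see whether it seeks on an index or scans the table.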
An MS SSAS OLAP cube can be used in several storage modes:
Relational OLAP (ROLAP) - the data and metadata stay in your DB and a few more materialized views are added. May or may not be faster.
Hybrid OLAP (HOLAP) - metadata and (pre-calculated) aggregations are stored on a new server running an SSAS instance. This should speed up all the queries using aggregations, like "total employee hours for last year by month", but queries which drill through to specific records may be as slow as before.
Multi-dimensional OLAP (MOLAP) - all your data plus metadata and aggregations are copied to the SSAS server. This is usually the fastest, but duplicates storage.
Before starting this, you should consider optimizing your table layout for reporting and analytics; in other words, use a data warehouse (DW) and put your data in Kimball-style star schema dimension and fact tables. Then you load the DW periodically using ETL (SSIS) and point your reporting and analytics at the DW. It may be that you do not need to use SSAS at all -- SQL queries running against a star table layout are usually considerably faster than against a normalized operational database.
If this is still too slow, build SSAS cubes on top of the DW.
Once you start loading your DW, you may be able to remove records from your operational database, making it faster for everyday use.
To summarize, my rule of thumb would be:
1. Build a DW and set up your ETL process.
2. Try T-SQL reports against the DW; it may be good enough.
3. If still slow, build SSAS cubes (on top of the DW) in HOLAP mode and use MDX to query them.
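Step 2 can be sketched as follows: a tiny star layout with one fact table and two dimension tables, queried with plain SQL for the "total employee hours for last year by month" example. SQLite stands in for SQL Server, and all table names, columns, and rows (`dim_date`, `dim_employee`, `fact_hours`) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: descriptive attributes to filter and group by.
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
    CREATE TABLE dim_employee (employee_key INTEGER PRIMARY KEY, name TEXT);
    -- Fact table: one row per measurement, keyed by the dimensions.
    CREATE TABLE fact_hours (
        date_key INTEGER REFERENCES dim_date (date_key),
        employee_key INTEGER REFERENCES dim_employee (employee_key),
        hours REAL
    );
    """
)
conn.executemany(
    "INSERT INTO dim_date VALUES (?, ?, ?)",
    [(20090115, 2009, 1), (20090120, 2009, 1), (20090210, 2009, 2), (20100105, 2010, 1)],
)
conn.executemany("INSERT INTO dim_employee VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.executemany(
    "INSERT INTO fact_hours VALUES (?, ?, ?)",
    [(20090115, 1, 8.0), (20090120, 2, 6.0), (20090210, 1, 7.5), (20100105, 2, 8.0)],
)

# Total employee hours by month for one year: a single join from the
# fact table to the date dimension, then a GROUP BY.
hours_by_month = conn.execute(
    """
    SELECT d.month, SUM(f.hours)
    FROM fact_hours AS f
    JOIN dim_date AS d ON d.date_key = f.date_key
    WHERE d.year = ?
    GROUP BY d.month
    ORDER BY d.month
    """,
    (2009,),
).fetchall()

print(hours_by_month)  # [(1, 14.0), (2, 7.5)]
```

In a real DW the ETL job would populate these tables from the operational database on a schedule; the inserts above only stand in for that load.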
"The performance of the stored procedures is so slow even with suitable indexes"
I'd be surprised if the stored procedure is the real problem. Maybe the way the procedures are used is slow, but a stored procedure by definition doesn't make things slow. Have you found out what about your procedures is slow? Have you profiled them? I would take a deep, long look at that route before redesigning my database. Multi-dimensional databases are for OLAP. Is your database strictly an OLAP database, or is it a hybrid of OLAP and OLTP? Maybe you need to de-normalize and replicate data from your OLTP design into a de-normalized structure? 600 million records in a table is not by any means huge. It's not small, but that doesn't lead me to believe that dropping stored procedures will magically make things fast. Profile your stored procs and see where the performance bottlenecks are before jumping into a bigger project to fix the issue.
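One simple way to start profiling is to time each statement of a procedure separately and rank them. The sketch below does that with a hypothetical helper, `time_steps`, against SQLite as a stand-in; the `events` table and the two steps are made up for illustration. On SQL Server itself, `SET STATISTICS TIME ON`, `SET STATISTICS IO ON`, and the actual execution plan give the same kind of per-statement breakdown.

```python
import sqlite3
import time

def time_steps(conn, steps):
    """Run each (label, sql) step and return (label, seconds), slowest first."""
    timings = []
    for label, sql in steps:
        start = time.perf_counter()
        conn.execute(sql).fetchall()  # fetch all rows so the full cost is measured
        timings.append((label, time.perf_counter() - start))
    return sorted(timings, key=lambda item: item[1], reverse=True)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, kind TEXT, payload TEXT)")
conn.executemany(
    "INSERT INTO events (kind, payload) VALUES (?, ?)",
    [("k%d" % (i % 10), "payload %d" % i) for i in range(20000)],
)

# Hypothetical statements taken from one slow procedure.
steps = [
    ("lookup by key", "SELECT * FROM events WHERE id = 12345"),
    ("unindexed text search", "SELECT * FROM events WHERE payload LIKE '%999%'"),
]

for label, seconds in time_steps(conn, steps):
    print("%-25s %.6f s" % (label, seconds))
```

Whichever step dominates the total is the one to tune first, before considering a redesign.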
Have you considered PowerPivot (the Excel add-on)? It uses vertical compression to compress data about 95%, locally, so you can analyze to your heart's content.
http://technet.microsoft.com/en-us/library/ee210692.aspx