Difference between selecting from a whole partition and selecting from a table with a small data set?
I am new to partitioning.
Would there be a difference in performance between
select * from my_partitionedData where date = '20110523'
and
select * from my_Data where date = '20110523'
where my_partitionedData is a table partitioned by date, one partition per day, my_Data is a table that holds only the data for '20110523', and both tables have the same structure?
My other question: would there be a difference in performance between these selects if all the partitions of my_partitionedData were in the same filegroup? (Note: the select is always for a single day.)
Comments (3)
Like everything else in SQL, you will need to test to be sure.
That being said, I think you should get identical performance.
Behind the scenes, a partitioned table is basically a lot of smaller tables logically unioned together. If you are partitioning by day in your partitioned table, and your non-partitioned table has only one day of data, the execution plan and performance should be pretty much identical.
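One way to run the test suggested above is to compare the I/O and timing statistics of the two queries side by side. This is a minimal sketch, assuming SQL Server (the question mentions filegroups) and that both tables from the question already exist; the reported numbers depend on your data and indexes.

```sql
-- Assumption: SQL Server, with my_partitionedData and my_Data already created.
-- Report logical/physical reads and CPU/elapsed time for each statement.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Query against the table partitioned by day.
SELECT * FROM my_partitionedData WHERE [date] = '20110523';

-- Query against the table that holds only one day of data.
SELECT * FROM my_Data WHERE [date] = '20110523';

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Comparing the logical reads in the Messages output, along with the actual execution plans, should show whether the partitioned query touches only the one partition it needs.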
If both return the same data set, a partitioned and a non-partitioned table will return the data with the same I/O. If the partitioned table has less fragmentation, there would be less I/O delay from random disk-head seeks to retrieve the pages, but all in all, 100k of data is 100k of data.
You did not mention whether you are considering partitioning the index. Partitioning an index is an excellent way to reduce the number of levels that must be traversed to find the location of a data row. Partitioning indexes and tables with the same function is the optimal solution.
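As a rough illustration of partitioning the table and its index with the same function, here is a sketch assuming SQL Server. The names pf_daily, ps_daily, and ix_my_partitionedData_date, the payload column, the boundary values, and the date column type are all hypothetical, since the question does not show the table definition. All partitions are mapped to a single filegroup, as in the second part of the question.

```sql
-- Hypothetical names and column types; adjust to the real schema.
-- One boundary value per day; RANGE RIGHT puts each boundary date
-- into the partition that starts at that boundary.
CREATE PARTITION FUNCTION pf_daily (date)
AS RANGE RIGHT FOR VALUES ('20110522', '20110523', '20110524');

-- Map every partition to the same filegroup.
CREATE PARTITION SCHEME ps_daily
AS PARTITION pf_daily ALL TO ([PRIMARY]);

-- The table is partitioned on its date column.
CREATE TABLE my_partitionedData (
    [date]  date         NOT NULL,
    payload varchar(100) NULL      -- placeholder for the real columns
) ON ps_daily ([date]);

-- The index uses the same scheme (and so the same function),
-- which keeps it aligned with the table.
CREATE CLUSTERED INDEX ix_my_partitionedData_date
ON my_partitionedData ([date])
ON ps_daily ([date]);

-- Show which partition a given day maps to.
SELECT $PARTITION.pf_daily(CAST('20110523' AS date)) AS partition_number;
```

Because the index is built on the same scheme as the table, each day's rows and their index entries live in the same partition, so a query for a single day only has to navigate that partition's index.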
The latter query will have the lower access time.
Access time will be higher for the partitioned table, even though only one day of data is selected.
Partitioning is needed to improve the scalability and manageability of large tables and of tables that have varying access patterns.
Creating a separate table for each day's records is the easiest approach to design and understand, but such tables are not necessarily optimized for performance, scalability, and manageability, particularly as they grow larger.