Difference between selecting from a whole partition and selecting from a table with a small data set?

Posted 2024-11-09 11:54:13, 478 characters, 3 views, 0 comments

I am new to partitioning.

Would there be a difference in performance between

select * from my_partitionedData where date = '20110523'

and

select * from my_Data where date = '20110523'

where my_partitionedData is a table partitioned by date with one partition per day, my_Data is a table that holds only the data for '20110523', and both tables have the same structure?

A second question: would there be a difference in performance between these selects if all the partitions of my_partitionedData are in the same filegroup? (Note: the select is always for a single day.)


Comments (3)

浅忆流年 2024-11-16 11:54:13

Like everything else in SQL, you will need to test to be sure.

That being said, I think you should get nearly identical performance.

Behind the scenes, a partitioned table is basically a lot of smaller tables logically unioned together. If you partition by day in your partitioned table, and your non-partitioned table has only one day of data, the execution plan and performance should be pretty much identical.
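One way to run that test, as a minimal sketch: it assumes SQL Server (the question mentions filegroups) and uses the two tables from the question. Turn on I/O and timing statistics, run both queries, and compare the logical reads and elapsed times reported in the Messages output. The last query is an optional check of how rows are spread across partitions.

```sql
-- Assumption: SQL Server; table and column names are taken from the question.
-- [date] is bracketed only because "date" is also a type name.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Query against the partitioned table (should touch a single partition).
SELECT * FROM my_partitionedData WHERE [date] = '20110523';

-- Query against the single-day table.
SELECT * FROM my_Data WHERE [date] = '20110523';

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;

-- Optional: row counts per partition of the partitioned table
-- (index_id 0 = heap, 1 = clustered index).
SELECT partition_number, rows
FROM sys.partitions
WHERE object_id = OBJECT_ID('my_partitionedData')
  AND index_id IN (0, 1)
ORDER BY partition_number;
```

Comparing the actual execution plans of the two selects shows whether partition elimination took place for the first one.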

痴情换悲伤 2024-11-16 11:54:13

If both queries return the same data set, the partitioned and the non-partitioned table will return it with the same I/O. If the partitioned table is less fragmented, there will be less I/O delay from random disk-head seeks when retrieving the pages, but all in all, 100k of data is 100k of data.

You did not mention whether you were considering partitioning the index. Partitioning the index is an excellent way to reduce the number of levels that must be traversed to find the location of a data row. Partitioning both the indexes and the table with the same function is the optimal solution.
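A table and an index partitioned with the same function might look roughly like the sketch below. It assumes SQL Server and a `date`-typed column. The names `pf_daily`, `ps_daily`, `cix_my_partitionedData_date` and the `payload` column are hypothetical and not from the question. `ALL TO ([PRIMARY])` places every partition in one filegroup, which is the situation the second question asks about.

```sql
-- Hypothetical sketch: object names and column types are assumptions.
-- One boundary value per day; RANGE RIGHT puts each boundary date
-- into the partition that starts at that value.
CREATE PARTITION FUNCTION pf_daily (date)
AS RANGE RIGHT FOR VALUES ('20110522', '20110523', '20110524');

-- Map every partition to the same filegroup.
CREATE PARTITION SCHEME ps_daily
AS PARTITION pf_daily ALL TO ([PRIMARY]);

-- Table partitioned on the date column
-- (assumes the table does not already exist).
CREATE TABLE my_partitionedData (
    [date]  date         NOT NULL,
    payload varchar(100) NULL
) ON ps_daily ([date]);

-- Index created on the same scheme and column, so it is aligned
-- with the table: each partition has its own B-tree.
CREATE CLUSTERED INDEX cix_my_partitionedData_date
ON my_partitionedData ([date])
ON ps_daily ([date]);
```

With this layout, a query filtered on a single date only needs to traverse the B-tree of that one partition.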

初与友歌 2024-11-16 11:54:13

"where my_partitionedData is a table partitioned by date by 1 day and my_Data is a table which has only data for '20110523' and both tables have same structure?"

The latter will have a lower access time.

"The other question - would there be a difference in performance in running these selects if all the partitions of the my_partitionedData are in the same file group? (note - the select is always for 1 day)"

Access time will be higher in this case, even though only one day of data is selected.

Partitioning is needed to improve the scalability and manageability of large tables and of tables that have varying access patterns.

You have created two tables to store each day's records. A separate table for each day's data is the easiest design to build and understand, but such tables are not necessarily optimized for performance, scalability, and manageability, particularly as they grow larger.
