Polybase元数据查询

发布于 2025-01-17 01:05:05 字数 1382 浏览 1 评论 0原文

所有,

Polybase 都会每 1 分钟 +/- 运行一次此查询或查询的变体。这显示了它从中获取数据的源系统。虽然通常非常快,但如果在维护期间重建任何聚集索引,则会阻止此查询。

这似乎不是外部表调用的一部分,因为我已停止运行该作业,并且我们仍然看到这一点。有趣的是,它似乎只在启用了分区的表上调用(或者至少我没有捕获对未分区的表的查询)。

问题:

  • 有人对 Polybase 在这里做什么有想法吗?
  • 为什么需要每 1 分钟左右运行一次这个查询?
  • 有没有办法在索引维护等过程中防止/减慢调用速度?
  • 终止源系统上的进程有问题吗?

感谢帮助和意见。

SELECT PARTITION_FUNCTIONS.name
    ,QUOTENAME(COLUMNS.name)
    ,cast(PARTITIONS.partition_number AS NVARCHAR(10))
FROM "<DBName>".sys.indexes AS INDEXES
    ,"<DBName>".sys.partitions AS PARTITIONS
    ,"<DBName>".sys.index_columns AS INDEX_COLUMNS
    ,"<DBName>".sys.columns AS COLUMNS
    ,"<DBName>".sys.partition_functions AS PARTITION_FUNCTIONS
    ,"<DBName>".sys.partition_schemes AS PARTITION_SCHEMES
WHERE INDEXES.object_id = object_id(@0)
    AND INDEXES.type IN (@1,@2,@3)
    AND INDEX_COLUMNS.partition_ordinal = @4
    AND INDEXES.object_id = PARTITIONS.object_id
    AND INDEXES.index_id = PARTITIONS.index_id
    AND INDEXES.object_id = INDEX_COLUMNS.object_id
    AND INDEXES.index_id = INDEX_COLUMNS.index_id
    AND INDEXES.data_space_id = PARTITION_SCHEMES.data_space_id
    AND PARTITION_SCHEMES.function_id = PARTITION_FUNCTIONS.function_id
    AND INDEXES.object_id = COLUMNS.object_id
    AND INDEX_COLUMNS.column_id = COLUMNS.column_id```

All,

Polybase keeps running this query, or variation of, every 1 minute +/-. This shows on the source system it's getting it's data from. While usually very fast, if any clustered index is rebuilt during maintenance is blocks this query.

This doesn't appear to be part of the external table call as I've stopped the job running that and we are still seeing this. Interestingly, it appears to only be called on tables with partitioning enabled (or at least I haven't captured the query on tables that aren't partitioned).

Question:

  • Anyone have thoughts on what polybase is doing here?
  • Why it needs to run this query every 1 minute or so?
  • Is there a way to prevent/slowdown the calls during stuff like index maintenance?
  • Issues with killing the process on the source system?

Appreciate the help and input.

SELECT PARTITION_FUNCTIONS.name
    ,QUOTENAME(COLUMNS.name)
    ,cast(PARTITIONS.partition_number AS NVARCHAR(10))
FROM "<DBName>".sys.indexes AS INDEXES
    ,"<DBName>".sys.partitions AS PARTITIONS
    ,"<DBName>".sys.index_columns AS INDEX_COLUMNS
    ,"<DBName>".sys.columns AS COLUMNS
    ,"<DBName>".sys.partition_functions AS PARTITION_FUNCTIONS
    ,"<DBName>".sys.partition_schemes AS PARTITION_SCHEMES
WHERE INDEXES.object_id = object_id(@0)
    AND INDEXES.type IN (@1,@2,@3)
    AND INDEX_COLUMNS.partition_ordinal = @4
    AND INDEXES.object_id = PARTITIONS.object_id
    AND INDEXES.index_id = PARTITIONS.index_id
    AND INDEXES.object_id = INDEX_COLUMNS.object_id
    AND INDEXES.index_id = INDEX_COLUMNS.index_id
    AND INDEXES.data_space_id = PARTITION_SCHEMES.data_space_id
    AND PARTITION_SCHEMES.function_id = PARTITION_FUNCTIONS.function_id
    AND INDEXES.object_id = COLUMNS.object_id
    AND INDEX_COLUMNS.column_id = COLUMNS.column_id```

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

铁憨憨 2025-01-24 01:05:05

查询外部表元数据来识别字段、数据类型和长度是很正常的,但 Polybase 只有两次这样做,一次是在外部表创建时,一次是在查询数据时。如果您每分钟都看到它有变化,请检查:

  • 是否在一个循环中删除和创建外部表。
  • 如果是循环查询外部表。
  • 如果它是横向扩展组的一部分,并且其他节点是否正在查询外部表。
    如果您无法确定在何处触发它,则应在不使用时删除外部表,以避免索引维护问题,这比创建一个进程来无限期终止这些会话更好。
    您可以在我的书“Polybase 数据虚拟化实践”中找到有关 Polybase 如何工作的更多信息。

It is normal to query the external table metadata to identify the fields, data types and lengths, but there are only two times when Polybase does this, at the external table creation, and when querying the data. If you're seeing it every minute with variations, check:

  • If you're dropping and creating the external table in a cycle.
  • If you're querying the external table in a cycle.
  • If it is part of a scale-out group, and if other nodes are querying the external table.
    If you can't identify where it is being triggered, you should drop your external table while not in use to avoid issues with the index maintenance, better than creating a process to kill these sessions indefinitely.
    You can find additional information about how Polybase works in my book "Hands-on data virtualization with Polybase".
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文