看起来我的数据仓库项目明年将迁移到 Teradata(从 SQL Server 2005)。
我正在寻找有关 Teradata 最佳实践的资源 - 从其 SQL 方言的限制到使查询良好执行的习惯用法和约定 - 特别是如果它们突出显示与 SQL Server 2005 显着不同的内容。 SQL 的艺术(更侧重于 Oracle)。
我的业务流程目前采用 T-SQL 存储过程,并且相当依赖 SQL Server 2005 功能(例如 PIVOT、UNPIVOT 和公用表表达式),每月从 4TB 数据仓库生成约 2700 万行输出。
Looks like my data warehouse project is moving to Teradata next year (from SQL Server 2005).
I'm looking for resources about best practices on Teradata - from limitations of its SQL dialect to idioms and conventions for getting queries to perform well - particularly if they highlight things which are significantly different from SQL Server 2005. Specifically tips similar to those found in The Art of SQL (which is more Oracle-focused).
My business processes are currently in T-SQL stored procedures and rely fairly heavily on SQL Server 2005 features like PIVOT, UNPIVOT, and Common Table Expressions to produce about 27m rows of output a month from a 4TB data warehouse.
发布评论
评论(4)
一个起点是:http://www.teradataforum.com/
这可能有点晚了,但根据我了解到的有关 Teradata 的情况,我可以警告您一些事情。
尽可能经常使用最新版本。
对于 V12,优化器被重写,数据库现在性能好多了。
尝试认识到 SQL Server 和 Teradata 是非常不同的野兽,大多数概念不会很好地过渡。
不要低估主要指标的重要性。
与其他数据库相比,teradata 使用的锁非常原始。
不要使用 TERA 模式。 您没有任何遗留代码,ANSI 模式要优越得多并且受到广泛鼓励。
连接索引是非常有用的工具,但它们并不能提供所有答案。
并行性,花时间了解 FASTLOAD、MULTILOAD 和 TPUMP 的工作原理,并了解如何在 ETL 策略中利用它。
如果您尝试运行需要高性能的查询,请不要使用任何强制转换,优化器将不会使用统计信息来生成最佳执行计划。
处理日期将会很痛苦,只是一个警告。
Teradata非常面向DDL,在创建表时尝试理解所有相关的语法。
压缩是一个很棒的工具,如果您有任何在表中重复的值,请使用它。
Teradata 提供的工具并不多,请做好构建大量工具的准备。 现有的工具非常昂贵。
不幸的是,我对SQL Server了解不多,所以我不能说SQL Server中的哪些工具出现在Teradata中。
希望这可以帮助
One place to start is here: http://www.teradataforum.com/
This might be a little late, but there are a few things which I can warn you about Teradata which I have learned.
Use the most recent version as often as possible.
For V12 the optimizer was re-written and the database performs much better now.
Try to realize that SQL Server and Teradata are very different beasts, most of the concepts will not transition well.
Do not underestimate the importance of a primary index.
The locks that teradata uses are very primitive when compared to other databases.
Do NOT use TERA mode. You do not have any code which is legacy, ANSI mode is far superior and is widely encouraged.
Join indexes are very helpful tools, but they do not provide all the answers.
Parallelism, take the time to understand how FASTLOAD, MULTILOAD, and TPUMP works and find out how one can leverage it with their ETL strategy.
If you are attempting to run a query which needs to be performant, do not use any casts, the optimizer will not use statistics to generate the best execution plan.
Working with dates are going to be a pain, just a warning.
Teradata is very DDL oriented, try to understand all the syntax related when creating a table.
Compression is a wonderful tool, if you have any values which are repeated in a table, make use of it.
There are not many tools available with Teradata, be prepared to build a lot. The tools that exist are very expensive.
Unfortunately, I do not know much about SQL Server, so I cannot say what tools in SQL Server appear in Teradata.
Hope this helps
我还会研究最近推出的 Teradata Developer Exchange 以及 TeradataForum 和 Teradata 主网站上的论坛。
I would also look into the recently launched Teradata Developer Exchange as well as the TeradataForum and forums on Teradata's main website.
我不知道网上有什么好的参考资料。 Teradata 有一些设计手册可供下载,但它们更多的是指导手册,而不是“最佳实践”。 在这里查看它们:http://www.info .teradata.com/DataWarehouse/eTeradata-BrowseBy.cfm?page=Teradata%20Database
或者,您需要找到友好的 Teradata 专家来交流想法。 亲自尝试 Teradata,或寻找具有 Teradata 经验的本地顾问。
Teradata 的最佳实践并不是一个引起大量讨论的话题,大多数最佳技巧往往是发现它们的人的专有知识。
对不起,
大卫·斯图尔森
萨蒂扬计算机服务
I don't know of any good references available online. Teradata has some design manuals that are available for download, but they're more instruction manuals and not "best practices" as such. check them out here: http://www.info.teradata.com/DataWarehouse/eTeradata-BrowseBy.cfm?page=Teradata%20Database
Alternatively, you need to find a friendly Teradata expert to bounce ideas off. Try Teradata themselves, or find a local consultant with Teradata experience.
Best Practices on Teradata isn't a topic that gets lots of discussions and most of the best tricks tend to be proprietary knowledge of the person/people who discovered them.
Sorry,
David Stewardson
Satyam Computer Services
在 Google 搜索“Teradata 最佳实践”时,我在列表顶部找到了 TERADATA ADVISORY GROUP SETS业务对象和 TERADATA 客户的最佳实践
编辑:正如您所指出的,这只是广告,看看您如何处理这些。 请记住,我不知道 Teradata 是什么,也看不到自己在公元 22 世纪的任何时候使用它。
中间的一个看起来很有前途,顶部有一个漂亮的长链接树
第一个链接是论坛,可以让您与合适的人取得联系。
Top of the list on a Google search for "Teradata Best Practices" gave me TERADATA ADVISORY GROUP SETS BEST PRACTICES FOR BUSINESS OBJECTS AND TERADATA CUSTOMERS
EDIT: Seeing as that's just advertising, as you've pointed out, see how you go with these. Please bear in mind that I don't have a clue what Teradata is and can't see myself using it any time this side of the 22nd century AD.
The middle one looks promising with it's nice long link tree at the top
and the first link, to the forums, should put you in touch with the right people.