Linked server performance and options
At work we have two servers. One of them runs an application a lot of people use, which has a SQL Server 2000 back end. I have been free to query it for a long time, but can't add anything to it, such as stored procedures or extra tables.
This has led to us linking a second SQL Server to the first one, and to me building up a library of stored procedures that query data from both sides over the linked server. Some of these queries are taking longer than I would like.
Can someone point me to some good articles about using linked servers? I am particularly interested in finding out what data is being transferred between the two. Usually the majority of the SQL statement could be performed remotely, with just a final join to a small local table, but I have the feeling it may be transferring the full tables.
Also, what do the linked server options do? I currently have:
- Collation Compatible: True
- Data Access: True
- Rpc: True
- Rpc Out: True
- Use Remote Collation: False
- Collation Name: (Blank)
- Connection Timeout: 0
- Query Timeout: 0
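For reference, these options can be listed and changed with the standard system procedures. A minimal sketch, assuming a linked server named LINKEDSRV (a placeholder name):

    -- List all servers known to this instance, including linked server option settings
    EXEC sp_helpserver;

    -- 'collation compatible' = true tells the optimizer it may assume the remote
    -- server uses the same character set and collation, so more comparisons and
    -- sorts can be evaluated remotely instead of pulling rows back first.
    EXEC sp_serveroption @server = N'LINKEDSRV',
                         @optname = 'collation compatible',
                         @optvalue = 'true';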
EDIT:
Just thought I would update this post: I used OPENQUERY with dynamic parameters for a while to boost performance, thanks for the tip. However, doing this can make queries messier, as you end up dealing with strings. Finally, this summer we upgraded SQL Server to 2008 and implemented live data mirroring. To be honest the OPENQUERY calls were approaching the speed of local queries for my tasks, but the mirroring has certainly made the SQL easier to deal with.
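For anyone finding this later, this is roughly the pattern meant by OPENQUERY with dynamic parameters; a minimal sketch, where LINKEDSRV, the table, and the columns are all invented for illustration. OPENQUERY only accepts a literal query string, which is why the whole statement has to be assembled as text:

    DECLARE @CustomerID int;
    DECLARE @sql nvarchar(4000);
    SET @CustomerID = 42;

    -- Build the statement, embedding the parameter value in the remote query text
    SET @sql = N'SELECT * FROM OPENQUERY(LINKEDSRV, '
             + N'''SELECT OrderID, Amount FROM dbo.Orders WHERE CustomerID = '
             + CAST(@CustomerID AS nvarchar(12)) + N''')';

    EXEC (@sql);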
Comments (9)
When you use linked servers for joins like this, it is important that the server you are immediately connected to (the "local" one) is the one holding most of the data, with the linked server providing only a small part of it. Otherwise, yes, it will pull across as much data as it needs to perform the join.
Alternatives include copying a subset of the data across to a temporary table, with as much work as possible done to slim down the results and with any pre-processing the linked server can perform, and then doing the join on the "local" side.
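A sketch of that approach, with made-up server, database, and table names; the point is to copy only a slimmed-down subset across and keep the join local:

    -- Pull only the columns and rows the report needs into a local temp table
    SELECT OrderID, CustomerID, Amount
    INTO   #RemoteOrders
    FROM   [LINKEDSRV].[RemoteDB].[dbo].[Orders]
    WHERE  OrderDate >= '20090101';

    -- Then join locally against the small local table
    SELECT r.OrderID, r.Amount, c.Region
    FROM   #RemoteOrders AS r
    JOIN   dbo.LocalCustomers AS c ON c.CustomerID = r.CustomerID;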
You may find you can easily boost performance by reversing the way you do it: connect to the server you have no control over (they'll need to create a linked server for you) and then reach your server over the link. If you need to do major work with the data that would require creating sprocs, then push the data onto your server and use your sprocs there.
In some cases, I simply had the linked server build this kind of summary nightly and push it to the local server, which then did its work with the join.
Queries involving semi-joins across a linked server tend not to be very efficient. You might be better off using OPENQUERY to populate data into a local temporary table and then work on it from there.
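For example (server and object names invented), OPENQUERY sends its inner text to the remote server verbatim, so the filtering happens there and only the result set crosses the link:

    SELECT *
    INTO   #RemoteOrders
    FROM   OPENQUERY(LINKEDSRV,
           'SELECT OrderID, CustomerID, Amount
            FROM   RemoteDB.dbo.Orders
            WHERE  OrderDate >= ''20090101''');

    -- Work on the local copy from here
    SELECT c.Region, SUM(o.Amount) AS Total
    FROM   #RemoteOrders AS o
    JOIN   dbo.LocalCustomers AS c ON c.CustomerID = o.CustomerID
    GROUP BY c.Region;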
I wrote a remote Linked Server application in SQL 2000 a couple of years ago and came across the same performance issues you describe. I ended up rewriting my stored procedures several times in order to obtain the best performance.
I used temporary tables extensively. I found that it was less expensive to retrieve large amounts of remote data into a temp table, then join to it, manipulate it, etc. Joining local to remote tables was very slow, as you describe.
Display Execution Plan and Display Estimated Execution Plan tended to help although I did not understand a lot of what I was looking at.
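If it helps, the text-mode plan output makes it a bit easier to see what is being remoted: look for the Remote Query operator and check whether your WHERE clause appears in the statement sent to the linked server. A sketch, with placeholder server and table names:

    -- Show the estimated plan without executing the query
    SET SHOWPLAN_TEXT ON;
    GO
    SELECT r.OrderID, r.Amount
    FROM   [LINKEDSRV].[RemoteDB].[dbo].[Orders] AS r
    WHERE  r.OrderDate >= '20090101';
    GO
    SET SHOWPLAN_TEXT OFF;
    GO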
I don't know if there really is an efficient way to do these queries against a remote server, because it seems like SQL Server cannot take advantage of its normal optimizations when going against a linked server. It may feel like you are transferring the entire table because, in fact, that is what is happening.
I am wondering if a replication scenario might work for you. By having the data on your local server, you should be able to write normal queries that will perform as desired.
I do not know of any good articles to point you towards. As I wrote more complicated SQL Server applications, I started to think that I needed a better understanding of how SQL Server worked underneath. To that end we bought the MS Press Inside Microsoft SQL Server 2005 series, edited by Kalen Delaney, here at work. Volume 1: The Storage Engine is definitely the place to start, but I have not gotten that far into it. Since my last few projects have not involved SQL Server, my study of it has gotten lax.
I would advise dynamic OPENQUERY calls in a cursor loop instead of linked joins.
This is the only way I've been able to replicate MS Access's linked join performance (at least for single remote tables).
Regular linked joins in MS SQL are too inefficient, pulling everything across, especially with humongous tables.
-- I would like to know what is so bad about OPENQUERY inside cursor loops? If done correctly, there are no locking issues.
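A sketch of that pattern, with invented server, table, and column names; each pass through the cursor builds and executes one small OPENQUERY against the remote server instead of one large cross-server join:

    CREATE TABLE #Orders (OrderID int, Amount money);

    DECLARE @CustomerID int;
    DECLARE @sql nvarchar(4000);

    DECLARE cust_cursor CURSOR LOCAL FAST_FORWARD FOR
        SELECT CustomerID FROM dbo.LocalCustomers WHERE Region = 'West';

    OPEN cust_cursor;
    FETCH NEXT FROM cust_cursor INTO @CustomerID;

    WHILE @@FETCH_STATUS = 0
    BEGIN
        -- One small remote query per key; the temp table is visible to the inner EXEC
        SET @sql = N'INSERT INTO #Orders (OrderID, Amount) '
                 + N'SELECT OrderID, Amount FROM OPENQUERY(LINKEDSRV, '
                 + N'''SELECT OrderID, Amount FROM RemoteDB.dbo.Orders WHERE CustomerID = '
                 + CAST(@CustomerID AS nvarchar(12)) + N''')';
        EXEC (@sql);

        FETCH NEXT FROM cust_cursor INTO @CustomerID;
    END

    CLOSE cust_cursor;
    DEALLOCATE cust_cursor;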
It's a very general problem, which may have many solutions, and as we have seen, many users say they have tried everything.
What solved my problem is this:
I upgraded SQL Server 2000 from SP2 to SP4, and if you already have SP4 on SQL Server 2000, then run Instcat.sql. In my experience this will work if you have exhausted all the other workarounds.
Thanks,
Mithalesh
[email protected]
Is there a possibility that you could set up a separate database on the server rather than using a linked server?
Dynamic SQL and a function can be used to get around the hard-coded name question. For instance, I'm trying an implementation where the function ufn_linkedDatabase(@purpose nvarchar(255)), given the input 'cpi.cpi' (purpose CPI, sub-purpose default), returns
'[SERVER-NAME.DOMAIN.LCL,2000].[CPI]' in the production environment (where we use an alternate port number for SQL Server, I don't know why, which is included in the linked server name). Then a SQL command is assembled in @template varchar(max), with the expression @{cpi.cpi} representing the linked server and database, and then
@workstring = REPLACE(@template, N'@{cpi.cpi}', ...). How the function actually gets the database name is separate from the procedures; a lookup table is nice.
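A sketch of that idea using the names from this answer; the CASE body, the return type, and dbo.SomeTable are stand-ins (in practice the function would read a lookup table):

    -- Maps a 'purpose' token to the linked-server-and-database prefix
    CREATE FUNCTION dbo.ufn_linkedDatabase (@purpose nvarchar(255))
    RETURNS nvarchar(255)
    AS
    BEGIN
        RETURN CASE @purpose
                   WHEN N'cpi.cpi' THEN N'[SERVER-NAME.DOMAIN.LCL,2000].[CPI]'
                   ELSE NULL
               END;
    END;
    GO

    -- Assemble the command from a template, then swap the token for the real prefix
    DECLARE @template varchar(max), @workstring varchar(max);
    SET @template   = 'SELECT TOP 10 * FROM @{cpi.cpi}.dbo.SomeTable;';
    SET @workstring = REPLACE(@template, N'@{cpi.cpi}',
                              dbo.ufn_linkedDatabase(N'cpi.cpi'));
    EXEC (@workstring);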
Issues: to use OPENQUERY(), which is probably still better unless the linked server option "collation compatible" is set to "true" so that more of the task can be executed on the linked server (important even on a fast network, and our server room's internal network is respectably fast), I would probably need to handle 'cpi.cpi.server', 'cpi.cpi.database', and 'cpi.cpi.server.database' separately. And I may end up writing exactly one application using this design, in which case it's over-designed. Still, that means the function itself doesn't have to be any kind of fancy work.
Throwing fast network hardware at the problem may be the cheaper answer, anyway.
Avoid joins to linked server tables.
Four-part naming can be used for your join, but it is more expensive. Your join can contain criteria that limit the data set coming from the linked server and make use of the indexed columns.
Example:
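A sketch of the kind of query meant here, with invented server, table, and column names:

    SELECT l.CustomerID, r.OrderID, r.Amount
    FROM   dbo.LocalCustomers AS l
    JOIN   [LINKEDSRV].[RemoteDB].[dbo].[Orders] AS r
           ON  r.CustomerID = l.CustomerID
           AND r.OrderDate >= '20090101'  -- criterion on an indexed remote column
    WHERE  l.Region = 'West';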
This query also applies a criterion in the join that the linked server can use before the join is calculated.
The recommended method is the use of OPENQUERY.
By avoiding the join through the use of OPENQUERY, the local server only sends the query to be executed remotely, instead of sending a set of IDs for the join.
Use the link to retrieve a set of data and perform the calculations locally. Either use a temporary table (for ad hoc queries) or insert the rows into a permanent table in a nightly job.
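A sketch of the nightly-job variant (object names invented); the whole inner query runs on the remote server and only the finished result set crosses the link:

    -- Refresh a local reporting table from the linked server once a night
    TRUNCATE TABLE dbo.Staging_RemoteOrders;

    INSERT INTO dbo.Staging_RemoteOrders (CustomerID, OrderCount, TotalAmount)
    SELECT CustomerID, OrderCount, TotalAmount
    FROM   OPENQUERY(LINKEDSRV,
           'SELECT CustomerID, COUNT(*) AS OrderCount, SUM(Amount) AS TotalAmount
            FROM   RemoteDB.dbo.Orders
            GROUP BY CustomerID');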
Beginning a transaction may fail depending on whether the remote transaction coordinator is configured for the linked server. Using it will consume more resources.
Also consider that you are hitting a production server running an application; while you do not specify it, I think it is safe to assume it handles heavy transaction volumes and does inserts and updates. You are taking resources away from the application.
Your purpose appears to be using the data for reporting. Your server can be set to the simple recovery model instead of full, making it more efficient.
You will also avoid having your queries canceled due to data movement on the linked server. Always be mindful of setting the proper isolation level for your queries, and of table hints such as NOLOCK.
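For example (local table name invented), either hint the individual table or set the level for the batch:

    -- Read without taking shared locks; fine for reporting where
    -- the occasional dirty read does not matter
    SELECT OrderID, Amount
    FROM   dbo.Staging_RemoteOrders WITH (NOLOCK);

    -- Or set it once for the whole batch
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;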
And PLEASE! Never place an OPENQUERY (or any linked server query) inside a loop!
Royal pain
We used to have several linked servers at our shop and it turned out to be such a PITA.
First of all, there were severe performance problems similar to what you describe. I was shocked when I saw the network I/O stats. Despite all efforts, we failed to hint SQL Server into reasonable behavior.
Another problem was that stored procs had these linked server names hardcoded everywhere, with no way to override them. So developers couldn't easily test on their development sandboxes any functionality that touched linked servers. This was a major obstacle for creating a universally usable unit-test suite.
In the end we ditched linked servers completely and moved data synchronization to web-services.