SQL Server 与 Access 插入性能,特别是在使用 GUID 时

发布于 2024-07-16 10:06:16 字数 1405 浏览 2 评论 0 原文

我很想知道在使用 Access 2007 作为 SQL Server 2008 的前端时使用顺序 GUID 时如何提高 SQL Server 的性能(请注意,这是我唯一感兴趣的上下文)。

我做了一些测试(并得到了一些相当令人惊讶的结果,特别是在使用 sequential GUID 时来自 SQL Server 的结果:插入性能下降得非常非常快,而且对我来说,下降得这么快是不对的基本上

测试如下:

从 Access 前端,仅使用 VBA,批量插入 100,000 条记录,每批 1000 条, 依次。

  • 我尝试使用身份和连续 GUID 作为 PK。
  • 我在SQL Server 2008 Standard(没有特殊调整,只是默认安装)中尝试过它,并使用 Access 2007 数据库作为后端。 所有表格都链接回前端。

一些结果(更多,原始数据可在 我关于测试的博客文章):

很明显,随着数据库的增长,插入性能会降低,但 SQL Server 在这里的表现根本不是很好。

http://blog.nkadesign.com/wp-content/ uploads/2009/04/chart02.png

SQL Server 结果的扩展视图: http://blog.nkadesign.com/wp-content/ uploads/2009/04/chart03.png

编辑 13APR2009

我发现了一个 我的服务器配置出现问题,我更新了在我的博客上进行测试
感谢大家的回复,对我帮助很大。

I'm interested to know how I could improve the performance of SQL Server when using sequential GUID when using Access 2007 as a front end to SQL Server 2008 (please note it's the only context I'm interested in).

I have made some tests (and gotten some fairly surprising results, in particular from SQL Server when using sequential GUID: the insert performance degrades very very quickly and it doesn't seem right to degrade so quickly to me.

Basically the test is as follow:

From the Access front-end, using VBA only, insert 100,000 records in batches of 1000,
sequentially.

  • I tried it both with a Identity and a sequential GUID as the PK.
  • I tried it in SQL Server 2008 Standard (no special tweaking just default install) as and an Access 2007 database as the back-end. All tables linked back to the front-end.

Some of the results (more, with raw data available on my blog entry about the test):

It's clear that, as the database grows, the insert performance is reduced but SQL Server isn't performing very well at all here.

http://blog.nkadesign.com/wp-content/uploads/2009/04/chart02.png

Expanded view of the results for SQL Server:
http://blog.nkadesign.com/wp-content/uploads/2009/04/chart03.png

Edit 13APR2009

I've found an issue with my server configuration and I updated the tests on my blog.
Thanks to all for your replies, they helped me a lot.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

我最亲爱的 2024-07-23 10:06:16

这里有两件事在起作用。 首先,需要指出的是,对于特定的用例,SQL 不一定能开箱即用地工作得很好。 它是一款专业产品,旨在由知道自己在做什么的人进行调整。

相比之下,Access 的设计可以很好地满足大多数用例,无需任何配置。 第二点涵盖了这种权衡的缺点:

SQL Server 是为可扩展性而设计的。 请注意,只有 100,000 条记录时 Access 的性能如何严重下降。 在 100 万之前,它可能会急剧下降到 SQL 线以下。 相比之下,SQL Server 几乎完全稳定,在大约 45,000 条记录后变化稳定下来,并将继续保持数百万条记录。

编辑我认为这里可能还有其他我们没有看到的东西在起作用。 我认为你的 SQL 数字看起来很糟糕,所以我自己进行了测试。 在运行 Windows Vista 3.6 ghz 和 2 GB RAM 的桌面上,在 SQL Server 上执行带有顺序 GUID 的插入:

  • 平均每秒 1382 次插入,0 条记录

  • 500k 记录时平均每秒 1426 次插入

  • 从 0 到 500k 每秒平均插入 1609.6 次,平均下限为 992 次插入/秒,平均上限为 1989 次插入/秒。

因此,考虑到在正在使用的桌面上运行此程序所产生的正常差异,我认为 SQL Server 插入基本上从 0 条记录线性扩展至 50 万条记录。 在专用的、经过调整的服务器上,我希望获得更高的一致性(更不用说更好的性能):

Excel 图表,每秒插入次数 http://img24.imageshack.us/img24/9485/insertspersecond.jpg

There's two things at play here. First, it's important to point out that SQL doesn't necessarily work very well, for a specific use case, out of the box. It is a professional product designed to be tuned by a person who knows what they're doing.

By comparison, Access is designed to work very well for most use cases without any configuration. The downside of this trade-off is covered in the second point:

SQL Server is designed for scalability. Notice how Access severely degrades with only 100,000 records. It would probably drop very steeply below SQL's line before a million. By comparison, SQL server holds almost perfectly steady, with the variation stabilizing after about 45,000 records and will continue to hold at many millions.

Edit I think there also may be something else at play here we're not seeing. I thought your SQL numbers looked bad, so I ran a test of my own. On my desktop running Windows Vista 3.6 ghz and 2gb of RAM, inserts with sequential GUID on SQL Server performed:

  • Average of 1382 inserts per second at 0 records

  • Average of 1426 inserts per second at 500k records

  • Averaging 1609.6 inserts per second from 0 to 500k with an average floor of 992 inserts/sec and an average ceiling of 1989 inserts/sec.

So accounting for the normal variance incurred by running this on an in-use desktop, I'd say SQL Server inserts basically scale linearly from 0 records to half a million. On a dedicated, tuned server I'd expect even more consistency (not to mention far better performance):

Excel chart, inserts per second http://img24.imageshack.us/img24/9485/insertspersecond.jpg

疏忽 2024-07-23 10:06:16

我的问题是您的测试设置是否代表您的应用程序的实际情况。 简而言之,您测试的东西正确吗?

您的应用程序是否会一次附加大量记录?

或者它是否会基于 SQL SELECT 追加批量记录?

如果是后者,您可能会考虑尝试在服务器端完成所有操作,特别是如果 SELECT 中的源表位于服务器上。 重要的是要认识到,使用 ODBC,批量追加将作为每行的单个插入发送到 SQL Server(与测试代码中基于记录集的方法类似)。 如果将同一进程完全移至服务器端,则可以作为批处理操作来完成。

另外,您应该使用 ADO 而不是 DAO 再次测试。 它可能会以完全不同的方式优化操作。

最后,就在上周,有人让我注意到 Andy Baron 撰写的这篇精彩文章:

优化链接到 SQL Server 的 Microsoft Office Access 应用程序

我仍在吸收那篇非常有用的文章的内容,它讨论了与非 GUID 特定主题相关的几个问题,这些问题可能会帮助您优化流程最大效率。

My question is whether your test setup represents the reality of your application or not. In short, are you testing the right thing?

Is your app going to be appending large numbers of records one at a time?

Or is it going to be appending batches of records based on a SQL SELECT?

If the latter, you might look at trying to do it all server-side, particularly if the source table(s) in the SELECT are on the server. It's important to realize that with ODBC, a batch append is going to be sent to the SQL Server as a single insert for every single row (every similar to the recordset-based approach in your test code). If you move the same process entirely server-side, it can be done as a batch operation.

Also, you should test again using ADO instead of DAO. It may optimize the operation completely differently.

Last of all, someone brought to my attention just this past week this fascinating article by Andy Baron:

Optimizing Microsoft Office Access Applications Linked to SQL Server

I'm still absorbing the contents of that very useful article, and it discusses several issues in regard to non-GUID-specific topics that may help you optimize your process for maximum efficiency.

落花随流水 2024-07-23 10:06:16

您意识到性能下降的至少一部分是日志填满了,GUID id 是什么,比 int 长 40 个字节?

但我并没有狡辩; 很高兴看到有人采取实际的衡量标准,而不仅仅是袖手旁观。 改装了。

You realize at least part of the decreasing performance is the log filling up, and that a GUID id what, 40 bytes longer than an int?

But I'm not quibbling; it's good to see someone taking actual metrics rather than just handwaving. Modded up.

淡淡绿茶香 2024-07-23 10:06:16

你从哪里获取数据?

如果您使用“Access 导出”菜单选项而不是“一次循环记录”,数字是否会改变?

VBA 对连接参数也非常敏感,并且有很多选项不一定直观。

如果标识列是可以接受的,那么为什么还要考虑顺序 GUID(这是我上次检查的 MSSQL 中的附加设施)。


EDIT:
Looking at your code and briefly reviewing the Recordset docs on MSDN, I see you may be able to use more efficient parameters. E.g. your dbSeeChanges and dbOpenDynaset, which are appropriate if you are trying to allow for other users messing with the same rows (or needing to get back the inserted IDENTITY value or probably GUID), but I don't think you need those. In essence, after every INSERT or UPDATE, you're reading the record back from the database into VBA. I'd read through those connection config settings carefully, and I bet you'll come up with something a lot more satisfactory.

Where are you getting the data from?

Does it change the numbers if you use the Access Export menu options rather than record-at-a-time-in-a-loop?

VBA is really sensitive to the connection paramters too, and there are lots of options that aren't necessarily intuitive.

If an identity column is acceptable, why are you even considering a sequential GUID (which is something of a tacked-on facility in MSSQL last I checked).


EDIT:
Looking at your code and briefly reviewing the Recordset docs on MSDN, I see you may be able to use more efficient parameters. E.g. your dbSeeChanges and dbOpenDynaset, which are appropriate if you are trying to allow for other users messing with the same rows (or needing to get back the inserted IDENTITY value or probably GUID), but I don't think you need those. In essence, after every INSERT or UPDATE, you're reading the record back from the database into VBA. I'd read through those connection config settings carefully, and I bet you'll come up with something a lot more satisfactory.

菊凝晚露 2024-07-23 10:06:16

上次我看到类似的情况(使用 GUID PK 插入速度非常慢)是因为日志文件已满。 插入性能像石头一样下降,速度相当快(没有硬测量,只是查看实时痕迹,但它看起来确实有点像对数)。 这是历史数据的预加载。
转移到身份 PK,负责实际清理日志文件,之后一切都变得更好(几个小时,第一个版本花了几个小时而且还没有完成)。

另外,想一想,是否涉及任何交易? 也许 SQL Server 事务会造成访问所没有的巨大性能损失(假设访问并不是真正适合并发访问)。

The last time I saw something like that (really slow insertion with GUID PK) was because of the log-file filling up. Insertion performance was dropping like a stone, pretty fast (no hard measurement, just looking at live traces, but it sure looked like it was kinda logarithmic). This was pre-loading of historical data.
Moved over to identity PK, took care of actually cleaning up the log file, and everything went much better afterwards (a couple of hours where the first version took several hours and was not finished).

Also, just a thought, are there any transactions involved? Maybe SQL Server transactions create a big performance hit that access does not have (given that access is not really geared towards concurrent access).

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文