Which hurts Access database performance more: thousands of tables or millions of records?
We're using an Access database as the back-end to our software product. The program has been alpha/beta tested at a company for about 2 years now, and we've noted that one of our tables has been filled with over a hundred thousand records in that time. This is probably not an example of the heaviest usage our product will endure, and we're concerned about performance 5-10 years down the road.
There's a logical way for us to break this enormous table down into several thousand tables containing a few hundred records each, but I don't think this solution is likely to help with any possible slowdown, as the database will become bloated with tables instead of data (though I have no formal training in databases, so what do I know).
I was hoping someone more informed than I am might provide some insight into whether we're likely to see a significant slowdown and, if so, which solution is likely to keep performance better in the long term?
5 Answers
Databases are generally optimized to deal with a high number of rows; the question is, will you be able to maintain thousands of almost-identical tables? (Few can; it is complicated to code against.)
First of all, test the possible scenarios. I'm not familiar with your data, so I can't tell you whether millions of rows will be too much for the db (after all, this is MS Access, not a real database).
If you find that you have problems with table size, and your data can be divided into rarely used (older?) and recent records, I'd suggest splitting the table in two: table and table_archived (which holds the less frequently used/older records). This could be a reasonable compromise between table size and manageability.
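To make that concrete, a minimal sketch in Access/Jet SQL of moving older rows into an archive table might look like the following; the table, column, and cutoff date (Orders, Orders_Archived, OrderDate, 1 Jan 2015) are hypothetical placeholders, not names from the question:

    INSERT INTO Orders_Archived
    SELECT *
    FROM Orders
    WHERE OrderDate < #1/1/2015#;

    DELETE
    FROM Orders
    WHERE OrderDate < #1/1/2015#;

Run the append query first and verify the row counts, then run the delete, and compact and repair the database afterwards so the freed space is actually reclaimed.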
The question is a schema question, and if the table partitioning you're contemplating is not a natural fit for the actual data, it's going to exacerbate performance problems, not ameliorate them. In regard to the 2GB file size limitation, it likely doesn't matter how you slice and dice the data -- if you're approaching that limit (within 50% of it, I'd say), you really need to have an upsizing path in mind.
On the question of a Jet/ACE data store, I would say that any app that has tables with hundreds of thousands of records is already one that should be evaluated for upsizing. If it's possible/likely to have millions of records, I'd say it's a no-brainer -- upsize.
This is not because of any inadequacy of Jet/ACE, just because as requirements change, the appropriate technology changes. A married couple might find a Mini Cooper fine when they get married, and it might accommodate their first child just fine, but if they are contemplating a couple more children, they should seriously consider getting a larger car -- not because there's something wrong with the Mini Cooper, but because they've outgrown what it's best for.
Sounds like overkill to break down the table that much, but horizontal partitioning is a very sound performance optimization strategy that is used in many database platforms.
With MS Access you aren't going to see much performance degradation on reads from a well-designed database, even with millions of rows. Having a lot of tables won't cause you many performance problems either, as long as you compact and repair the database frequently, but the bigger issue is the complexity of maintenance. I'd say don't split the table until it has at least a million rows in it and/or is showing performance issues with queries against it.
Here's the rub: This type of partitioning can hurt performance considerably if the users are constantly querying across multiple tables in a partition that you need to UNION back together. In situations where the partitions contain archival records that aren't searched very often it works much better. If you think you will need to frequently query across tables, don't go there.
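For example, once the data is split by year (the Orders_2022/2023/2024 table names below are hypothetical), even a simple lookup has to repeat its criteria against every partition and be glued back together with UNION ALL, so the engine can no longer satisfy it from a single table's index:

    SELECT * FROM Orders_2022 WHERE CustomerID = 42
    UNION ALL
    SELECT * FROM Orders_2023 WHERE CustomerID = 42
    UNION ALL
    SELECT * FROM Orders_2024 WHERE CustomerID = 42;

Every new partition table also means editing queries like this one, which adds to the maintenance burden mentioned above.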
Your biggest scalability hurdle is going to be related to the number of users. If you are anticipating hundreds of users, you need to plan very carefully, or perhaps consider a client-server database back-end.
For about the last 10 years Microsoft has advised people NOT to use Access as a database but to go with SQL Server in one of its various editions.
Given the developments of the last - hm - 10 years, that is not what I would be concerned about. I would seriously be concerned about whether Access will actually still be able to store data at all 10 years down the road, or whether the call becomes "program for SQL Server" at some point in between.
Access is well able to handle a million or 5 million records. SQL Server goes well into BILLIONS of records. The moment you get into problems with Access, basically, you have earned whatever problems you get through the - and I really find no way to say it more nicely - tremendous ignorance of even trying to use Access for a serious database, when - as I already said - MS has been discouraging this for the last 10 years.
Splitting one table into THOUSANDS of tables is unwise; SQL databases are not designed for this. Even partitioned (clustered) tables in SQL Server Enterprise, which do exactly this, are not really targeted at tens of thousands of partitions.
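For comparison, here is a rough sketch of what native table partitioning looks like in T-SQL; the names and boundary values (pfOrderYear, psOrderYear, Orders, the year boundaries) are made up for illustration, and the point is that you end up with one logical table holding a handful of partitions rather than thousands of separate tables:

    -- Partition rows by year; the boundary values are illustrative only.
    CREATE PARTITION FUNCTION pfOrderYear (int)
        AS RANGE RIGHT FOR VALUES (2022, 2023, 2024);

    -- Map every partition to the same filegroup for simplicity.
    CREATE PARTITION SCHEME psOrderYear
        AS PARTITION pfOrderYear ALL TO ([PRIMARY]);

    -- The table stays a single logical object; SQL Server routes each row
    -- to a partition based on OrderYear.
    CREATE TABLE Orders (
        OrderID   int   NOT NULL,
        OrderYear int   NOT NULL,
        Amount    money NULL,
        CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (OrderID, OrderYear)
    ) ON psOrderYear (OrderYear);

Queries still address the single Orders table; the partitioning is transparent to the application.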
You are a LOT more likely to just die in Access - Access is simply not a database server. Back to the drawing board.
That said, Access about 18 years ago or so added some technology acquired with FoxPro that allows it to easily handle tables with millions of records (not tens of millions but hundreds of millions), so you are very safe at this moment (except for the nightmare of trying to do a db repair, backup etc. on something like that, or the nightmare of running a multi-user application through a network share).
In SQL Server, on the other hand, I have a table currently at around 650 million records, growing to about 10 or 20 billion over the next 6 months once the data loads start, and no problems so far.
I'm going to steer clear of entering the Access vs. SQL Server debate in this thread and instead just answer the OP's question.
If the data can be split and people won't be querying across those splits, then it might be an option worth testing; however, there is a limit of 2048 open tables in Access, so you might want to watch out for that.
It's been said before, though, that if you have to ask what the maximum number of something is, then chances are you are doing it wrong, and I think this is an example of that. Splitting it into 10 tables, maybe, but thousands? I'll pass on that one.