What is the best way to query data from multiple tables and databases?
I have 5 databases which represent different regions of the country. In each database, there are a few hundred tables, each with 10,000-2,000,000 transaction records. Each table is a representation of a customer in the respective region. Each of these tables has the same schema.
I want to query all tables as if they were one table. The only way I can think of doing it is creating a view that unions all tables, and then just running my queries against that. However, the customer tables will change all the time (as we gain and lose customers), so I'd have to change the query for my view to include new tables (or remove ones that are no longer used).
Is there a better way?
EDIT
In response to the comments (I also posted this as a response to an answer):
In most cases I won't be removing any tables; they will remain for historical purposes. As I posted in a comment on one response, the idea was to reduce the time it takes a smaller customer (one with only 10,000 records) to query their own history. There are about 1000 customers with an average of 1,000,000 rows (and growing) apiece. If I were to add all records to one table, I'd have nearly a billion records in that table. I also thought I was planning for the future, in that when we get to, say, 5000 customers, we won't have one giant table holding all transaction records (this may be an error in my thinking). So is it better not to divide the records as I have done? Should I mash it all into one table? Will indexing on customer ID prevent delays in querying data for smaller customers?
Comments (5)
I think your design may be broken. Why not use one single table with a region column and a customer column?
If I were you, I would consider refactoring to one single table, and if necessary (for backward compatibility, for example), I would use views to provide the same info as the previous tables did.
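Here's a rough sketch of the compatibility-view idea, using Python's built-in sqlite3 module as a stand-in for whatever server you actually run. The names `transactions`, `region`, `customer_id`, `amount`, and `customer_1` are made up for illustration; your real schema will differ.

```python
import sqlite3

# In-memory database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One table for everyone, keyed by region and customer (hypothetical schema).
    CREATE TABLE transactions (
        region      TEXT    NOT NULL,
        customer_id INTEGER NOT NULL,
        amount      REAL    NOT NULL
    );
    INSERT INTO transactions VALUES
        ('north', 1, 10.0),
        ('north', 1, 12.5),
        ('south', 2, 99.0);
    -- A view exposing only the rows the old per-customer table used to hold.
    CREATE VIEW customer_1 AS
        SELECT amount FROM transactions
        WHERE region = 'north' AND customer_id = 1;
""")

# Old code that read from "customer_1" keeps working against the view.
legacy_rows = conn.execute(
    "SELECT amount FROM customer_1 ORDER BY amount"
).fetchall()
```

Existing queries against the old table name would not need to change, while new code can go straight to the single table.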
Edit, in answer to the OP's comments on this post:
One table with 10 000 000 000 rows in it will do just fine, provided you use proper indexing. Database servers are built to cope with this kind of volume.
Performance is definitely not a valid reason to split one such table into thousands of smaller ones!
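To make the indexing point concrete, here's a minimal sketch, again with Python's sqlite3 standing in for your real server. The table, column, and index names are hypothetical, and the toy row counts say nothing about timings at a billion rows; the point is only that a composite index on (region, customer_id) lets the engine seek straight to one customer's rows instead of scanning everyone's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    " region TEXT NOT NULL, customer_id INTEGER NOT NULL, amount REAL NOT NULL)"
)
# Composite index so lookups for one customer do not scan the whole table.
conn.execute(
    "CREATE INDEX idx_transactions_customer"
    " ON transactions (region, customer_id)"
)

# Toy data: 50 customers with 100 rows each.
rows = [("north", cid, float(i)) for cid in range(1, 51) for i in range(100)]
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?)", rows)

QUERY = "SELECT amount FROM transactions WHERE region = ? AND customer_id = ?"


def customer_history(conn, region, customer_id):
    """Fetch one customer's rows using the indexed columns."""
    return conn.execute(QUERY, (region, customer_id)).fetchall()


# Ask SQLite how it plans to run the query; the last column is the description.
plan = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("north", 7)).fetchall()
plan_text = " ".join(row[-1] for row in plan)
```

On your own server, the equivalent check is to look at the execution plan for a single-customer query and confirm it does an index seek rather than a full scan.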
If there are a few hundred tables and each has the same schema, the architecture of this system smells like it needs a vastly different approach.
Why are you adding or removing tables at all? This should not be happening under any normal circumstances.
Agree with Brann.
That's an insane DB schema design. Why didn't you go with (or is it an option to change to) a single normalised structure, with columns to filter by region and by whatever condition separates each table within a region database?
With the current structure you're stuck with a horribly large (~500 tables) unioned view that you would have to regenerate dynamically every time new tables appear in the system.
2 solutions
1. Write a stored procedure that builds the view for you: it reads all the table names in the 5 databases and builds the view with unions, just as you would by hand.
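As a rough illustration of that idea, here is the same logic sketched in Python with sqlite3 rather than as an actual stored procedure. It covers a single database only; spanning the 5 regional databases would need fully qualified names or linked servers, depending on your DBMS. The `customer_` prefix, the view name `all_transactions`, and the sample table names are assumptions for the example.

```python
import sqlite3


def quote_ident(name):
    """Quote an identifier so odd table names cannot break the generated SQL."""
    return '"' + name.replace('"', '""') + '"'


def rebuild_union_view(conn, view_name="all_transactions", prefix="customer_"):
    """Recreate a UNION ALL view over every table whose name starts with prefix."""
    names = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    tables = [n for n in names if n.startswith(prefix)]
    if not tables:
        raise ValueError("no tables match prefix " + repr(prefix))
    # All customer tables are assumed to share one schema, so SELECT * lines up.
    body = " UNION ALL ".join("SELECT * FROM " + quote_ident(t) for t in tables)
    conn.execute("DROP VIEW IF EXISTS " + quote_ident(view_name))
    conn.execute("CREATE VIEW " + quote_ident(view_name) + " AS " + body)
    return tables


# Demo with two hypothetical customer tables.
conn = sqlite3.connect(":memory:")
for table in ("customer_acme", "customer_globex"):
    conn.execute("CREATE TABLE " + quote_ident(table) + " (amount REAL NOT NULL)")
    conn.execute("INSERT INTO " + quote_ident(table) + " VALUES (1.0)")

included = rebuild_union_view(conn)
total_rows = conn.execute("SELECT COUNT(*) FROM all_transactions").fetchone()[0]
```

You would rerun the rebuild whenever a customer table is added or dropped, for example from a scheduled job.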
Sounds like you're stuck somewhere between a multi-tenant and a single-tenant database schema. Specifically, you're storing it as "light" multi-tenant (separate tables rather than separate databases) but querying it as single-tenant: one query to rule them all.
In the short term, have your data access layer dynamically pick the table to query, rather than unioning everything together for one uber query.
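A minimal sketch of what that could look like, using Python's sqlite3 as a stand-in. The `customer_<key>` naming convention, the `amount` column, and the function names are assumptions for illustration. Because a table name can't be passed as a bound parameter, the name is checked against the catalog before it is spliced into the SQL.

```python
import sqlite3


def customer_table(conn, customer_key):
    """Return the customer's table name, but only if it exists in the catalog."""
    candidate = "customer_" + customer_key
    found = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' AND name = ?",
        (candidate,),
    ).fetchone()
    if found is None:
        raise LookupError("no table for customer " + repr(customer_key))
    return found[0]


def customer_history(conn, customer_key):
    """Query just the one table that belongs to this customer."""
    table = customer_table(conn, customer_key)
    quoted = '"' + table.replace('"', '""') + '"'
    return conn.execute(
        "SELECT amount FROM " + quoted + " ORDER BY amount"
    ).fetchall()


# Demo with two hypothetical customer tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_acme (amount REAL NOT NULL)")
conn.execute("CREATE TABLE customer_globex (amount REAL NOT NULL)")
conn.executemany("INSERT INTO customer_acme VALUES (?)", [(20.0,), (10.0,)])
conn.execute("INSERT INTO customer_globex VALUES (5.0)")

history = customer_history(conn, "acme")
```

With several regional databases, the same lookup would first choose the connection for the customer's region and then the table within it.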
In the long term, pick one approach and stick to it: one database and one table, or many databases.
Here are some posts on the subject.
What are the advantages of using a single database for EACH client?
http://msdn.microsoft.com/en-us/library/aa479086.aspx