(Hard problem) How to store specific rows of a table on different SQL servers?
I have a bit of an architecture problem here. Say I have two tables, Teacher and Student, both of them on separate servers. Since these tables share a lot of data and functionality, I would like to use this inheritance scheme and create a People table; however, I would need to keep the Teacher table and the People records relating to Teacher on one server, and the Student table and the People records relating to Student on another server. This was a requirement made by the lead developer, since we have too many (and I mean too many) records for Teacher and Student, and a single database containing all of the People would collapse. Moreover, the clients NEED to have them on separate servers (sigh*).
I would really like to implement the inheritance scheme, since a lot of the functionality could be shared among the databases. Is there any possible way to do this? Is there any other architecture that may suit this type of problem? Am I just crazy?
--- EDIT ---
Ok, I don't really have Teachers and Students per se, I just used those names to simplify my explanation. Truth is, there are about 9 sub-tables that would inherit the super table, all of them on separate servers for separate applications, and no, I don't have this type of database, but we have pretty low-end servers for the amount of transactions we have ;). You're right, my statements are a bit exaggerated and I apologize for that, it was just to make you guys answer faster (sorry :P). The different servers are more of a business restriction than anything else (although the lead developer DID say that a common database to store the SuperTable would collapse under its own weight; his words, not mine :S). Our clients don't like their information mixed with other clients' information, so we must have their information on different servers. Pretty stupid, but the decision-makers have spoken :(.
Comments (9)
Under what assumption did you determine that you have too much data? I'm pretty sure you could list every teacher and student in the world, and not cause SQL Server any grief.
This seems like an arbitrary decision that is going to have significant impact on the complexity of any solution you design.
Take a look here - I'm sure you don't measure your database in anything close to the scale represented on this page, and many of these db's are running on SQL Server.
I don't know for sure if this is possible with SQL Server specifically, but it smells like something that could be solved with clustering and tablespace partitioning.
What I wonder about is whether this is really a good requirement; it introduces a lot of technical complexity based on a pretty simple assertion that there's just too much data. Have you attempted to verify this? A simple test would be to create a simple schema and populate it with dummy data for the number of rows you expect in production. It would probably be in your best interest to perform this test before you go too far down the road to implement this 'requirement'.
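The kind of test described above could be sketched roughly as follows. This is only an illustration: it uses Python's built-in sqlite3 module instead of SQL Server, and the People columns, the index name, and the helper functions `build_test_db` and `time_lookup` are all made up for the example. A real test would target the actual server, the real schema, and production-sized row counts.

```python
import sqlite3
import time


def build_test_db(row_count):
    """Create an in-memory People table and fill it with dummy rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE People (person_id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
    )
    # Hypothetical dummy data; replace with something shaped like real records.
    rows = (
        (i, f"Person {i}", f"person{i}@example.com")
        for i in range(1, row_count + 1)
    )
    conn.executemany("INSERT INTO People VALUES (?, ?, ?)", rows)
    conn.execute("CREATE INDEX idx_people_email ON People (email)")
    conn.commit()
    return conn


def time_lookup(conn, email):
    """Return (row, elapsed_seconds) for a single indexed lookup."""
    start = time.perf_counter()
    row = conn.execute(
        "SELECT person_id, name FROM People WHERE email = ?", (email,)
    ).fetchone()
    return row, time.perf_counter() - start


conn = build_test_db(100_000)
row, elapsed = time_lookup(conn, "person54321@example.com")
print(row, f"{elapsed:.6f}s")
```

Raising the row count until the timings stop being acceptable gives an actual number to discuss, instead of an assertion that there is "too much data".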
By the way, the type of schema you linked to is an example of the class table inheritance pattern.
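For readers who have not seen the pattern, here is a minimal sketch of class table inheritance, again using sqlite3 as a stand-in for SQL Server. The columns `department` and `grade_year` and the helpers `add_teacher` and `get_teacher` are assumptions made for the example; the question does not say which attributes the real tables hold.

```python
import sqlite3

# One table per class: People holds the shared columns, and each subtype
# table reuses People's primary key as both its own key and a foreign key.
SCHEMA = """
CREATE TABLE People (
    person_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE Teacher (
    person_id  INTEGER PRIMARY KEY REFERENCES People(person_id),
    department TEXT NOT NULL
);
CREATE TABLE Student (
    person_id  INTEGER PRIMARY KEY REFERENCES People(person_id),
    grade_year INTEGER NOT NULL
);
"""


def add_teacher(conn, name, department):
    """Insert the shared row first, then the Teacher-specific row."""
    cur = conn.execute("INSERT INTO People (name) VALUES (?)", (name,))
    person_id = cur.lastrowid
    conn.execute(
        "INSERT INTO Teacher (person_id, department) VALUES (?, ?)",
        (person_id, department),
    )
    return person_id


def get_teacher(conn, person_id):
    """Join the subtype table back to People to rebuild the full record."""
    return conn.execute(
        "SELECT p.person_id, p.name, t.department "
        "FROM People p JOIN Teacher t ON t.person_id = p.person_id "
        "WHERE p.person_id = ?",
        (person_id,),
    ).fetchone()


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)
teacher_id = add_teacher(conn, "Ada", "Math")
print(get_teacher(conn, teacher_id))
```

The foreign key from Teacher to People is what makes splitting the two across servers awkward: a foreign key constraint cannot span servers, so that integrity check would have to move into application code.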
It would be possible for you to implement a domain model for this project where the common attributes of Teacher and Student are described by a Person interface or base class which the common operations are written against. If you plan to use stored procedures extensively, this might not be a useful option, but it's something to consider.
I think Paul is correct - perhaps look at your hardware infrastructure rather than your DB schema.
Using clustering, proper indexing, and possibly a data archive scheme should solve any performance problems. The inheritance scheme seems to be the best data model.
It is possible to split the data over multiple servers and keep the scheme, but I think you'd definitely have more performance problems than if you looked at clustering/proper indexing. By setting up linked servers you can do cross-server queries.
e.g. Students query
--EDIT-- As Paul said, you could perform your database separation in your abstraction layer.
E.g. have your Student class extend your Person class. In your Person class constructor, have it connect to Server A to populate whichever fields are available. In your student class constructor, have it connect to Server B (the Person attributes will already be populated by the Person constructor).
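A rough sketch of that constructor arrangement, assuming Python and using two in-memory SQLite databases in place of "Server A" and "Server B". The connection names `server_a` and `server_b`, the table layouts, and the sample rows are invented for the illustration; a real version would open connections to the two actual servers.

```python
import sqlite3

# Two separate in-memory databases stand in for the two servers.
server_a = sqlite3.connect(":memory:")  # holds the shared Person attributes
server_b = sqlite3.connect(":memory:")  # holds the Student-only attributes

server_a.execute("CREATE TABLE People (person_id INTEGER PRIMARY KEY, name TEXT)")
server_a.execute("INSERT INTO People VALUES (1, 'Grace')")
server_b.execute(
    "CREATE TABLE Student (person_id INTEGER PRIMARY KEY, grade_year INTEGER)"
)
server_b.execute("INSERT INTO Student VALUES (1, 3)")


class Person:
    def __init__(self, person_id):
        # Populate the shared fields from "Server A".
        self.person_id = person_id
        row = server_a.execute(
            "SELECT name FROM People WHERE person_id = ?", (person_id,)
        ).fetchone()
        if row is None:
            raise LookupError(f"no person with id {person_id}")
        (self.name,) = row


class Student(Person):
    def __init__(self, person_id):
        # The Person constructor fills in the shared attributes first.
        super().__init__(person_id)
        row = server_b.execute(
            "SELECT grade_year FROM Student WHERE person_id = ?", (person_id,)
        ).fetchone()
        if row is None:
            raise LookupError(f"no student with id {person_id}")
        (self.grade_year,) = row


student = Student(1)
print(student.name, student.grade_year)
```

Note that every Student load now costs two round trips, and nothing in the databases guarantees that a Student row has a matching People row; the application has to handle that case itself.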
I'm with Aaron here (sup Aaron). Move the tables into a single database. SQL Server can easily handle billions of rows per table (I did it on SQL 2000 6-7 years ago, so modern versions and modern hardware are no problem), as long as your tables are indexed correctly. There probably haven't been enough students in all of time at every school in the world to overload SQL Server, much less at a single school.
In this case your best practice would be to put the tables in the same database, on the same server and index them for better performance.
Too many records cause 'database collapse'? What kind of pot is that lead developer smoking? Potent stuff!
I would recommend you guys study partitioned tables first. Making an application distributed (which is really what the two-server approach implies) is much, much harder than you think, and it does not provide scalability.
Yep, I'd have to agree with the others here, and single database, single server is just fine. It is far easier and cheaper to scale up your hardware currently to support the workload than it will be to scale out to federated servers. I only know of one place that does federated servers and their workload is phenomenal.
link the servers and create a view
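As a loose analogy only: SQLite has no linked servers, but attaching a second database to one connection and defining a view over both shows the shape of the idea. The schema name `student_srv`, the view name `AllPeople`, and the sample rows are assumptions for the sketch. On SQL Server the equivalent view would reference the remote table through a linked server and four-part names.

```python
import sqlite3

# "main" plays the teacher server; the attached database plays the student server.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS student_srv")

conn.execute("CREATE TABLE main.People (person_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE student_srv.People (person_id INTEGER PRIMARY KEY, name TEXT)"
)
conn.execute("INSERT INTO main.People VALUES (1, 'Ada')")
conn.execute("INSERT INTO student_srv.People VALUES (2, 'Grace')")

# A TEMP view is used because SQLite only lets temporary views reference
# tables in more than one attached database.
conn.execute(
    """
    CREATE TEMP VIEW AllPeople AS
    SELECT person_id, name, 'teacher' AS source FROM main.People
    UNION ALL
    SELECT person_id, name, 'student' AS source FROM student_srv.People
    """
)

rows = conn.execute(
    "SELECT person_id, name, source FROM AllPeople ORDER BY person_id"
).fetchall()
print(rows)
```

Queries against the view read as if there were one People table, while each row still lives only in its own database. The sketch also assumes person_id values never collide across the two databases, which would need its own scheme in practice.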
What kind of client are you using? If you're using a Java client, and are using ORM, you may want to look into Hibernate Shards.
Besides all the good answers here pointing out that the assumptions behind the question are highly questionable, if I needed to do this seriously (and if I took the assumptions as true) I would compare what Oracle has to offer, because it is in this type of scenario that it shows a benefit (I say this from experience).
But on the core question, assuming that the assumptions you outline are true, I would not try to have a combined table. If teachers and students can't be in the same database, it is unlikely that their identifying information can, and if the amount of data is overwhelming, then putting it all in one table is worse.
What I suspect is that if the underlying assumptions are true it is because there is an anticipation of a lot of contention on the tables and a lot of connections and activity on the tables, causing a lot of locks. In that case, adding a Person table will make things worse.
All that being said, if you still really wanted to do it, then you can reference one database from another in queries, via linked databases.
But if the real issue is the number of connections and the contention and deadlocks around the tables, such a solution would make things worse.
EDIT: In response to those who question what advantage Oracle would bring to such a situation, one would be in the federated database area, where it is much more mature. Another would be in tables with a high amount of contention: it makes copies of the data in certain situations, and in general its model is more sophisticated when it comes to handling contention. An example is scenarios where tables are read in longer-running queries, causing a lot of potential read locks. Oracle helps you keep transactional integrity without having to lock on read. In MS-SQL, you have to resort to dirty reads.
MS-SQL is a fine database, but it has its limits (raw amounts of data without any particular parameters about volume of reads and writes is not really one of them, though, which makes the question strange). And given the stiff competition, the non-Enterprise version of Oracle is really close enough in price to be worth a look. It could end up costing you a lot later.
Of course, if you already purchased an MS-SQL license, the cost factor is larger for Oracle, so the benefits have to be more obvious.