Creating a database for scalability

Posted 2024-08-10 06:34:19

How do I create a database for scalability? I am in the middle of http://www.slideshare.net/vishnu/livejournals-backend-a-history-of-scaling, which I can't finish reading at the moment because I need to leave. But I would like to know more about creating a database that scales well. Some things that it mentions, and that occur to me, are:

  • Separate handles for reads and writes?
  • What happens when one server is busy (IO or CPU bound) and I need two servers to write to?
  • Do I create multiple databases? Have a clusterId on users?
  • Will it be a problem when moving users from one cluster to another?
  • Might I code this so that user ABC in DB A on cluster A and user DEF in DB B on cluster B have the same PRIMARY KEY?
  • What about when I move the above to cluster C? Does this mean I need to write a lot of code to move them to another cluster/database?
  • To make the above a non-issue, would I NOT use PRIMARY KEY and instead set the ID by hand by reading the other DBs on other clusters?

etc

Comments (4)

恍梦境° 2024-08-17 06:34:19

To create a database that scales well for 99.9% of use cases, don't bother with any of that stuff. Instead, design a properly normalised schema; use primary key, foreign key and other constraints to ensure integrity; and index tables well. Study your DBMS vendor's advice on performance and scalability topics such as partitioning and the different table and index structures, and use what works best for your case (benchmark the options to prove that they improve scalability).

Of course, if you work for Google, eBay or Amazon then you may fall into the 0.1% camp that needs to throw away the rule book and do all this crazy stuff you are reading about. But I'm guessing you don't, right?
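As a rough illustration of the "normalised schema, constraints, indexes" advice, here is a minimal sketch using Python's built-in sqlite3 module. The users/posts tables, their columns, and the index name are made up for the example; your own DBMS will have its own syntax and tuning options.

```python
import sqlite3

# Hypothetical schema for illustration only; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE users (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL UNIQUE
);
CREATE TABLE posts (
    id       INTEGER PRIMARY KEY,
    user_id  INTEGER NOT NULL REFERENCES users(id),
    title    TEXT NOT NULL
);
-- Index the column used to look up a user's posts.
CREATE INDEX idx_posts_user_id ON posts(user_id);
""")

conn.execute("INSERT INTO users (id, name) VALUES (1, 'abc')")
conn.execute("INSERT INTO posts (user_id, title) VALUES (1, 'first post')")

# The foreign key constraint rejects a post that points at a missing user.
try:
    conn.execute("INSERT INTO posts (user_id, title) VALUES (999, 'orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# EXPLAIN QUERY PLAN reports how SQLite intends to run the lookup.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT title FROM posts WHERE user_id = ?", (1,)
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)
```

Inspecting the query plan (and timing real workloads) is the "benchmark the options" step in miniature: it tells you whether an index is actually being used before you reach for anything more exotic.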

未央 2024-08-17 06:34:19

RDBMSs are great for keeping consistent, transactional data, but they require a lot of expert planning to scale to hundreds of thousands of transactions per second. I would build a NoSQL cloud to dump documents built from an RDBMS into.

So you use an RDBMS for the raw data and the NoSQL databases for the views on the RDBMS.
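A minimal sketch of that split, assuming a hypothetical users/posts schema: the relational database stays the source of truth, and a denormalised document per user is built from it and pushed to a document store. The plain dict named document_store is only a stand-in for whatever NoSQL system you would actually use.

```python
import json
import sqlite3

# Source of truth: a small relational schema (names are assumptions).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE posts (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    title TEXT NOT NULL
);
INSERT INTO users VALUES (1, 'abc');
INSERT INTO posts VALUES (10, 1, 'first'), (11, 1, 'second');
""")

# Stand-in for a NoSQL document store: key -> JSON string.
document_store = {}

def build_user_document(conn, user_id):
    """Join the relational rows for one user into a single read-optimised document."""
    name = conn.execute(
        "SELECT name FROM users WHERE id = ?", (user_id,)
    ).fetchone()[0]
    posts = conn.execute(
        "SELECT id, title FROM posts WHERE user_id = ? ORDER BY id", (user_id,)
    ).fetchall()
    return {
        "user_id": user_id,
        "name": name,
        "posts": [{"id": post_id, "title": title} for post_id, title in posts],
    }

def publish_user(conn, user_id):
    """Rebuild the document from the RDBMS and overwrite the stored copy."""
    document_store[f"user:{user_id}"] = json.dumps(build_user_document(conn, user_id))

publish_user(conn, 1)
page = json.loads(document_store["user:1"])  # reads never touch the RDBMS
```

The document can always be rebuilt from the relational data, so the store only needs to be fast, not authoritative.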

岁月蹉跎了容颜 2024-08-17 06:34:19

What happens when one server is busy (IO or CPU bound) and I need two servers to write to?

If you are doing a distributed transaction, you are in trouble, so you have to plan ahead to make sure the load across the servers targeted by your distributed transactions is uniform.

Do I create multiple databases? Have a clusterId on users?

This is a very nice solution :P. You have to get the data models for the shared data correct so that you don't form a bottleneck on your shared catalogues.
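One way to picture the clusterId idea is a small shared directory that maps each user to a cluster, with the per-user data living only on that cluster. The sketch below uses in-memory SQLite databases as stand-ins for the clusters; the table names, the cluster names "A" and "B", and the local cache are all assumptions made for illustration.

```python
import sqlite3

# Shared catalogue: the only data every request has to consult.
directory = sqlite3.connect(":memory:")
directory.execute(
    "CREATE TABLE user_directory (user_id INTEGER PRIMARY KEY, cluster_id TEXT NOT NULL)"
)

# Stand-ins for two clusters, each holding the data of its own users.
clusters = {"A": sqlite3.connect(":memory:"), "B": sqlite3.connect(":memory:")}
for db in clusters.values():
    db.execute(
        "CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, title TEXT NOT NULL)"
    )

# Caching lookups keeps the shared directory from becoming the bottleneck.
# A real system would need to invalidate entries when a user is moved.
_cluster_cache = {}

def register_user(user_id, cluster_id):
    with directory:
        directory.execute(
            "INSERT INTO user_directory VALUES (?, ?)", (user_id, cluster_id)
        )

def cluster_for(user_id):
    """Return the connection of the cluster that owns this user's data."""
    if user_id not in _cluster_cache:
        row = directory.execute(
            "SELECT cluster_id FROM user_directory WHERE user_id = ?", (user_id,)
        ).fetchone()
        if row is None:
            raise KeyError(f"unknown user {user_id}")
        _cluster_cache[user_id] = row[0]
    return clusters[_cluster_cache[user_id]]

def add_post(post_id, user_id, title):
    db = cluster_for(user_id)
    with db:
        db.execute("INSERT INTO posts VALUES (?, ?, ?)", (post_id, user_id, title))

register_user(42, "B")
add_post(1001, 42, "hello")
```

Only the tiny directory is shared; everything else scales by adding clusters.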

Will it be a problem when moving users from one cluster to another?

No, distributed transactions for the win. You need to have a kickass programmer to make sure things happen correctly.
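To give a feel for what "making sure things happen correctly" involves, here is a deliberately simplified sketch of moving one user's rows between clusters. It is not a real distributed transaction (there is no two-phase commit); it only orders the steps so that the directory always points at a cluster holding a complete copy. The clusters, the dict-based directory, and the posts table are hypothetical stand-ins.

```python
import sqlite3

def make_cluster():
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, title TEXT NOT NULL)"
    )
    return db

clusters = {"A": make_cluster(), "C": make_cluster()}
directory = {42: "A"}  # user_id -> cluster name; stand-in for the shared catalogue

with clusters["A"]:
    clusters["A"].execute("INSERT INTO posts VALUES (1001, 42, 'hello')")

def move_user(user_id, target_name):
    """Copy a user's rows to the target cluster, repoint the directory, then clean up."""
    source_name = directory[user_id]
    if source_name == target_name:
        return
    source, target = clusters[source_name], clusters[target_name]

    rows = source.execute(
        "SELECT id, user_id, title FROM posts WHERE user_id = ?", (user_id,)
    ).fetchall()

    # Step 1: copy. Keys are reused unchanged, which only works if IDs are globally unique.
    with target:  # commits on success, rolls back on error
        target.executemany("INSERT INTO posts VALUES (?, ?, ?)", rows)

    # Step 2: repoint the directory only after the copy has committed.
    directory[user_id] = target_name

    # Step 3: remove the old copy last, so a failure never loses data.
    with source:
        source.execute("DELETE FROM posts WHERE user_id = ?", (user_id,))

move_user(42, "C")
```

A production version would also have to block or replay writes made to the source while the copy is in flight, which is where the distributed-transaction machinery comes in.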

Might I code this so that user ABC in DB A on cluster A and user DEF in DB B on cluster B have the same PRIMARY KEY?

No, assign the primary key on a master RDBMS/LDAP server. You do not want primary-key collisions of this sort. Your chosen method depends on this being done correctly: you want globally unique user IDs. You will have shared data in this case, and if you do not have globally unique primary keys (GU-PKs), how will you relate the users to the shared data?
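A minimal sketch of central key assignment, assuming a single hypothetical "master" database whose only job is to hand out user IDs; each cluster then stores the user under the ID it was given rather than generating its own. SQLite stands in for both the master and the clusters.

```python
import sqlite3

# Master server: the single place where user IDs are allocated.
master = sqlite3.connect(":memory:")
master.execute(
    "CREATE TABLE user_ids (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT NOT NULL UNIQUE)"
)

# Clusters never generate IDs themselves; they accept the one they are given.
clusters = {"A": sqlite3.connect(":memory:"), "B": sqlite3.connect(":memory:")}
for db in clusters.values():
    db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

def allocate_user_id(name):
    """Reserve a globally unique ID for this user on the master."""
    with master:
        cursor = master.execute("INSERT INTO user_ids (name) VALUES (?)", (name,))
    return cursor.lastrowid

def create_user(name, cluster_name):
    user_id = allocate_user_id(name)
    db = clusters[cluster_name]
    with db:
        db.execute("INSERT INTO users (id, name) VALUES (?, ?)", (user_id, name))
    return user_id

abc_id = create_user("ABC", "A")
def_id = create_user("DEF", "B")
```

Because ABC and DEF get different IDs even though they live on different clusters, either one can later be moved to another cluster, or joined against shared data, without renumbering.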

陪我终i 2024-08-17 06:34:19

To add to Tony's advice, I would say: partition your databases correctly into catalogues (the SQL Server term for a virtual database namespace inside a physical database server), and try to minimise the dependencies between catalogues, i.e., the query-level dependencies. If there are dependencies, make sure they are read-only.

This will allow you to move catalogues to different physical servers when needed. The read-only requirement exists so that, if you move a catalogue away from a server where it has a read-only dependency on another catalogue (on the same physical server), you can replicate the data in question to a read-only catalogue on the new physical server to which you are moving it.

The read-only requirement is present because replication is generally a one-way feature. That means you can have only one server as the write master, while the other servers just receive the data so that it can be read locally.
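The one-way nature of replication can be sketched as a small router with separate paths for writes and reads: every write goes to the single write master, and reads are spread over replicas that only ever receive data. The class below is a toy, with in-memory SQLite databases as stand-ins and sqlite3's Connection.backup playing the role of the DBMS's replication feature.

```python
import itertools
import sqlite3

class ReplicatedDatabase:
    """Toy write-master / read-replica setup (an illustration, not a real replication stack)."""

    def __init__(self, replica_count=2):
        self.primary = sqlite3.connect(":memory:")
        self.replicas = [sqlite3.connect(":memory:") for _ in range(replica_count)]
        self._next_replica = itertools.cycle(self.replicas)

    def write(self, sql, params=()):
        # All writes go to the single write master.
        with self.primary:
            self.primary.execute(sql, params)

    def read(self, sql, params=()):
        # Reads are spread round-robin over the replicas.
        return next(self._next_replica).execute(sql, params).fetchall()

    def replicate(self):
        # One-way copy from master to replicas; replicas never write back.
        for replica in self.replicas:
            self.primary.backup(replica)

db = ReplicatedDatabase()
db.write("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
db.write("INSERT INTO users VALUES (?, ?)", (1, "abc"))
db.replicate()
rows = db.read("SELECT id, name FROM users")

# A write that has not been replicated yet is invisible to readers.
db.write("INSERT INTO users VALUES (?, ?)", (2, "def"))
lagging_rows = db.read("SELECT id, name FROM users")
```

The second read illustrates replication lag: anything that depends on seeing its own writes immediately has to read from the master, which is why cross-catalogue dependencies are best kept read-only.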

The advice about replication is really useful for a worst-case scenario, and only as a one-off. It is not a solution for ad-hoc database growth. You should move away from an RDBMS if you ever have to grow this way. With the correct data models, replication-free movement of catalogues is possible.
