Using a central database server for many sites: is it reasonable?

Posted on 2024-10-14 19:32:59

Basically, I need some parts of database data synchronized on up to several dozens of sites. The perfect solution would be creating a central server to host that data.

Each page load will have to fetch data from both database servers, the local one and the remote one, and writes to the remote server will be quite common too.

While the db server can be as fast as desired hardware-wise, I'm cautious of the bottlenecks:

  • Multiple database connections must be established on each pageload.
  • Latency of the signal traveling between two physical locations.

Am I right to worry? Would it be wiser to sync the databases with cron jobs or other technologies?


Along with assigning a bounty, I'm adding to the question hoping an expert with real life experience of this comes along:

What other technologies are there (besides cron) to sync MySQL databases?


Comments (8)

反目相谮 2024-10-21 19:32:59

This question really comes down to your situation, and I believe you've identified the two main problems with the central DB solution, so yes, you are right to be concerned.

I would personally opt to sync the data to the servers using cron (or whatever method you choose), cutting down on hardware costs and page load times. To me this is the more technical solution, but in terms of its benefits (faster page load times, no dependency on the central DB, lower costs) it is the correct one.

Alternatively, you could set up a small MySQL database on a remote server, create a few test websites, and run some benchmarks. That would give you some data on whether you are happy with the loading times.
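
As a rough illustration of the benchmark idea, here is a minimal timing harness in Python. It is only a sketch: `benchmark` and `simulated_remote_fetch` are hypothetical names, and the simulated fetch just sleeps once per round trip instead of talking to a real MySQL server. In a real test you would pass in a function that opens the connection and runs the page's actual queries.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(fetch: Callable[[], object], runs: int = 20) -> Dict[str, float]:
    """Call `fetch` `runs` times and summarise the wall-clock time in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fetch()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "min_ms": min(samples),
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }


def simulated_remote_fetch(round_trips: int = 3, rtt_seconds: float = 0.002) -> None:
    """Hypothetical stand-in for a page's remote queries: one sleep per round trip."""
    for _ in range(round_trips):
        time.sleep(rtt_seconds)
```

Comparing the median for the local database against the median for the remote one, with the same number of queries per page, gives a first estimate of what the extra connections and the network distance would cost per page load.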

浮世清欢 2024-10-21 19:32:59

MySQL replication is definitely the way to go. The problem with having a single database server is that if the load becomes too high, all your sites will go down. You want to spread the load as much as possible, because if that one server goes down or becomes overloaded, you have a big problem.

Some things to keep in mind when dealing with replication:

  • You want at least 2 database servers (preferably 3 or more: 1 master and 2 slaves).
  • You never write to your slave servers. All write operations go to the master, and replication syncs the slaves soon after.
  • You always read from the slave servers (unless you need to guarantee you have the most up-to-date data). By separating read and write operations between servers, you can significantly improve performance.

Throw in a load balancing server and your database load woes go away!
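
The read/write split described above could be sketched in application code roughly like this. It is only an illustration: `ReadWriteRouter` and the server names are made up, and classifying a statement by its first keyword is a simplifying assumption (a `SELECT ... FOR UPDATE`, for example, would need to go to the master).

```python
import itertools
from typing import Iterator, List

# Statement keywords treated as reads in this sketch (an assumption, not a full list).
READ_PREFIXES = ("SELECT", "SHOW", "DESCRIBE", "EXPLAIN")


class ReadWriteRouter:
    """Pick a server name for each statement: slaves for reads, master for writes."""

    def __init__(self, primary: str, replicas: List[str]) -> None:
        if not replicas:
            raise ValueError("at least one replica is required")
        self.primary = primary
        # Round-robin over the slaves to spread the read load.
        self._replicas: Iterator[str] = itertools.cycle(replicas)

    def route(self, sql: str, need_fresh: bool = False) -> str:
        is_read = sql.lstrip().upper().startswith(READ_PREFIXES)
        if is_read and not need_fresh:
            return next(self._replicas)
        # Writes, and reads that must see the latest data, go to the master.
        return self.primary
```

For example, `ReadWriteRouter("master", ["slave1", "slave2"]).route("SELECT 1")` returns one of the slave names, while any `INSERT` or `UPDATE` is routed to `"master"`.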

猛虎独行 2024-10-21 19:32:59

Any method that relies on offline sync is wasting the benefits of MySQL replication.

(given the unclear situation you have described)
Your solution can be as simple as keeping reads and writes separate.

That means, on the local database:

  1. Make sure the local copy of the database you want synced from the centralized database is read-only.
  2. Write operations are committed to the centralized database (instead of the local database).
  3. The centralized database will then replicate the update to all local databases.

Problem

  1. Replication delay due to network latency.

Benefits

  1. Data integrity, since writes can only be done on the centralized server and replication copies the changes to the various local databases.
  2. The local database can still allow its own write operations (on another set of data/databases).
  3. Reading from the local database is much faster than reading from the centralized database (assuming reads are more frequent than writes).

葮薆情 2024-10-21 19:32:59

The way Google solved this issue (you can get some of the information here; I'm sorry I don't have the link to the actual published paper describing it) is more or less through a series of triggers.

There is one (and by one, I mean thousands) central data hub and a series of clones. Each time a write is needed, a lock is requested from the hub and the write is performed on the clone, which then forwards the change to the hub (thereby releasing the lock). The hub then pushes the data to all of the other clones.

This means that read access can stay almost instant (you have a localized clone per site instance). Local write access will be quick as well. Everything else can be handled asynchronously so that the two servers only need to communicate a lock request and a lock received message before the write starts, and the push can happen after the user has moved on.

This may be a tad much for your needs, but that is how Google does it.
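
To make the flow in this answer concrete, here is a toy in-memory sketch in Python. It follows the description above (lock, local write, forward to hub, push to other clones) and is not Google's actual implementation. `Hub` and `Clone` are hypothetical names, a single global lock is assumed, and the push to the other clones is done synchronously to keep the example short, whereas the answer describes it happening asynchronously.

```python
import threading
from typing import Dict, List, Optional


class Hub:
    """Central hub: hands out the write lock and fans changes out to the clones."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.data: Dict[str, str] = {}
        self.clones: List["Clone"] = []

    def register(self, clone: "Clone") -> None:
        self.clones.append(clone)

    def acquire(self) -> None:
        self._lock.acquire()

    def commit(self, origin: "Clone", key: str, value: str) -> None:
        """Record the change, release the lock, then push to every other clone."""
        self.data[key] = value
        self._lock.release()
        for clone in self.clones:
            if clone is not origin:
                clone.data[key] = value


class Clone:
    """Local copy that serves reads and performs writes under the hub's lock."""

    def __init__(self, hub: Hub) -> None:
        self.hub = hub
        self.data: Dict[str, str] = {}
        hub.register(self)

    def read(self, key: str) -> Optional[str]:
        return self.data.get(key)

    def write(self, key: str, value: str) -> None:
        self.hub.acquire()                 # 1. request the lock from the hub
        self.data[key] = value             # 2. perform the write locally
        self.hub.commit(self, key, value)  # 3. forward the change; the hub releases the lock
```

After `a.write("price", "10")` on one clone, reading `"price"` from any other clone registered with the same hub returns `"10"`.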

恍梦境° 2024-10-21 19:32:59

First a warning: what you're trying to do is not easy. Whilst MySQL supports master/slave replication and you can have multiple masters and slaves running in all sorts of tiers, what you really need to think about is "how do I recover from a database server failure?" Do you promote a slave? What about consistency (as it's guaranteed that replication between the slaves failed)? And so on. You also need to consider schema modifications: everything is fine and dandy as long as you have the same schema on all servers, but as soon as you need to push a code update that requires a simultaneous database change, you can't rely on that schema change having propagated to the replicas.

Okay, warning over, so how do you do it? The easiest way is to fire up the latest version of phpMyAdmin, which allows you to configure replication very quickly and easily. Before you do that, make sure you have binary logging turned on in all the MySQL servers, as this will be your crash recovery savior: http://dev.mysql.com/doc/refman/5.0/en/binary-log.html

Where you site your servers is the next big question. If your users aren't geographically dispersed and your query loads are low-ish, you can probably host them all behind a private network in the same data center. Master-slave replication will give you a large performance uplift in any case, since all database reads should be made against slaves and only writes performed against the master.

If you need to distribute the servers geographically, so they can't all be hosted in the same data center, then things get a little more difficult: you now have latency to contend with. In this situation, since the internet is not instantaneous, a write made to the master will take time to propagate to the slave. Therefore any select query made very shortly after the write probably won't return the new data, since it won't have been replicated to the slave yet. This is called "eventual consistency" and is relatively easy to overcome once you recognise it's going to happen and code to expect it, i.e. never assume the data is present.
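
One common way to "code to expect it" is to send a user's reads to the master for a short window after that user has written something. The Python sketch below shows the idea under stated assumptions: `FreshnessGuard` is a hypothetical helper, the lag window is a guess you would have to tune against your real replication delay, and the returned strings are just labels for whichever connections your application holds.

```python
import time
from typing import Callable, Dict


class FreshnessGuard:
    """Route a session's reads to the master for a while after it has written."""

    def __init__(
        self,
        max_lag_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_lag_seconds = max_lag_seconds
        self._clock = clock
        self._last_write: Dict[str, float] = {}

    def record_write(self, session_id: str) -> None:
        """Remember when this session last wrote to the master."""
        self._last_write[session_id] = self._clock()

    def target_for_read(self, session_id: str) -> str:
        """Return "master" while the slave may still be behind, else "slave"."""
        last = self._last_write.get(session_id)
        if last is not None and self._clock() - last < self.max_lag_seconds:
            return "master"
        return "slave"
```

Other sessions keep reading from the slave the whole time, so only the user who just wrote pays the cost of a round trip to the master.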

I can't possibly do your question justice on this site. Your best bet is to read a book; I highly recommend this one:

MySQL High Availability - ISBN-13: 978-0-596-80730-6

扛起拖把扫天下 2024-10-21 19:32:59

My quick answer to this would be to use a job queue system like Gearman to offload the sync work to. This way, it does not affect page load or user experience. You simply create a Gearman job, and it is sent to the Gearman queue and processed as soon as a worker can get to it.

This also seems like a much better, more immediate solution than using cron, because the job is added to the queue instantly and handled almost instantly too. And since you seem to want to replicate only selected data, I don't see how MySQL replication would be of much assistance.

I have worked with Gearman before (even with PHP) and it was a great solution for breaking work off to be completed somewhere else when the page load did not need to wait for that job to complete.

This might not be as simple as I made it seem, since you do need to set up and learn Gearman, but it is a very handy tool.
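
The pattern (not Gearman's own API) can be sketched with nothing but Python's standard library. Here an in-process `queue.Queue` and a background thread stand in for the Gearman server and worker; `sync_jobs`, `worker`, and `handle_page_request` are hypothetical names, and `apply_change` is whatever function would actually write the change to the remote database.

```python
import queue
import threading
from typing import Callable, Optional

# In-process stand-in for the job server's queue; None is used as a stop signal.
sync_jobs: "queue.Queue[Optional[dict]]" = queue.Queue()


def worker(apply_change: Callable[[dict], None]) -> None:
    """Take jobs off the queue one at a time and apply them, outside the page request."""
    while True:
        job = sync_jobs.get()
        try:
            if job is None:
                return
            apply_change(job)
        finally:
            sync_jobs.task_done()


def handle_page_request(change: dict) -> str:
    """Queue the sync work and return immediately, without waiting for it to finish."""
    sync_jobs.put(change)
    return "page rendered"
```

A worker would be started once with something like `threading.Thread(target=worker, args=(apply_change,), daemon=True).start()`. With a real job server the queue also survives a web server restart and the workers can run on other machines, which this in-process stand-in does not give you.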

Hope this helps!

浮华 2024-10-21 19:32:59

I was wondering whether you are using SQL Server as your back end or something else. I am pretty sure that with SQL Server you can use SQL Replication http://technet.microsoft.com/en-us/library/ms151198.aspx to achieve the desired goal. At that point your local apps would access their own SQL instance, while each SQL instance would "replicate" and "sync" its data with the main DB server. The end result is that your central DB will always be up to date and have aggregated data from each and every satellite SQL server. (Though please don't quote me on this one... I am not a SQL expert.)

(Sorry, I just realized you are using PHP/MySQL... and probably favor open source... However, I think this is worth looking into.)

晨光如昨 2024-10-21 19:32:59

I have done some database synchronization between client and server PHP applications and used the following idea: http://vitana-group.com/article/php/data-synchronization
