I have been learning RavenDB recently and would like to put it to use.
I was wondering what advice or suggestions people had around building the system in a way that is ready to scale, specifically sharding the data across servers, but that can start on a single server and only grow as needed.
Is it advisable, or even possible, to create multiple databases on a single instance and implement sharding across them? Then scaling would simply be a matter of spreading these databases across the machines?
My first impression is that this approach would work, but I would be interested to hear the opinions and experiences of others.
Update 1:
I have been thinking more on this topic. My problem with the "sort it out later" approach is that it seems difficult to spread data evenly across servers in that situation. I will not have a string key that I can range on (A-E, F-M, ...); it will be done with numbers.
This leaves two options I can see. Either break at boundaries, so 1-50000 is on shard 1 and 50001-100000 is on shard 2; but with a site that ages, say like this one, your original shards end up doing a lot less work. Alternatively, a strategy that round-robins the shards and puts the shard id into the key suffers if you ever need to move a document to a new shard: moving it would change the key and break URLs that have used it.
So my new idea, and again I am putting it out there for comment, would be to create a bucketing system from day one. It works like stuffing the shard id into the key, except you start with a large number of buckets, say 1000, and distribute keys evenly across them. Then, when it comes time to split the load across shards, you can move buckets 501-1000 to the new server and write your shard logic so that 1-500 goes to shard 1 and 501-1000 goes to shard 2. When a third server comes online, you pick another range of buckets and adjust.
To my eye this gives you the ability to split into as many shards as you originally created buckets, spreading the load evenly in terms of both quantity and age, without having to change keys.
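To make the idea concrete, here is a minimal sketch of that bucket routing. It is only an illustration of the scheme described above; the bucket count, the shard-range map, and the key format are assumptions of mine, not anything RavenDB provides:

```python
# Illustrative sketch of the bucket scheme described above.
# Assumptions: 1000 buckets fixed on day one; shard ranges are
# reconfigured as servers are added; keys embed the bucket, never the shard.

BUCKET_COUNT = 1000

# Bucket ranges per shard; edit this map when a new server comes online,
# e.g. move buckets 501-1000 to shard 2.
SHARD_RANGES = {
    "shard1": range(1, 501),     # buckets 1-500
    "shard2": range(501, 1001),  # buckets 501-1000
}

def bucket_for(doc_id: int) -> int:
    """Spread incremental ids evenly across buckets (1-based)."""
    return (doc_id % BUCKET_COUNT) + 1

def shard_for(bucket: int) -> str:
    for shard, buckets in SHARD_RANGES.items():
        if bucket in buckets:
            return shard
    raise ValueError(f"no shard owns bucket {bucket}")

def make_key(doc_id: int) -> str:
    # The bucket goes into the key, so the key survives re-sharding:
    # only SHARD_RANGES changes, never the key itself.
    return f"{bucket_for(doc_id)}/{doc_id}"

def resolve(key: str) -> str:
    bucket, _ = key.split("/", 1)
    return shard_for(int(bucket))

print(resolve(make_key(12345)))  # bucket 346 -> "shard1"
```

Moving buckets 501-1000 to a new server then amounts to copying their documents and editing SHARD_RANGES; the keys, and therefore the URLs, never change.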
Thoughts?
It is possible, but really unnecessary. You can start using one instance, and then scale when necessary by setting up sharding later.
Also see:
http://ravendb.net/documentation/docs-sharding
http://ayende.com/blog/4830/ravendb-auto-sharding-bundle-design-early-thoughts
http://ravendb.net/documentation/replication/sharding
I think a good solution is to use virtual shards. You can start with one server and point all virtual shards at it. Use modulo on the incremental id to distribute the rows evenly across the virtual shards. With Amazon RDS you have the option to promote a slave to a master, so before you change the sharding configuration (pointing more virtual shards at the new server), you should promote a slave to master, update your configuration file, and then delete from the new master all records whose modulo does not fall within the shard range you assigned to the new instance.
You also need to delete rows from the original server, but by now all new data with IDs whose modulo falls in the new virtual shard ranges will be routed to the new server. So you don't actually need to move the data; you take advantage of the Amazon RDS server promotion feature.
You can then make a replica of the original server. You create a unique ID as: shard ID + table type ID + incremental number. So when you query the database, you know which shard to go to and fetch the data from.
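As a rough sketch of that ID scheme (the field widths, separator, and virtual shard count below are my own assumptions for illustration):

```python
# Sketch of the composite ID described above:
# virtual shard id + table type id + incremental number.

VIRTUAL_SHARDS = 1024  # illustrative; pick far more than your physical servers

def virtual_shard(incremental_id: int) -> int:
    # Modulo on the incremental id spreads rows evenly across virtual shards.
    return incremental_id % VIRTUAL_SHARDS

def make_id(table_type: int, incremental_id: int) -> str:
    return f"{virtual_shard(incremental_id):04d}-{table_type:02d}-{incremental_id}"

def parse_id(composite: str) -> tuple[int, int, int]:
    shard, table_type, inc = composite.split("-")
    return int(shard), int(table_type), int(inc)

# The shard id is readable straight from the ID, so a query knows which
# database to hit without a central lookup:
print(make_id(table_type=1, incremental_id=123456))  # -> 0576-01-123456
print(parse_id("0576-01-123456"))                    # -> (576, 1, 123456)
```

A config file then maps each virtual shard (or range of them) to a physical server, and that mapping is what you repoint during a promotion.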
I don't know how to do this with RavenDB, but it can work pretty well with Amazon RDS, because Amazon already provides you with replication and server promotion features.
I agree that there should be a solution that offers seamless scalability right from the start, rather than telling the developer to sort the problems out when they occur. Furthermore, I've found that many NoSQL solutions that distribute data evenly across shards need to run within a low-latency cluster, so you have to take that into consideration. I tried using Couchbase with two different EC2 machines (not in a dedicated Amazon cluster) and data balancing was very, very slow. That adds to the overall cost too.
I also want to add that Pinterest solved its scalability issues using 4096 virtual shards.
You should also look into the paging issues of many NoSQL databases. With this approach you can page data quite easily, but maybe not in the most efficient way, because you might need to query several databases. Another problem is changing the schema. Pinterest solved this by putting all the data in a JSON blob in MySQL; when you want to add a new column, you create a new table with the new column's data + key, and you can put an index on that column. If you need to query the data by email, for example, you create another table with the emails + IDs and put an index on the email column. Counters, by which I mean atomic counters, are another problem, so it's better to take those counters out of the JSON and put them in a column where you can increment the counter value.
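The cross-shard paging mentioned above might look like the following sketch. The in-memory lists stand in for per-shard databases, and fetch_page is a hypothetical placeholder for a real sorted, limited query:

```python
import heapq

# Naive cross-shard paging sketch: ask every shard for a full page,
# merge the pre-sorted results, keep the global top page_size rows.

SHARDS = {
    "shard1": [(1, "a"), (4, "d"), (5, "e")],
    "shard2": [(2, "b"), (3, "c"), (6, "f")],
}

def fetch_page(shard: str, limit: int):
    # Placeholder: a real implementation would run a query against the
    # shard's database, sorted by the paging key and limited to `limit`.
    return SHARDS[shard][:limit]

def paged_query(shards, page_size: int):
    per_shard = [fetch_page(s, page_size) for s in shards]
    merged = heapq.merge(*per_shard, key=lambda row: row[0])
    return [row for _, row in zip(range(page_size), merged)]

print(paged_query(["shard1", "shard2"], page_size=4))
# -> [(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd')]
```

This is correct but fetches len(shards) * page_size rows per page, which is exactly the inefficiency mentioned above.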
There are great solutions out there, but at the end of the day you find out that they can be very expensive. I preferred spending the time to build my own sharding solution and save myself the headache later on. If you choose the other path, there are plenty of companies waiting for you to get into trouble and then asking for quite a lot of money to solve your problems, because at the moment you need them, they know you will pay anything to make your project work again. That's from my own experience, and it's why I am breaking my head building my own sharding solution using your approach, which will also be much cheaper.
Another option is to use a middleware solution for MySQL, like ScaleBase or DBshards. That way you can continue working with MySQL, but when the time comes to scale, they have well-proven solutions, and the costs might be much lower than the alternative.
Another tip: when you create your config for shards, give each shard a write_lock attribute that accepts true or false. When it is true, data won't be written to that shard, so when you fetch the list of shards for a specific table type (i.e. users), writes go only to the other shards of the same type. This is also good for backups: you can show visitors a friendly error while all the shards are locked to take a point-in-time snapshot of all the data. Although I think with Amazon RDS you can send a global request to snapshot all the databases and use point-in-time backup.
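A minimal sketch of that write_lock config (the shard names and the shape of the config are illustrative assumptions):

```python
# Per-shard write_lock flags, as described above. A locked shard still
# serves reads but is skipped for new writes, e.g. during a backup.

SHARD_CONFIG = {
    "users": [
        {"shard": "users-1", "write_lock": False},
        {"shard": "users-2", "write_lock": True},  # locked: backup in progress
    ],
}

def writable_shards(table_type: str) -> list[str]:
    return [s["shard"] for s in SHARD_CONFIG[table_type] if not s["write_lock"]]

# New user rows go only to the unlocked shards:
print(writable_shards("users"))  # -> ['users-1']
```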
The thing is that most companies won't spend time on a DIY sharding solution; they would rather pay for ScaleBase. DIY solutions come from individual developers who can't afford to pay for a scalable solution from the start, but want to rest assured that they will have one by the time they reach the point of needing it. Just look at the prices out there and you can figure out that it will cost you A LOT. I will gladly share my code with you once I'm done. You are going down the best path in my opinion; it all depends on your application logic. I model my database to be simple, with no joins and no complicated aggregation queries, and this solves many of my problems. In the future you can use MapReduce to handle those big-data query needs.