How safe is MongoDB's safe mode on inserts?

Published on 2024-11-28 23:49:56


I am working on a project which has some important data in it. This means we cannot lose any of it if the power or the server goes down. We are using MongoDB for the database. I'd like to be sure that my data is in the database after the insert, and to roll back the whole batch if one element was not inserted. I know the philosophy behind Mongo is that we do not need transactions, but how can I make sure that my data is really safely stored after an insert rather than sent to some "black hole"?

  • Should I run a search after the insert to verify the data is there?

  • Should I use some specific MongoDB commands?

  • Should I use sharding even if one server is enough to satisfy the
    speed requirement? By the way, sharding doesn't guarantee anything
    if the power goes down.

What is the best solution?


Comments (2)

梦纸 2024-12-05 23:49:56


Your best bet is to use Write Concerns - these allow you to tell MongoDB how important a piece of data is. The quickest Write Concern is also the least safe - the data is not flushed to disk until the next scheduled flush. The safest will confirm that the data has been written to disk on a number of machines before returning.

The write concern you are looking for is FSYNC_SAFE (at least that is what it is called from the point of view of the Java driver) or REPLICAS_SAFE which confirms that your data has been replicated.
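In concrete terms, safe mode works by the driver sending a getLastError command after each write, and the different write concerns are just different options on that command. Below is a minimal pure-Python sketch of those command documents; the helper function is invented for illustration (it is not part of any driver API), but the w/j/fsync/wtimeout option names are the real getLastError parameters from this era of MongoDB.

```python
# Sketch: how the old "safe mode" levels map onto getLastError options.
# The helper is illustrative only; the option names are the real
# getLastError parameters (w, j, fsync, wtimeout).

def get_last_error_cmd(w=1, journal=False, fsync=False, wtimeout_ms=None):
    """Build the getLastError command document a driver would send
    after a write to get the requested level of acknowledgement."""
    cmd = {"getlasterror": 1, "w": w}
    if journal:
        cmd["j"] = True           # wait for the write to reach the journal
    if fsync:
        cmd["fsync"] = True       # wait for a flush to disk
    if wtimeout_ms is not None:
        cmd["wtimeout"] = wtimeout_ms  # stop waiting after this long
    return cmd

# Fire-and-forget (least safe) would skip getLastError entirely.
# A plain "safe" write: the server acknowledged the write.
safe = get_last_error_cmd(w=1)
# Roughly FSYNC_SAFE: also wait for the data to reach disk.
fsync_safe = get_last_error_cmd(w=1, fsync=True)
# Roughly REPLICAS_SAFE: wait for 2 nodes, with a timeout.
replicas_safe = get_last_error_cmd(w=2, wtimeout_ms=2000)
```

The trade-off is visible in the options: each extra flag adds latency to every write in exchange for a stronger durability guarantee.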

Bear in mind that MongoDB does not have transactions in the traditional sense - your rollback will have to be done by hand, as you can't tell the Mongo database to do this for you.
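To make that hand-rolled rollback concrete: the application has to remember the _ids it has inserted so far and delete them if a later insert fails. The sketch below runs against an in-memory stand-in for a collection (FakeCollection is invented for illustration; with a real driver, the insert/remove calls would be safe-mode driver operations). Note that this is still not atomic: a crash in the middle of the cleanup loop leaves a partial batch behind.

```python
# Hand-rolled "rollback" for a batch insert, sketched against an
# in-memory stand-in for a collection. With a real driver you would
# replace insert/remove with safe-mode insert and remove calls.

class FakeCollection:
    def __init__(self):
        self.docs = {}

    def insert(self, doc):
        if doc["_id"] in self.docs:
            raise ValueError("duplicate _id: %r" % doc["_id"])
        self.docs[doc["_id"]] = doc

    def remove(self, _id):
        self.docs.pop(_id, None)

def insert_batch_or_rollback(coll, docs):
    """Insert all docs; on any failure, delete the ones already
    inserted and re-raise, so the batch is all-or-nothing."""
    inserted = []
    try:
        for doc in docs:
            coll.insert(doc)
            inserted.append(doc["_id"])
    except Exception:
        for _id in inserted:    # undo what we managed to insert
            coll.remove(_id)
        raise

coll = FakeCollection()
coll.insert({"_id": 2})  # pre-existing doc that will collide
try:
    insert_batch_or_rollback(coll, [{"_id": 1}, {"_id": 2}, {"_id": 3}])
except ValueError:
    pass
# Only the pre-existing document remains; doc 1 was rolled back.
```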

The other thing you need to do is either use the relatively new --journal option (which uses a Write Ahead Log), or use replica sets to share your data across many machines in order to maximise data integrity in the event of a crash/power loss.

Sharding is not so much a protection against hardware failure as a method for sharing the load when dealing with particularly large datasets - sharding shouldn't be confused with replica sets, which are a way of writing data to more than one disk on more than one machine.

Therefore, if your data is valuable enough, you should definitely be using replica sets, perhaps even siting slaves in other data centres/availability zones/racks/etc in order to provide the resilience you require.

There is/will be (can't remember offhand whether this has been implemented yet) a way to specify the priority of individual nodes in a replica set, such that if the master goes down, the new master that is elected is one in the same data centre, if such a machine is available (i.e. to stop a slave on the other side of the country from becoming master unless it really is the only other option).
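For the record, this was implemented: replica set members carry a priority field (from MongoDB 2.0), and lower-priority members only become primary as a last resort. Here is a sketch of such a configuration as a plain Python dict; the hostnames are made up, and in a real deployment the document would be passed to rs.initiate() or the replSetInitiate command.

```python
# Replica set configuration with member priorities (the "priority"
# field, available from MongoDB 2.0). Hostnames are made up; in a real
# deployment this document would be passed to rs.initiate() or the
# replSetInitiate command.

rs_config = {
    "_id": "rs0",
    "members": [
        # Same-datacentre nodes get higher priority, so one of them
        # is preferred as the new primary in an election.
        {"_id": 0, "host": "dc1-a.example.com:27017", "priority": 2},
        {"_id": 1, "host": "dc1-b.example.com:27017", "priority": 2},
        # The remote node still votes and holds data, but with a low
        # priority it only becomes primary if nothing else is left.
        {"_id": 2, "host": "dc2-a.example.com:27017", "priority": 0.5},
    ],
}

preferred = [m["host"] for m in rs_config["members"] if m["priority"] >= 2]
```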


I received a really nice answer from a person called GVP on Google Groups. I will quote it (it basically builds on Rich's answer):

I'd like to be sure that my data is in the database after the
insert and rollback the whole batch if one element was not inserted.

This is a complex topic and there are several trade-offs you have to
consider here.

Should I use sharding?

Sharding is for scaling writes. For data safety, you want to look at
replica sets.

Should I use some specific mongoDB commands?

First thing to consider is "safe" mode or "getLastError()" as
indicated by Andreas. If you issue a "safe" write, you know that the
database has received the insert and applied the write. However,
MongoDB only flushes to disk every 60 seconds, so the server can fail
without the data on disk.

Second thing to consider is "journaling"
(v1.8+). With journaling turned on, data is flushed to the journal
every 100ms. So you have a smaller window of time before failure. The
drivers have an "fsync" option (check that name) that goes one step
further than "safe": it waits for acknowledgement that the data has
been flushed to the disk (i.e. the journal file). However, this only
covers one server. What happens if the hard drive on the server just
dies? Well you need a second copy.

Third thing to consider is
replication. The drivers support a "W" parameter that says "replicate
this data to N nodes" before returning. If the write does not reach
"N" nodes before a certain timeout, then the write fails (exception
is thrown). However, you have to configure "W" correctly based on the
number of nodes in your replica set. Again, because a hard drive
could fail, even with journaling, you'll want to look at replication.
Then there's replication across data centers which is too long to get
into here. The last thing to consider is your requirement to "roll
back". From my understanding, MongoDB does not have this "roll back"
capacity. If you're doing a batch insert the best you'll get is an
indication of which elements failed.

Here's a link to the PHP driver on this one: http://it.php.net/manual/en/mongocollection.batchinsert.php You'll have to check the details on replication and the W parameter. I believe the same limitations apply here.
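A small simulation of that last point: with a continue-on-error batch insert, the best the application can get back is an indication of which elements failed, and it must decide what to do with them. The dict below stands in for the collection, and the function is invented for illustration, not a driver API.

```python
# Simulation of a continue-on-error batch insert: instead of rolling
# back, the best available outcome is a report of which elements
# failed. An in-memory dict stands in for the collection.

def batch_insert_continue_on_error(store, docs):
    """Insert every doc it can; return the indices of the docs that
    failed (here, a duplicate _id is the only failure mode)."""
    failed = []
    for i, doc in enumerate(docs):
        if doc["_id"] in store:
            failed.append(i)    # record the failure and keep going
        else:
            store[doc["_id"]] = doc
    return failed

store = {"b": {"_id": "b"}}    # pre-existing document
failed = batch_insert_continue_on_error(
    store, [{"_id": "a"}, {"_id": "b"}, {"_id": "c"}])
# failed == [1]: only the duplicate was rejected, the rest went in.
```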
