High Availability Storage

Published 2024-07-05 13:38:07 · 1476 characters · 7 views

Comments (10)

故事和酒 2024-07-12 13:38:07

You can look at Mirror File System. It does file replication at the file system level.
The same file on both the primary and backup systems is a live file.

http://www.linux-ha.org/RelatedTechnologies/Filesystems

眼趣 2024-07-12 13:38:07

May I suggest you visit the F5 site and check out http://www.f5.com/solutions/virtualization/file/

海未深 2024-07-12 13:38:07

I assume from the body of your question that you're a business user? I purchased a 6TB RAID 5 unit from Silicon Mechanics, have it NAS-attached, and my engineer installed NFS on our servers. Backups are performed via rsync to another large-capacity NAS.

风苍溪 2024-07-12 13:38:07

Your best bet may be to work with experts who do this sort of thing for a living. These guys are actually in our office complex... I've had a chance to work with them on a similar project that I led.

http://www.deltasquare.com/About

半世蒼涼 2024-07-12 13:38:07

Have a look at Amazon Simple Storage Service (Amazon S3)

http://www.amazon.com/S3-AWS-home-page-Money/b/ref=sc_fe_l_2?ie=UTF8&node=16427261&no=3435361&me=A36L942TSJ2AJA

--
This may be of interest re: high availability.

Dear AWS Customer:

Many of you have asked us to let you know ahead of time about features and services that are currently under development so that you can better plan for how that functionality might integrate with your applications. To that end, we are excited to share some early details with you about a new offering we have under development here at AWS -- a content delivery service.

This new service will provide you a high performance method of distributing content to end users, giving your customers low latency and high data transfer rates when they access your objects. The initial release will help developers and businesses who need to deliver popular, publicly readable content over HTTP connections. Our goal is to create a content delivery service that:

  • Lets developers and businesses get started easily - there are no minimum fees and no commitments. You will only pay for what you actually use.
  • Is simple and easy to use - a single, simple API call is all that is needed to get started delivering your content.
  • Works seamlessly with Amazon S3 - this gives you durable storage for the original, definitive versions of your files while making the content delivery service easier to use.
  • Has a global presence - we use a global network of edge locations on three continents to deliver your content from the most appropriate location.

You'll start by storing the original version of your objects in Amazon S3, making sure they are publicly readable. Then, you'll make a simple API call to register your bucket with the new content delivery service. This API call will return a new domain name for you to include in your web pages or application. When clients request an object using this domain name, they will be automatically routed to the nearest edge location for high performance delivery of your content. It's that simple.

We're currently working with a small group of private beta customers, and expect to have this service widely available before the end of the year. If you'd like to be notified when we launch, please let us know by clicking here.

Sincerely,

The Amazon Web Services Team

奈何桥上唱咆哮 2024-07-12 13:38:07

Are you looking for an "enterprise" solution or a "home" solution? It is hard to tell from your question, because 2TB is very small for an enterprise and a little on the high end for a home user (especially two servers). Could you clarify the need so we can discuss tradeoffs?

天生の放荡 2024-07-12 13:38:07

There are two ways to go at this. The first is to just go buy a SAN or a NAS from Dell or HP and throw money at the problem. Modern storage hardware makes all of this easy to do, saving your expertise for more core problems.

If you want to roll your own, take a look at using Linux with DRBD.

http://www.drbd.org/

DRBD allows you to create networked block devices. Think RAID 1 across two servers instead of just two disks. DRBD deployments are usually done using Heartbeat for failover in case one system dies.

I'm not sure about load balancing, but you might investigate and see if LVS can be used to load balance across your DRBD hosts:

http://www.linuxvirtualserver.org/
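Purely for illustration - a minimal ipvsadm sketch of what load balancing the two hosts might look like, assuming a floating IP of 172.20.1.230 and NFS over TCP on port 2049 (the addresses are borrowed from the DRBD answer below, and whether NFS behaves sanely behind LVS is exactly the thing you'd need to investigate):

# define a virtual service on the floating IP, round-robin scheduling
ipvsadm -A -t 172.20.1.230:2049 -s rr
# register both DRBD hosts as real servers, using NAT/masquerade forwarding
ipvsadm -a -t 172.20.1.230:2049 -r 172.20.1.218:2049 -m
ipvsadm -a -t 172.20.1.230:2049 -r 172.20.1.219:2049 -m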

To conclude, let me just reiterate that you're probably going to save yourself a lot of time in the long run just forking out the money for a NAS.

无法言说的痛 2024-07-12 13:38:07

I would recommend NAS (Network Attached Storage).

HP has some nice ones you can choose from.

http://h18006.www1.hp.com/storage/aiostorage.html

as well as clustered versions:

http://h18006.www1.hp.com/storage/software/clusteredfs/index.html?jumpid=reg_R1002_USEN

丘比特射中我 2024-07-12 13:38:07

These days 2TB fits in one machine, so you've got options, from simple to complex. These all presume Linux servers:

  • You can get poor-man's HA by setting up two machines and doing a periodic rsync from the main one to the backup (see the sketch after this list).
  • You can use DRBD to mirror one server to the other at the block level. This has the disadvantage of being somewhat difficult to expand in the future.
  • You can use OCFS2 to cluster the disks instead, for future expandability.
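A minimal sketch of the poor-man's option mentioned above, assuming the data lives under /export and the standby machine is reachable as "backup" (both are placeholders):

# mirror the primary's data to the standby; -a preserves ownership/times, --delete keeps the copies identical
rsync -a --delete /export/ backup:/export/
# e.g. run from cron on the primary every 15 minutes:
# */15 * * * * rsync -a --delete /export/ backup:/export/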

There are also plenty of commercial solutions, but 2TB is a bit small for most of them these days.

You haven't mentioned your application yet, but if hot failover isn't necessary, and all you really want is something that will stand up to losing a disk or two, find a NAS that supports RAID 5, at least 4 drives, and hot-swap, and you should be good to go.

滿滿的愛 2024-07-12 13:38:07

I've recently deployed hanfs using DRBD as the backend; in my situation I'm running active/standby mode, but I've also tested it successfully using OCFS2 in primary/primary mode. Unfortunately there isn't much documentation out there on how best to achieve this, and most of what exists is barely useful at best. If you do go down the drbd route, I highly recommend joining the drbd mailing list and reading all of the documentation. Here's my ha/drbd setup, along with the script I wrote to handle ha failures:


DRBD8 is required - this is provided by drbd8-utils and drbd8-source. Once these are installed (I believe they're provided by backports), you can use module-assistant to install it - m-a a-i drbd8. Either depmod -a or reboot at this point, if you depmod -a, you'll need to modprobe drbd.
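Spelled out as shell commands, the install steps above look something like this (a sketch, assuming the Debian packages named above):

apt-get install drbd8-utils drbd8-source module-assistant
m-a a-i drbd8      # module-assistant: auto-build and install the drbd8 kernel module
depmod -a          # or simply reboot instead
modprobe drbd      # only needed if you took the depmod route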

You'll require a backend partition to use for drbd, do not make this partition LVM, or you'll hit all sorts of problems. Do not put LVM on the drbd device or you'll hit all sorts of problems.

Hanfs1's /etc/drbd.conf:

global {
        usage-count no;
}
common {
        protocol C;
        disk { on-io-error detach; }
}
resource export {
        syncer {
                rate 125M;
        }
        on hanfs2 {
                address         172.20.1.218:7789;
                device          /dev/drbd1;
                disk            /dev/sda3;
                meta-disk       internal;
        }
        on hanfs1 {
                address         172.20.1.219:7789;
                device          /dev/drbd1;
                disk            /dev/sda3;
                meta-disk       internal;
       }
}

Hanfs2's /etc/drbd.conf is identical - drbd.conf describes the whole resource, and each node picks out its own "on <hostname>" section:


global {
        usage-count no;
}
common {
        protocol C;
        disk { on-io-error detach; }
}
resource export {
        syncer {
                rate 125M;
        }
        on hanfs2 {
                address         172.20.1.218:7789;
                device          /dev/drbd1;
                disk            /dev/sda3;
                meta-disk       internal;
        }
        on hanfs1 {
                address         172.20.1.219:7789;
                device          /dev/drbd1;
                disk            /dev/sda3;
                meta-disk       internal;
       }
}

Once configured, we next need to bring up drbd - these commands are run on both nodes:

drbdadm create-md export
drbdadm attach export
drbdadm connect export

We must now perform an initial synchronization of data - obviously, if this is a brand new drbd cluster, it doesn't matter which node you choose.

Once done, you'll need to mkfs.yourchoiceoffilesystem on your drbd device - the device in our config above is /dev/drbd1. http://www.drbd.org/users-guide/p-work.html is a useful document to read while working with drbd.
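A sketch of those two steps, run on whichever node you picked, using drbd8's syntax for forcing the first full sync and assuming ext3 (use whatever filesystem you prefer):

drbdadm -- --overwrite-data-of-peer primary export   # make this node Primary and start the initial sync
mkfs.ext3 /dev/drbd1                                 # create the filesystem on the drbd device
mkdir -p /export
mount /dev/drbd1 /export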

Heartbeat

Install heartbeat2. (Pretty simple, apt-get install heartbeat2).

/etc/ha.d/ha.cf on each machine should consist of:

hanfs1:


logfacility local0
keepalive 2
warntime 10
deadtime 30
initdead 120

ucast eth1 172.20.1.218

auto_failback no

node hanfs1
node hanfs2

hanfs2:


logfacility local0
keepalive 2
warntime 10
deadtime 30
initdead 120

ucast eth1 172.20.1.219

auto_failback no

node hanfs1
node hanfs2

/etc/ha.d/haresources should be the same on both ha boxes:

hanfs1 IPaddr::172.20.1.230/24/eth1
hanfs1  HeartBeatWrapper

I wrote a wrapper script to deal with the idiosyncrasies caused by nfs and drbd in a failover scenario. This script should exist within /etc/ha.d/resource.d/ on each machine.

#!/bin/bash

# heartbeat fails hard.
# so this is a wrapper
# to get around that stupidity
# I'm just wrapping the heartbeat scripts, except for in the case of umount
# as they work, mostly

if [[ -e /tmp/heartbeatwrapper ]]; then
    runningpid=$(cat /tmp/heartbeatwrapper)
    if [[ -z $(ps --no-heading -p $runningpid) ]]; then
        echo "PID found, but process seems dead. Continuing."
    else
        echo "PID found, process is alive, exiting."
        exit 7
    fi
fi

echo $$ > /tmp/heartbeatwrapper    # record our PID for the guard above

if [[ x$1 == "xstop" ]]; then

/etc/init.d/nfs-kernel-server stop #>/dev/null 2>&1

# NFS init script isn't LSB compatible, exit codes are 0 no matter what happens.
# Thanks guys, you really make my day with this bullshit.
# Because of the above, we just have to hope that nfs actually catches the signal
# to exit, and manages to shut down its connections.
# If it doesn't, we'll kill it later, then term any other nfs stuff afterwards.
# I found this to be an interesting insight into just how badly NFS is written.

sleep 1

#we don't want to shutdown nfs first!
#The lock files might go away, which would be bad.

#The above seems to not matter much, the only thing I've determined
#is that if you have anything mounted synchronously, it's going to break
#no matter what I do.  Basically, sync == screwed; in NFSv3 terms.      
#End result of failing over while a client that's synchronous is that   
#the client hangs waiting for its nfs server to come back - thing doesn't
#even bother to time out, or attempt a reconnect.                        
#async works as expected - it insta-reconnects as soon as a connection seems
#to be unstable, and continues to write data.  In all tests, md5sums have   
#remained the same with/without failover during transfer.                   

#So, we first unmount /export - this prevents drbd from having a shit-fit
#when we attempt to turn this node secondary.                            

#That's a lie too, to some degree. LVM is entirely to blame for why DRBD
#was refusing to unmount.  Don't get me wrong, having /export mounted doesn't
#help either, but still.                                                     
#fix a usecase where one or other are unmounted already, which causes us to terminate early.

if [[ "$(grep -o /varlibnfs/rpc_pipefs /etc/mtab)" ]]; then                                 
    for ((test=1; test <= 10; test++)); do                                                  
        umount /export/varlibnfs/rpc_pipefs  >/dev/null 2>&1                                
        if [[ -z $(grep -o /varlibnfs/rpc_pipefs /etc/mtab) ]]; then                        
            break                                                                           
        fi                                                                                  
        if [[ $? -ne 0 ]]; then                                                             
            #try again, harder this time                                                    
            umount -l /var/lib/nfs/rpc_pipefs  >/dev/null 2>&1                              
            if [[ -z $(grep -o /varlibnfs/rpc_pipefs /etc/mtab) ]]; then                    
                break                                                                       
            fi                                                                              
        fi                                                                                  
    done                                                                                    
    if [[ $test -gt 10 ]]; then    # all 10 attempts failed (the loop counter ends at 11)
        rm -f /tmp/heartbeatwrapper                                                         
        echo "Problem unmounting rpc_pipefs"                                                
        exit 1                                                                              
    fi                                                                                      
fi                                                                                          

if [[ "$(grep -o /dev/drbd1 /etc/mtab)" ]]; then                                            
    for ((test=1; test <= 10; test++)); do                                                  
        umount /export  >/dev/null 2>&1                                                     
        if [[ -z $(grep -o /dev/drbd1 /etc/mtab) ]]; then                                   
            break                                                                           
        fi                                                                                  
        if [[ $? -ne 0 ]]; then                                                             
            #try again, harder this time                                                    
            umount -l /export  >/dev/null 2>&1                                              
            if [[ -z $(grep -o /dev/drbd1 /etc/mtab) ]]; then                               
                break                                                                       
            fi                                                                              
        fi                                                                                  
    done                                                                                    
    if [[ $test -gt 10 ]]; then    # all 10 attempts failed (the loop counter ends at 11)
        rm -f /tmp/heartbeatwrapper                                                         
        echo "Problem unmount /export"                                                      
        exit 1                                                                              
    fi                                                                                      
fi                                                                                          


#now, it's important that we shut down nfs. it can't write to /export anymore, so that's fine.
#if we leave it running at this point, then drbd will screwup when trying to go to secondary.  
#See contradictory comment above for why this doesn't matter anymore.  These comments are left in
#entirely to remind me of the pain this caused me to resolve.  A bit like why churches have Jesus
#nailed onto a cross instead of chilling in a hammock.                                           

pidof nfsd | xargs kill -9 >/dev/null 2>&1

sleep 1                                   

if [[ -n $(ps aux | grep nfs | grep -v grep) ]]; then
    echo "nfs still running, trying to kill again"   
    pidof nfsd | xargs kill -9 >/dev/null 2>&1       
fi                                                   

sleep 1

/etc/init.d/nfs-kernel-server stop #>/dev/null 2>&1

sleep 1

#next we need to tear down drbd - easy with the heartbeat scripts
#it takes input as resourcename start|stop|status                
#First, we'll check to see if it's stopped                       

/etc/ha.d/resource.d/drbddisk export status >/dev/null 2>&1
if [[ $? -eq 2 ]]; then                                    
    echo "resource is already stopped for some reason..."  
else                                                       
    for ((i=1; i <= 10; i++)); do                          
        /etc/ha.d/resource.d/drbddisk export stop >/dev/null 2>&1
        if [[ $(egrep -o "st:[A-Za-z/]*" /proc/drbd | cut -d: -f2) == "Secondary/Secondary" ]] || [[ $(egrep -o "st:[A-Za-z/]*" /proc/drbd | cut -d: -f2) == "Secondary/Unknown" ]]; then                                                                                                                             
            echo "Successfully stopped DRBD"                                                                                                             
            break                                                                                                                                        
        else                                                                                                                                             
            echo "Failed to stop drbd for some reason"                                                                                                   
            cat /proc/drbd                                                                                                                               
            if [[ $i -eq 10 ]]; then                                                                                                                     
                    exit 50                                                                                                                              
            fi                                                                                                                                           
        fi                                                                                                                                               
    done                                                                                                                                                 
fi                                                                                                                                                       

rm -f /tmp/heartbeatwrapper                                                                                                                              
exit 0                                                                                                                                                   

elif [[ x$1 == "xstart" ]]; then

#start up drbd first
/etc/ha.d/resource.d/drbddisk export start >/dev/null 2>&1
if [[ $? -ne 0 ]]; then                                   
    echo "Something seems to have broken. Let's check possibilities..."
    testvar=$(egrep -o "st:[A-Za-z/]*" /proc/drbd | cut -d: -f2)       
    if [[ $testvar == "Primary/Unknown" ]] || [[ $testvar == "Primary/Secondary" ]]
    then                                                                           
        echo "All is fine, we are already the Primary for some reason"             
    elif [[ $testvar == "Secondary/Unknown" ]] || [[ $testvar == "Secondary/Secondary" ]]
    then                                                                                 
        echo "Trying to assume Primary again"                                            
        /etc/ha.d/resource.d/drbddisk export start >/dev/null 2>&1                       
        if [[ $? -ne 0 ]]; then                                                          
            echo "I give up, something's seriously broken here, and I can't help you to fix it."
            rm -f /tmp/heartbeatwrapper                                                         
            exit 127                                                                            
        fi                                                                                      
    fi                                                                                          
fi                                                                                              

sleep 1                                                                                         

#now we remount our partitions                                                                  

for ((test=1; test <= 10; test++)); do                                                          
    mount /dev/drbd1 /export >/tmp/mountoutput                                                  
    if [[ -n $(grep -o export /etc/mtab) ]]; then                                               
        break                                                                                   
    fi                                                                                          
done                                                                                            

if [[ $test -gt 10 ]]; then    # all 10 mount attempts failed (the loop counter ends at 11)
    rm -f /tmp/heartbeatwrapper                                                                 
    exit 125                                                                                    
fi                                                                                              


#I'm really unsure at this point of the side-effects of not having rpc_pipefs mounted.          
#The issue here, is that it cannot be mounted without nfs running, and we don't really want to start
#nfs up at this point, lest it ruin everything.                                                     
#For now, I'm leaving mine unmounted, it doesn't seem to cause any problems.                        

#Now we start up nfs.

/etc/init.d/nfs-kernel-server start >/dev/null 2>&1
if [[ $? -ne 0 ]]; then
    echo "There's not really that much that I can do to debug nfs issues."
    echo "probably your configuration is broken.  I'm terminating here."
    rm -f /tmp/heartbeatwrapper
    exit 129
fi

#And that's it, done.

rm -f /tmp/heartbeatwrapper
exit 0

elif [[ "x$1" == "xstatus" ]]; then

#Lets check to make sure nothing is broken.

#DRBD first
/etc/ha.d/resource.d/drbddisk export status >/dev/null 2>&1
if [[ $? -ne 0 ]]; then
    echo "stopped"
    rm -f /tmp/heartbeatwrapper
    exit 3
fi

#mounted?
grep -q drbd /etc/mtab >/dev/null 2>&1
if [[ $? -ne 0 ]]; then
    echo "stopped"
    rm -f /tmp/heartbeatwrapper
    exit 3
fi

#nfs running?
/etc/init.d/nfs-kernel-server status >/dev/null 2>&1
if [[ $? -ne 0 ]]; then
    echo "stopped"
    rm -f /tmp/heartbeatwrapper
    exit 3
fi

echo "running"
rm -f /tmp/heartbeatwrapper
exit 0

fi

With all of the above done, you'll then just want to configure /etc/exports:

/export 172.20.1.0/255.255.255.0(rw,sync,fsid=1,no_root_squash)
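From a client, the export is then mounted against the floating IP defined in haresources. A sketch - note the observation in the script comments above (sync mounts hang across a failover, async reconnects cleanly):

mount -t nfs -o async 172.20.1.230:/export /mnt/export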

Then it's just a case of starting up heartbeat on both machines and issuing hb_takeover on one of them. You can test that it's working by making sure the one you issued the takeover on is primary - check /proc/drbd, that the device is mounted correctly, and that you can access nfs.
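For example (an assumption on paths - hb_takeover ships with heartbeat, commonly under /usr/share/heartbeat/ on Debian-family systems):

/usr/share/heartbeat/hb_takeover      # pull the resources onto this node
cat /proc/drbd                        # expect st:Primary/Secondary (or Primary/Unknown)
grep /dev/drbd1 /etc/mtab             # /export should be mounted
showmount -e localhost                # the /export share should be listed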

--

Best of luck man. Setting it up from the ground up was, for me, an extremely painful experience.
