Backing up Cassandra from one node to another
I'm new to Cassandra and Gremlin. I am using Gremlin to write data to Cassandra and retrieve it. I want to take a backup and restore it on a new node. I took a snapshot using nodetool. I am also using Elasticsearch for indexing. Please point me to some links or documentation.
Comments (2)
I used the second approach from this post: How do I replicate a Cassandra's local node for other Cassandra's remote node?
If the structure of the tables is the same, you could create two bash scripts like the ones below:
1. Export the data using these commands:
2. Import the data:
If the process turns out to be slow, please check this other post: Cassandra's sstableloader too slow in import data
Important: You should adapt this information to your own situation.
I followed the steps below and the restore worked.
For backup
Go to the data directory with cd /var/lib/cassandra/data
Then take the snapshots using the commands below
nodetool snapshot janusgraph -cf edgestore -t edgestore_mar6
nodetool snapshot janusgraph -cf graphindex -t graphindex_mar6
I backed up all the folders present in the janusgraph directory under /var/lib/cassandra/data.
Now move to the keyspace folder with cd /var/lib/cassandra/data/janusgraph and run ls -lrth.
The most recently modified folders are listed at the bottom. Go into each of those folders and then into the snapshots folder inside it.
e.g.
cd /var/lib/cassandra/data/janusgraph/graphindex-8e147200236f11edbecf211c2dd12670/snapshots
and copy graphindex_mar6 from there to a new directory.
I repeated this for all the other folders under the keyspace directory janusgraph, copying every snapshot folder with today's date to the new directory, and then compressed the new directory with tar:
tar cvzf janusgraph_mar6.tar.gz janusgraph
Here janusgraph is the directory I created, into which I copied the snapshots of all the folders under the keyspace directory janusgraph.
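The manual copy above could be scripted roughly as follows. This is only a sketch under assumptions: collect_snapshots is a hypothetical helper (not a Cassandra tool), and it assumes every table was snapshotted with one shared tag (for example with nodetool snapshot -t mar6 janusgraph) rather than one tag per table. It names each copied folder after its table, so the rename step during restore would not be needed for folders gathered this way.

```shell
#!/usr/bin/env bash
# Sketch: gather every table's snapshot for one tag into a staging directory.
# collect_snapshots is a hypothetical helper name, not part of Cassandra.
collect_snapshots() {
    local data_dir=$1 keyspace=$2 tag=$3 dest=$4
    local snap table_dir table
    for snap in "$data_dir/$keyspace"/*/snapshots/"$tag"; do
        [ -d "$snap" ] || continue            # no match, or not a directory
        # <data_dir>/<keyspace>/<table>-<id>/snapshots/<tag> -> "<table>-<id>"
        table_dir=$(basename "$(dirname "$(dirname "$snap")")")
        table=${table_dir%-*}                 # drop the trailing "-<id>" part
        mkdir -p "$dest/$keyspace/$table"
        cp -R "$snap"/. "$dest/$keyspace/$table"/
    done
}

# Example call (the tag and staging path are assumptions):
#   collect_snapshots /var/lib/cassandra/data janusgraph mar6 /tmp/backup
#   tar -czf janusgraph_mar6.tar.gz -C /tmp/backup janusgraph
```

The resulting layout is janusgraph/<table>/, which matches the keyspace/table path shape used in the sstableloader commands in this answer.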
For restoring
Copy the janusgraph_mar6.tar.gz archive to the remote machine where you want to restore the data.
Extract the janusgraph folder from it:
tar xvzf janusgraph_mar6.tar.gz
Then, inside the janusgraph folder, rename each snapshot folder to its table name,
e.g. edgestore_mar6 to edgestore
mv edgestore_mar6 edgestore
and graphindex_mar6 to graphindex
mv graphindex_mar6 graphindex
Repeat this for all the folders.
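If there are many folders, the renames could be done in a loop. A minimal sketch, assuming every folder ends with the same suffix (here _mar6); strip_suffix is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Sketch: rename every "<table><suffix>" folder in a directory to "<table>".
# strip_suffix is a hypothetical helper name.
strip_suffix() {
    local dir=$1 suffix=$2
    local path name target
    for path in "$dir"/*"$suffix"; do
        [ -d "$path" ] || continue            # no match, or not a directory
        name=$(basename "$path")
        target="$dir/${name%"$suffix"}"
        if [ -e "$target" ]; then
            echo "skipping $path: $target already exists" >&2
            continue
        fi
        mv "$path" "$target"
    done
}

# Example call (the path is an assumption):
#   strip_suffix /home/ubuntu/janusgraph _mar6
```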
Then restore using the commands
sstableloader -d cassandra-ip /home/ubuntu/janusgraph/graphindex/
sstableloader -d cassandra-ip /home/ubuntu/janusgraph/edgestore/
Here cassandra-ip is the address of the target node, which you can get by running nodetool status. Run the same command for all the other folders and then restart Cassandra:
sudo service cassandra restart
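Running sstableloader for every folder could also be wrapped in a loop. This is a sketch, not a tested procedure: load_tables and the DRY_RUN variable are hypothetical names, and it assumes the extracted directory is laid out as janusgraph/<table>/, since sstableloader takes the keyspace and table names from the last two parts of the path. As far as I know, the keyspace and tables also need to exist on the target cluster before loading.

```shell
#!/usr/bin/env bash
# Sketch: run sstableloader once per table folder under a keyspace directory.
# load_tables and DRY_RUN are hypothetical names used for illustration.
load_tables() {
    local host=$1 keyspace_dir=$2
    local table_dir
    for table_dir in "$keyspace_dir"/*/; do
        [ -d "$table_dir" ] || continue       # no subfolders found
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Only print what would be run
            echo "sstableloader -d $host $table_dir"
        else
            sstableloader -d "$host" "$table_dir" || return 1
        fi
    done
}

# Example call (address and path are placeholders):
#   DRY_RUN=1 load_tables cassandra-ip /home/ubuntu/janusgraph
```

Doing a run with DRY_RUN=1 first shows the exact commands before anything is streamed to the cluster.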
My data was restored.
Since I use Elasticsearch for indexing in my backend, I ran my reindexing script in the Gremlin console after the restore.