Transferring rapidly changing data between servers
I have server 1, which is generating a large amount of data; e.g., there are files that are constantly being updated, on a timescale of milliseconds.
I would like to get these files onto another server, using C++ or standard Linux methods.
Currently, I have been doing this by compressing the files every second, transferring them with scp, and unpacking them on the other server.
However, the latency of this is very high and I can't get below one second with it.
Can anybody suggest methods I can use to move the data with lower latency?
3 Answers
Just an idea; I don't know if it'll work for your particular situation:
Write two programs. One runs on the server on which your files are being updated and monitors the changes with inotify. The other program runs on the second server and maintains a TCP connection to the first one. Whenever the first program detects a change, it sends the changed part of the file to the second program, which can apply the change to its own copy of the file.
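For illustration, a minimal sketch of the watcher side in C++, assuming the files live under a hypothetical /data directory; the part that reads the changed bytes and writes them to the TCP socket is only stubbed out here:

```cpp
#include <sys/inotify.h>
#include <unistd.h>
#include <cstdio>

int main() {
    int fd = inotify_init1(0);
    if (fd < 0) { perror("inotify_init1"); return 1; }

    // Watch the data directory for modifications and completed writes.
    if (inotify_add_watch(fd, "/data", IN_MODIFY | IN_CLOSE_WRITE) < 0) {
        perror("inotify_add_watch");
        return 1;
    }

    alignas(inotify_event) char buf[4096];
    for (;;) {
        ssize_t len = read(fd, buf, sizeof(buf));
        if (len <= 0) break;
        for (char* p = buf; p < buf + len;) {
            auto* ev = reinterpret_cast<inotify_event*>(p);
            if (ev->len > 0)
                // Stub: here you would read the changed region of the
                // file and write it to the TCP socket instead.
                printf("changed: %s\n", ev->name);
            p += sizeof(inotify_event) + ev->len;
        }
    }
}
```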
Also, if the first server is not actually generating the data for those files, but is reading it from the network, it would be a good idea to just multicast the stream of data to both servers.
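If the data really does arrive over the network, the multicast variant could look roughly like this sender-side sketch; the group address 239.0.0.1 and port 5000 are made up for the example, and each receiving server would join the group with IP_ADD_MEMBERSHIP and read the same datagrams:

```cpp
#include <arpa/inet.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdio>

int main() {
    int sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (sock < 0) { perror("socket"); return 1; }

    sockaddr_in group{};
    group.sin_family = AF_INET;
    group.sin_port = htons(5000);
    inet_pton(AF_INET, "239.0.0.1", &group.sin_addr);

    // One sendto() reaches every server that joined the group,
    // so the producer never sends a per-server copy.
    const char payload[] = "chunk of the incoming stream";
    sendto(sock, payload, sizeof(payload), 0,
           reinterpret_cast<sockaddr*>(&group), sizeof(group));
    close(sock);
}
```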
On Linux you can use DRBD and a cluster file system like GFS2 to have a partition transparently replicated between the two servers.
Another option would be to use rsync.
A Perl script that uses inotify to detect changes on the filesystem and rsync over SSH to resynchronize the remote copies:
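The script itself isn't reproduced above, but a rough C++ equivalent of the idea might look like the following, with /data and the destination user@server2:/data/ as placeholder values; it simply waits for an inotify event and then lets rsync ship whatever changed:

```cpp
#include <sys/inotify.h>
#include <unistd.h>
#include <cstdlib>

int main() {
    int fd = inotify_init1(0);
    inotify_add_watch(fd, "/data", IN_CLOSE_WRITE | IN_MOVED_TO);

    char buf[4096];
    // One rsync pass per burst of events; -az compresses the transfer
    // and SSH provides the transport, as in the answer above.
    while (read(fd, buf, sizeof(buf)) > 0)
        std::system("rsync -az /data/ user@server2:/data/");
}
```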
Update: @user788171, in response to your question:
It may or may not be; there are too many unknowns:
But trying it is cheap, so I suggest you do it. If it is not enough, you can then try to identify the bottlenecks and eliminate them.
For instance, rsync is a chatty protocol, very sensitive to network latency, so if your files are small, scp may produce better results. Or you could keep a local copy of the last version transmitted for every file and send just the deltas. If CPU is the bottleneck, rewrite it in C++, eliminate SSH, etc.
And if that approach turns out to be a dead end, you can still:
Do it at the OS level, using DRBD or some other transparent replication mechanism. You can even try to implement it yourself using FUSE.
Modify your main application to write a log of the changes that can be streamed easily to the other side, as in the sketch below.
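As a hedged sketch of that last idea, the producing application could append each update as a length-prefixed record to a TCP socket, and the receiver would replay the records into its own copy; the host 10.0.0.2, port 6000, and the record format are all assumptions for the example:

```cpp
#include <arpa/inet.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdint>
#include <string>

// Send one change record, prefixed with its length in network byte order.
void send_change(int sock, const std::string& record) {
    uint32_t len = htonl(static_cast<uint32_t>(record.size()));
    write(sock, &len, sizeof(len));
    write(sock, record.data(), record.size());
}

int main() {
    int sock = socket(AF_INET, SOCK_STREAM, 0);
    sockaddr_in dst{};
    dst.sin_family = AF_INET;
    dst.sin_port = htons(6000);
    inet_pton(AF_INET, "10.0.0.2", &dst.sin_addr);
    connect(sock, reinterpret_cast<sockaddr*>(&dst), sizeof(dst));

    // Example record; the real format would describe the file, offset,
    // and bytes that changed.
    send_change(sock, "file1:offset=0:new bytes");
    close(sock);
}
```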