First, a bit of background. There are many comparisons of distributed version control systems (DVCS) that compare repository size or benchmark the speed of operations. I haven't found any that benchmark the network performance of the various DVCS and of the various protocols they use, beyond measuring the speed of operations (commands) that involve the network, such as 'clone', 'pull'/'fetch' or 'push'.
I'd like to know, then, how you would make such a comparison: how to measure the network performance of an application, or how to benchmark a network protocol. Among other things, I envision measuring how performance depends on both network bandwidth and network latency (ping time); some protocols sacrifice latency, in the form of more round-trip exchanges (negotiation), in order to send the minimal required final "pack".
I would prefer solutions involving only one computer, if possible. I'd like to see open source solutions that work on Linux. But I would also welcome more generic answers.
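One rough way to reason about this trade-off is a first-order cost model: the time for an operation is the number of round trips times the round-trip time, plus the bytes transferred divided by the bandwidth. The sketch below is only an illustration under that assumption. The function name `transfer_time`, the two hypothetical protocols and all the numbers are made up, and effects such as TCP slow start or server-side computation are ignored.

```python
def transfer_time(round_trips, payload_bytes, rtt_s, bandwidth_bytes_per_s):
    """First-order estimate in seconds: latency cost plus bandwidth cost."""
    return round_trips * rtt_s + payload_bytes / bandwidth_bytes_per_s


# Two hypothetical protocols; the numbers are invented for illustration.
CHATTY = {"round_trips": 20, "payload_bytes": 8_000_000}   # more negotiation, smaller pack
SIMPLE = {"round_trips": 2, "payload_bytes": 10_000_000}   # less negotiation, larger pack

# Hypothetical links: (round-trip time in seconds, bandwidth in bytes per second).
LINKS = {
    "slow link, low latency": (0.02, 125_000),
    "fast link, high latency": (0.6, 12_500_000),
}

for label, (rtt_s, bandwidth) in LINKS.items():
    chatty = transfer_time(**CHATTY, rtt_s=rtt_s, bandwidth_bytes_per_s=bandwidth)
    simple = transfer_time(**SIMPLE, rtt_s=rtt_s, bandwidth_bytes_per_s=bandwidth)
    print(f"{label}: chatty {chatty:.2f} s, simple {simple:.2f} s")
```

Under this model the protocol with more negotiation comes out ahead on the slow, low-latency link and behind on the fast, high-latency one, which is why a benchmark would need to vary both parameters.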
Preferred OS: Linux
Preferred languages: C, Perl, shell script
Possible measurements:
- total number of bytes transferred from server to client and from client to server in one session; this can also be used to measure the overhead of the protocol (bandwidth)
- number of round trips (connections) in one transaction (latency)
- dependence of the speed of a network operation (the time it takes to clone/pull/push) on network bandwidth and on network latency (ping time)
How can such measurements (such benchmarks) be made?
Added 02-06-2009:
The simplest benchmark (measurement) would be a network version of the time command, i.e. a command that, when run, would give me the number of bytes transferred and the number of round trips / network connections during the execution of a given command.
Added 09-06-2009:
Imaginary example output for the above-mentioned network version of the time command could look like the following:
$ ntime git clone -q git://git.example.com/repo.git
...
bytes sent: nnn (nn kiB), bytes received: nnn (nn kiB), avg: nn.nn KB/s
nn reads, nn writes
Note that this is only example output, illustrating the kind of information one might want to get.
Added 09-06-2009:
It looks like some of what I want can be achieved using dummynet, a tool (originally) for testing networking protocols...
Comments (2)
A possible answer would be to use SystemTap. Among the example scripts there is nettop, which displays (some of) the required network information in a "top"-like fashion, and there is the iotime script, which shows I/O information in the required form.
If I understand you correctly, you are basically interested in something like Linux 'strace' for network-specific system calls?
Possibly a combination of a profiler and a debugger, for network applications (i.e. 'ntrace'), providing a detailed analysis of various optional measurements?
Under Linux, the strace utility is largely based on functionality that is provided by the Linux kernel, namely the ptrace (process tracing) API.
Using ptrace, it should be possible to obtain most of the data that you're interested in.
On Windows, you'll probably want to look into Detours in order to intercept/redirect Winsock API calls for inspection/benchmarking purposes.
If you don't really need all that much low-level information, you can probably also use strace directly (on Linux) and have it trace only certain system calls. For example, consider the following line, which would trace only calls to the open syscall (using the additional -o FILE parameter, you can redirect all output to an output file):
strace -e trace=open -o results.log COMMAND
By passing an additional -v flag to strace, you can increase its verbosity to get additional information (when working with SCMs like git that are composed of many smaller shell utilities and standalone tools, you'll probably also want to look into using the -f flag in order to also follow forked processes).
So what you are interested in is all the syscalls related to sockets, namely accept, bind, connect, getpeername, getsockname, getsockopt, listen, recv, recvfrom, send, sendto, setsockopt, shutdown, socket and socketpair
(in the beginning, you'll probably only want to look into the send.../recv... calls, though).
To simplify this, you can also use "network" as the parameter to trace (-e trace=network), which traces all network-related calls.
So, a corresponding strace invocation could look like this:
strace -v -e trace=accept,bind,connect,getpeername,getsockname,getsockopt,listen,recv,recvfrom,send,sendto,setsockopt,shutdown,socket,socketpair -o results.log -f git pull
When the program has finished running, you'll mainly want to examine the log file to evaluate the data; this can easily be done using regular expressions.
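As an illustration of that kind of log evaluation, the sketch below sums the return values of send-type and recv-type calls and counts connect attempts. It assumes the usual one-call-per-line strace format ending in `= RETURN_VALUE`. The `SAMPLE_LOG` lines are hand-written stand-ins, not real strace output, and the function name `summarize` is hypothetical. Lines split by strace into "unfinished"/"resumed" halves are simply skipped.

```python
import re
from collections import Counter

SEND_CALLS = {"send", "sendto", "sendmsg"}
RECV_CALLS = {"recv", "recvfrom", "recvmsg"}

LINE_RE = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?"             # optional pid prefix (present with -f)
    r"(?P<name>\w+)\(.*\)\s*=\s*(?P<ret>-?\d+)"  # syscall(args) = return value
)


def summarize(log_text):
    """Return byte and call counts gathered from strace log text."""
    totals = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        name, ret = match.group("name"), int(match.group("ret"))
        if name == "connect":
            totals["connects"] += 1  # count attempts, whatever the result
        if ret < 0:
            continue  # failed call: nothing was transferred
        if name in SEND_CALLS:
            totals["bytes_sent"] += ret
            totals["writes"] += 1
        elif name in RECV_CALLS:
            totals["bytes_received"] += ret
            totals["reads"] += 1
    return totals


# Hand-written sample lines (an assumption, for illustration only).
SAMPLE_LOG = r'''connect(3, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("192.0.2.1")}, 16) = 0
sendto(3, "GET / HTTP/1.0\r\n"..., 120, 0, NULL, 0) = 120
recvfrom(3, "HTTP/1.0 200 OK\r\n"..., 512, 0, NULL, NULL) = 512
recvfrom(3, 0x7ffd0000, 512, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(3, "", 512, 0, NULL, NULL) = 0
'''

print(dict(summarize(SAMPLE_LOG)))
```

Note that a program that talks to its sockets with plain read/write calls would not show up in these totals; handling that would require tracing those calls as well and filtering by file descriptor.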
For example, when running the following in a Linux shell:
strace -v -o wget.log -e trace=connect,recv,recvfrom,send,sendto wget http://www.google.com
The resulting log file contains messages like these:
Looking at the man pages for these two system calls, it's clear that 511 and 20, respectively, are the numbers of bytes transferred. If you also need detailed timing information, you can pass the -T flag to strace, which shows the time spent in each system call.
In addition, you can get some statistics by passing the -c flag, which prints a summary of call counts, errors and time for each system call.
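If you also need to examine the actual data processed, you may want to look into the read/write specifiers (-e read=... and -e write=...), which dump the data read from or written to the given file descriptors.
You can also customize the maximum length of printed strings with the -s option.
Or have strings dumped as hex with the -x (non-ASCII strings only) or -xx (all strings) option.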
So using strace for much of this seems like a good hybrid approach: it is very easy to do, yet a good amount of low-level information is still available. If you find that you need additional low-level information, you may want to consider extending strace, or filing corresponding feature requests with the strace project on SourceForge.
However, thinking some more about it, a less involved and more platform-agnostic way of implementing a fairly simple network traffic benchmark would be to use some form of intermediate layer between the client and the actual server: a server that basically meters and analyzes the traffic and redirects it to the real server.
This is pretty much like a proxy server (e.g. SOCKS): all traffic is tunneled through your analyzer, which can in turn accumulate statistics and other metrics.
A basic version of something like this could probably be put together easily using just netcat and some shell scripts; more complex versions may, however, benefit from using Perl or Python instead.
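To make the intermediate-layer idea concrete, here is a minimal sketch of such a metering relay in Python, using only the standard library. It is a plain TCP forwarder rather than a SOCKS server, and the class name `MeteringProxy` and its attributes are hypothetical. It counts connections and the bytes passing in each direction, and does no protocol analysis.

```python
import socket
import threading


class MeteringProxy:
    """Relay TCP connections to a target server and count bytes each way."""

    def __init__(self, target_host, target_port, listen_host="127.0.0.1", listen_port=0):
        self.target = (target_host, target_port)
        self.bytes_to_server = 0
        self.bytes_to_client = 0
        self.connections = 0
        self._lock = threading.Lock()
        self._listener = socket.create_server((listen_host, listen_port))
        self.port = self._listener.getsockname()[1]  # actual port when listen_port=0

    def serve_forever(self):
        """Accept clients until the listening socket fails or is closed."""
        while True:
            try:
                client, _ = self._listener.accept()
            except OSError:
                return
            with self._lock:
                self.connections += 1
            threading.Thread(target=self._handle, args=(client,), daemon=True).start()

    def _handle(self, client):
        try:
            server = socket.create_connection(self.target)
        except OSError:
            client.close()
            return
        upstream = threading.Thread(
            target=self._pump, args=(client, server, "bytes_to_server"), daemon=True
        )
        upstream.start()
        self._pump(server, client, "bytes_to_client")
        upstream.join()
        client.close()
        server.close()

    def _pump(self, src, dst, counter):
        """Copy data from src to dst until end of stream, metering as it goes."""
        try:
            while True:
                data = src.recv(65536)
                if not data:
                    break
                with self._lock:
                    setattr(self, counter, getattr(self, counter) + len(data))
                dst.sendall(data)
        except OSError:
            pass
        try:
            dst.shutdown(socket.SHUT_WR)  # pass the end of stream on to the other side
        except OSError:
            pass
```

To use it, one would start `serve_forever` in a background thread, point the client at `proxy.port` on the local machine instead of at the real server, and read the counters once the operation has finished. Since everything can run on the loopback interface, this also fits the one-computer setup asked for in the question.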
For a Python implementation of a SOCKS server, you may want to look into pysocks.
Also, there's of course Twisted for Python.
If you do need lower-level information, though, you'll probably really want to look into intercepting system calls.
If you also need protocol-specific efficiency data, you might want to look into tcpdump.