Calculating the throughput of a web service download
I have a client-server application which exchanges XML documents for data requested by the client. Essentially, the user enters some search constraints (attributes to match) and the client communicates with two systems to get data back (some data from a database and some data from file servers).
The data returned from the file servers (files of archived data) is quite a bit bigger than the metadata returned from the server, and correspondingly takes more time to retrieve.
The users have asked me to provide some metrics on how long it takes to download the archive data and the rate at which it is being downloaded (after the download).
The client and server communicate using asynchronous I/O and numerous threads, so I cannot just use a Start/Stop timer to accomplish this.
My current implementation works as follows:
- Record the current Ticks (this is a long running process so tick resolution is fine)
- Hand off the request to the web service asynchronously.
- --- Wait ---
- Get the current ticks
- Get the size of the document returned (there is some overhead not accounted for from the SOAP envelope, but this is ok, I think)
- Rate = (Document Size / 1024) / (End Ticks - Start Ticks) * Ticks/Second (I let a timespan object do this)
At first I thought this method was ok, but I have users reporting that the rate is much lower for small samples than it is for large samples, and that the rates vary a great deal over a single execution.
Is there a better way to calculate this rate that would be more immune to this? It makes sense that the rate will be greater for larger archives, but in testing I see it being 10-40x higher than for a file half the size, which doesn't make sense.
Comments (1)
The throughput as measured in the question assumes the transfer time is homogeneous, that is, that the whole measured interval is spent moving data. It is not. There is a setup cost at the beginning of the session that includes the TCP 3-way handshake and the server time required to produce the result. Once setup is complete, the rest is dominated mostly by the network throughput.
For large payloads, the setup time is a tiny fraction of the overall transfer time, and hence the calculated throughput approximates what you'd expect. For small payloads, the time measured is mostly setup time! As a result, the computed throughput could be off by orders of magnitude.
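To make the effect concrete, here is a minimal sketch in Python (used purely for illustration; the question's application is not written in it). It models the measured time as a fixed setup time plus size divided by bandwidth. The function name and the figures of 0.5 s setup and a 1 MiB/s link are assumptions chosen for the example, not measurements.

```python
def measured_rate_kb_s(size_bytes, setup_s, bandwidth_bytes_s):
    """Rate that a start-to-finish timer would report, in KB/s.

    The timer sees setup time plus pure transfer time, so the
    reported rate is always below the real link bandwidth.
    """
    transfer_s = size_bytes / bandwidth_bytes_s
    return (size_bytes / 1024) / (setup_s + transfer_s)


# Assumed figures, for illustration only.
SETUP_S = 0.5                # handshake + server processing time
BANDWIDTH = 1024 * 1024      # 1 MiB/s link

for size in (10 * 1024, 1024 * 1024, 100 * 1024 * 1024):
    rate = measured_rate_kb_s(size, SETUP_S, BANDWIDTH)
    print(f"{size / 1024:>9.0f} KB -> {rate:8.1f} KB/s")
```

Under these assumed numbers the 10 KB payload reports roughly 20 KB/s, while the 100 MB payload reports close to the full 1024 KB/s, even though the link speed is identical in both cases.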
What can you do? Find a way to drop the setup components from the equation.
If you can get a notification when data starts arriving, you could start the tick count there. This should work for all but the shortest responses (where the content fits within a single network packet).
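A rough sketch of that idea in Python, assuming the response can be read as a sequence of chunks (the function name `rate_after_first_byte` and the injectable `clock` parameter are hypothetical, introduced here for illustration). The timer starts when the first chunk arrives, and the bytes of that first chunk are left out of the total because they were delivered before the timer started.

```python
import time


def rate_after_first_byte(chunks, clock=time.monotonic):
    """Return KB/s measured from the arrival of the first chunk.

    `chunks` is any iterable that yields bytes objects as they arrive.
    Returns None when there is nothing to measure, for example when
    the whole response arrived in a single chunk.
    """
    start = None
    end = None
    counted = 0
    for chunk in chunks:
        now = clock()
        if start is None:
            # First data seen: setup is over, so start timing here.
            # These bytes arrived before the timer started; skip them.
            start = now
            continue
        counted += len(chunk)
        end = now
    if start is None or end is None or counted == 0:
        return None
    elapsed = end - start
    if elapsed <= 0:
        return None
    return (counted / 1024) / elapsed
```

The `None` result corresponds to the single-packet case mentioned above: with only one chunk there is no interval left to time.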
Alternatively, have the server attach a timestamp to the response just before sending it. You could use that as the start time, taking care to adjust for any clock differences between the machines.
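The clock adjustment could look something like the sketch below. It assumes the client can make one small probe request in which both sides record send and receive times, and it uses the usual NTP-style estimate, which in turn assumes network delay is about the same in each direction. Both function names and all parameters are hypothetical.

```python
def estimate_clock_offset(t0, t1, t2, t3):
    """NTP-style estimate of (server clock - client clock), in seconds.

    t0: client send time      (client clock)
    t1: server receive time   (server clock)
    t2: server send time      (server clock)
    t3: client receive time   (client clock)
    Assumes the one-way delay is roughly symmetric.
    """
    return ((t1 - t0) + (t2 - t3)) / 2


def rate_from_server_timestamp(size_bytes, server_sent, client_received, offset):
    """KB/s using the server's send timestamp as the start time.

    `offset` is (server clock - client clock); subtracting it converts
    the server timestamp onto the client's clock.
    """
    start_on_client_clock = server_sent - offset
    elapsed = client_received - start_on_client_clock
    if elapsed <= 0:
        raise ValueError("non-positive transfer time; check the clock offset")
    return (size_bytes / 1024) / elapsed
```

Any error in the offset estimate goes straight into the elapsed time, so this approach is only as good as the clock synchronization for short transfers.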