How can I visualize the behavior of many concurrent multi-stage processes?
Suppose I've got a ton (a continuous stream) of requests to process, and each request has several stages. For example: "connecting to data source", "reading data from data source", "validating data", "processing data", "connecting to data sink", "writing result to data sink".
Which visualization methods, or even tools, are well suited to showing the behavior of such a system?
I'd like to be able to see which stages take a long time, and how the stages of different requests line up against each other (for example, to see that the data source takes longer to respond when too many requests hit it at once).
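One way to check the "slower under load" suspicion before drawing anything is to group the duration of one stage by how many instances of that stage were in flight when each one started. The sketch below is only an illustration: the function name `duration_by_concurrency` and the `(start, end)` tuple format are assumptions, and the quadratic scan is fine for a sample of requests but not for millions.

```python
from collections import defaultdict
from statistics import mean

def duration_by_concurrency(intervals):
    """Group stage durations by concurrency level at the moment each one starts.

    intervals: list of (start, end) pairs for ONE stage, e.g. "reading data
    from data source", taken from all requests (hypothetical input format).
    Returns {concurrency_level: mean_duration}.
    """
    by_level = defaultdict(list)
    for start, end in intervals:
        # Count the instances (including this one) that are active at `start`.
        level = sum(1 for s, e in intervals if s <= start < e)
        by_level[level].append(end - start)
    return {level: mean(durations) for level, durations in sorted(by_level.items())}

reads = [(0, 1), (10, 12), (10.5, 13), (11, 15)]
print(duration_by_concurrency(reads))
```

If the mean duration climbs with the concurrency level, a scatter plot or box plot of duration against level is likely to show the effect more clearly than thousands of individual timelines.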
If there were just a few dozen requests, I'd be OK with a few dozen individual colored timelines, but that doesn't scale to a few thousand. I think I can get away with N colored timelines, where N is the "concurrency factor" (the maximum number of requests in flight at once), but 1) perhaps there's something better, and 2) perhaps tools for this already exist?
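The "N colored timelines" idea amounts to packing request intervals into lanes so that no two requests in a lane overlap. A minimal sketch of one way to do that, using greedy interval partitioning, follows; `assign_lanes` and the `(start, end)` input format are hypothetical names chosen for illustration.

```python
import heapq

def assign_lanes(requests):
    """Pack (start, end) request intervals into non-overlapping lanes.

    Returns (lanes, lane_count), where lanes[i] is the lane index given to
    requests[i]. Processing requests in start order and reusing freed lanes
    makes lane_count equal to the peak number of concurrent requests.
    """
    order = sorted(range(len(requests)), key=lambda i: requests[i][0])
    free = []   # min-heap of lane indices that are currently free
    busy = []   # min-heap of (end_time, lane) for occupied lanes
    lanes = [None] * len(requests)
    lane_count = 0
    for i in order:
        start, end = requests[i]
        # Release every lane whose request finished by the time this one starts.
        while busy and busy[0][0] <= start:
            _, lane = heapq.heappop(busy)
            heapq.heappush(free, lane)
        if free:
            lane = heapq.heappop(free)
        else:
            lane = lane_count
            lane_count += 1
        lanes[i] = lane
        heapq.heappush(busy, (end, lane))
    return lanes, lane_count

lanes, n = assign_lanes([(0, 5), (1, 3), (3, 6), (5, 8), (6, 7)])
print(lanes, n)
```

Each lane can then be drawn as one horizontal timeline, with the stages of each request colored within its interval.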
P.S. Shameless plug: once I figure out the best way to visualize this, I'll add it to my nifty tool called timeplot ;)
P.P.S. Another shameless plug: I decided to write a separate tool, splot. Here's what it can do, based on a trivially simple log and an awk one-liner:
The plot shows a cluster of 160 cores performing tasks fed to them by RabbitMQ. Blue is "fetching data", orange is "computing", white is "doing nothing". Several problems are immediately obvious from this diagram that would be very hard to find just by looking at the logs.
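For readers who want to build a similar per-core picture themselves, the step of turning a simple state-change log into colored intervals can be sketched as follows. This is not splot's actual input format or implementation; the `(time, worker, state)` event tuples and the function name `state_intervals` are assumptions made for illustration.

```python
from collections import defaultdict

def state_intervals(events):
    """Convert state-change events into per-worker intervals.

    events: iterable of (time, worker, state), sorted by time
            (hypothetical log format, one event per state change).
    Returns {worker: [(start, end, state), ...]}. The last state of each
    worker is omitted because its end time is not known yet.
    """
    current = {}                 # worker -> (start_time, state)
    result = defaultdict(list)
    for t, worker, state in events:
        if worker in current:
            start, prev = current[worker]
            if prev == state:
                continue         # repeated event, state did not change
            result[worker].append((start, t, prev))
        current[worker] = (t, state)
    return dict(result)

log = [
    (0, "core1", "fetching"),
    (0, "core2", "idle"),
    (2, "core1", "computing"),
    (3, "core2", "fetching"),
    (7, "core1", "idle"),
]
print(state_intervals(log))
```

Drawing one row per worker and one colored bar per interval gives a picture of the kind described above, with idle gaps showing up as blank stretches.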
Comments (1)
I have multi-process software that runs on a machine with 15 cores. Here is what I do.
Log all messages to syslog, then plot the last 20 minutes of (selected) log data with SIMILE Timeline: http://www.simile-widgets.org/timeline. To keep an eye on what is being logged, when, and in what patterns, I use a syslog viewer. There are plenty of them, so you can pick the one that suits you: http://www.google.com/search?aq=0&oq=syslog+vi&sourceid=chrome&ie=UTF-8&q=syslog+viewer
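The "last 20 minutes" selection can be sketched as a simple time-window filter. This assumes the syslog lines have already been parsed into `(timestamp, message)` pairs; the parsing itself, the function name `recent_events`, and the sample messages are hypothetical, and the result would still need to be converted into whatever event format the plotting tool expects.

```python
from datetime import datetime, timedelta

def recent_events(events, now, window=timedelta(minutes=20)):
    """Keep only the events whose timestamp falls within `window` before `now`.

    events: list of (timestamp, message) pairs, timestamps as datetime objects
            (assumed to be parsed from the log beforehand).
    """
    cutoff = now - window
    return [(ts, msg) for ts, msg in events if cutoff <= ts <= now]

now = datetime(2024, 1, 1, 12, 0, 0)
events = [
    (datetime(2024, 1, 1, 11, 30, 0), "worker 3: connecting to data source"),
    (datetime(2024, 1, 1, 11, 45, 0), "worker 3: reading data"),
    (datetime(2024, 1, 1, 11, 59, 0), "worker 7: writing result"),
]
for ts, msg in recent_events(events, now):
    print(ts.isoformat(), msg)
```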
Hope this helps.