How to debug a large server-side distributed Java application

Posted on 2024-10-02 17:14:03

Here is my problem: I am trying to debug Apache Cassandra and understand the flow of the application, i.e. when a client sends a request, say put(), which methods are called and how the system works internally.

So, here is what I am thinking:

  1. Write a main method in the Cassandra code that calls the entry-point put() method, set breakpoints in Eclipse, and so on, OR
  2. Don't write a main method; simply use a regular client (which accesses the server via TCP) and "debug" using the log4j loggers already implemented in Cassandra, by reading the log files and understanding the code.

So, my question is: what is the ideal way of debugging such a distributed application?

Comments (4)

虚拟世界 2024-10-09 17:14:03

Ideal way? Both, and more.

You mentioned two objectives: "debug" and "understand the flow of the application". It is very hard to debug before you understand the flow, but understanding may be an end in itself.

In the real world, when dealing with large distributed systems, one often cannot rely on debuggers, at least initially, not least because some problems only show up when the system is busy or after hours of running. Hence good debug trace, and fine-grained control over that trace, in both the application code and the infrastructure code is essential.

However, if you have the opportunity to run in a debugger, that can be quite illuminating.

Before all of that I think you need to:

a). Study any design documentation that there may be.

b). Browse the source code in a good IDE, e.g. Eclipse. Just follow the control flow. Hmm, here's an interesting bit; where does it get called from? There's a call to that method on a class; what does that do? When does that constructor get called?

With some of that in your head, following the trace is much easier, and you have a better idea of where to put the breakpoints.
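As a rough illustration of "fine-grained control over that trace", here is a minimal sketch. It uses java.util.logging from the JDK instead of log4j so that it runs without extra dependencies, and the logger names demo.storage and demo.network are hypothetical, not Cassandra's real packages. The idea is to turn on detailed trace for the component you are studying while keeping the others quiet.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

class TraceControlDemo {
    // Hypothetical logger hierarchy; real names would follow the code base's packages.
    static final Logger ROOT = Logger.getLogger("demo");
    static final Logger STORAGE = Logger.getLogger("demo.storage");
    static final Logger NETWORK = Logger.getLogger("demo.network");

    // Collects records in memory so the effect of the levels is easy to inspect.
    static class CollectingHandler extends Handler {
        final List<String> lines = new ArrayList<>();

        @Override public void publish(LogRecord r) {
            lines.add(r.getLoggerName() + " " + r.getLevel() + " " + r.getMessage());
        }
        @Override public void flush() { }
        @Override public void close() { }
    }

    static List<String> run() {
        CollectingHandler handler = new CollectingHandler();
        ROOT.setUseParentHandlers(false); // keep the demo output off the console handler
        ROOT.addHandler(handler);

        // Fine-grained control: verbose trace for storage, warnings only for network.
        STORAGE.setLevel(Level.FINE);
        NETWORK.setLevel(Level.WARNING);

        STORAGE.fine("applying mutation");   // recorded
        NETWORK.fine("sending message");     // suppressed by the WARNING level
        NETWORK.warning("peer timed out");   // recorded

        ROOT.removeHandler(handler);
        return handler.lines;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

With log4j the same effect is usually achieved through per-logger levels in the configuration file, which has the advantage that the trace can be adjusted without touching the code.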

黑寡妇 2024-10-09 17:14:03

How about using log4j's MDC, setting it right before put() and clearing it after put() has exited? Then you can see what really happened in there, provided you have other logging set up in the methods that are executed inside put(). If you are somewhere deep inside that method, log the stack trace now and then, so you can see where you currently are.

Disclaimer: My debugging priority list goes like this:

  1. examine stack trace
  2. examine log files
  3. use a debugger

So, if 1. and 2. don't give me an answer, I will resort to a debugger.

In a distributed app like this, using a debugger sounds like a last resort.
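To make the MDC idea concrete, here is a minimal sketch of the pattern. So that it runs without log4j on the classpath, it imitates the MDC with a plain ThreadLocal map; in log4j 1.x the corresponding calls would be MDC.put(key, value) and MDC.remove(key). The put() and store() methods, the "requestId" key, and the in-memory log are all hypothetical stand-ins, not Cassandra code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MdcSketch {
    // Stand-in for log4j's MDC: a per-thread map of diagnostic values.
    static final ThreadLocal<Map<String, String>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    // In-memory "log file" so the result is easy to inspect.
    static final List<String> LOG = new ArrayList<>();

    // Every log line is prefixed with the current request id, or "-" if none is set.
    static void log(String message) {
        String requestId = CONTEXT.get().getOrDefault("requestId", "-");
        LOG.add("[" + requestId + "] " + message);
    }

    // Set the context right before put() and clear it once put() has exited.
    static void tracedPut(String requestId, String key, String value) {
        CONTEXT.get().put("requestId", requestId);   // analogous to MDC.put(...)
        try {
            put(key, value);
        } finally {
            CONTEXT.get().remove("requestId");       // analogous to MDC.remove(...)
        }
    }

    // Hypothetical entry point; stands in for the method being traced.
    static void put(String key, String value) {
        log("put " + key);
        store(key, value);
    }

    // Deeper in the call chain: record where we were called from.
    static void store(String key, String value) {
        // Frame 0 is store() itself; frame 1, if present, is its caller.
        StackTraceElement[] stack = new Throwable().getStackTrace();
        String caller = stack.length > 1 ? stack[1].getMethodName() : "unknown";
        log("store " + key + " (called from " + caller + ")");
    }

    public static void main(String[] args) {
        tracedPut("req-42", "k", "v");
        log("idle");
        LOG.forEach(System.out::println);
    }
}
```

Because the context is per thread, every line logged while put() is running carries the same request id, which makes it possible to pick one request out of an otherwise interleaved log file.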

把时间冻结 2024-10-09 17:14:03

Using logging in a distributed application is indeed one of the best ways to figure out what actually happens on a wider scale and how things interact. But you will eventually face a problem with log files: distributed systems can generate lots of them, in various formats and locations. So if you want to use log4j (or similar) for this, you should aggregate the logs into one place and then study them. This tool might help: it allows not only persisted aggregation but also real-time monitoring of the aggregated log stream from various sources. For example, you can focus on the data layer of a particular host (or range of hosts) and observe in real time what is going on. Alternatively, you can fetch the logs of a particular thread on a particular machine, or use the MDC context as already mentioned by the previous poster. I also subscribe to the view that a debugger is useless most of the time in distributed apps, and totally useless in production systems for obvious reasons. Log4j, on the other hand, is incredibly flexible, widely used, and one of the best tools (IMHO) for logging.
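The core of aggregation can be sketched in a few lines. The example below is only an illustration, not the tool mentioned above: it assumes each host's log lines have the hypothetical form "<epoch-millis> <message>", merges them into a single timeline ordered by timestamp, and then narrows the view to one host. The host names and messages are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class LogAggregationSketch {
    // One parsed log line, tagged with the host it came from.
    static final class LogLine {
        final long timestampMillis;
        final String host;
        final String message;

        LogLine(long timestampMillis, String host, String message) {
            this.timestampMillis = timestampMillis;
            this.host = host;
            this.message = message;
        }

        // Assumed line format: "<epoch-millis> <message>".
        static LogLine parse(String host, String line) {
            int space = line.indexOf(' ');
            long ts = Long.parseLong(line.substring(0, space));
            return new LogLine(ts, host, line.substring(space + 1));
        }

        @Override public String toString() {
            return timestampMillis + " " + host + " " + message;
        }
    }

    // Merge the lines of all hosts into one list ordered by timestamp.
    static List<LogLine> merge(Map<String, List<String>> linesByHost) {
        List<LogLine> all = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : linesByHost.entrySet()) {
            for (String line : e.getValue()) {
                all.add(LogLine.parse(e.getKey(), line));
            }
        }
        all.sort(Comparator.comparingLong((LogLine l) -> l.timestampMillis));
        return all;
    }

    // Narrow the merged timeline to a single host.
    static List<LogLine> forHost(List<LogLine> merged, String host) {
        return merged.stream()
                .filter(l -> l.host.equals(host))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, List<String>> logs = new LinkedHashMap<>();
        logs.put("node1", Arrays.asList("1000 received put(k1)", "1030 acked put(k1)"));
        logs.put("node2", Arrays.asList("1010 replicating k1", "1020 wrote k1 to disk"));

        List<LogLine> merged = merge(logs);
        merged.forEach(System.out::println);
        System.out.println("node2 only: " + forHost(merged, "node2"));
    }
}
```

Note that ordering by timestamp is only as trustworthy as the clocks on the hosts; with skewed clocks, a request id carried in the MDC is a more reliable way to correlate lines across machines.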

錯遇了你 2024-10-09 17:14:03

Use logs; increase the log levels if required and add more log statements at the different components of the distributed system. Profile individual components such as the database and the application server, analyze stack traces, and use debugging tools: the browser's built-in tools on the front end if it is a web app, and breakpoints on the back end.
