
Logging levels - Logback - rule of thumb for assigning log levels

Posted 2024-12-10 18:41:26

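I am using logback in my current project.

It offers six levels of logging: TRACE, DEBUG, INFO, WARN, ERROR, OFF.

I am looking for a rule of thumb to determine the log level for common activities. For instance, if a thread is locked, should the log message be at the debug level or the info level? Or, if a socket is being used, should its specific ID be logged at the debug level or the trace level?

I would appreciate answers that give more examples for each logging level.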


琉璃繁缕 2024-12-17 18:41:26

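I mostly build large-scale, high-availability systems, so my answer is biased towards a production support standpoint. That said, we assign levels roughly as follows:

error: the system is in distress, customers are probably being affected (or soon will be), and the fix probably requires human intervention. The "2AM rule" applies here: if you are on call, do you want to be woken up at 2AM when this condition happens? If yes, log it as "error".
warn: an unexpected technical or business event happened. Customers may be affected, but immediate human intervention is probably not required. On-call people won't be paged right away, but support personnel will want to review these issues as soon as possible to understand the impact. Basically, any issue that needs to be tracked but may not require immediate intervention.
info: things we want to see at high volume in case we need to forensically analyze an issue. System lifecycle events (system start, stop) go here. "Session" lifecycle events (login, logout, etc.) go here. Significant boundary events should be considered as well (e.g. database calls, remote API calls). Typical business exceptions can go here (e.g. login failed due to bad credentials). Any other event you think you will need to see in production at high volume goes here.
debug: just about everything that doesn't make the "info" cut: any message that helps track the flow through the system and isolate issues, especially during the development and QA phases. We use "debug" level logs for entry/exit of most non-trivial methods and for marking interesting events and decision points inside methods.
trace: we don't use this often, but it is for extremely detailed and potentially high-volume logs that you typically don't want enabled even during normal development. Examples include dumping a full object hierarchy or logging some state on every iteration of a large loop.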
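The list above can be read as a decision chain, checked from the most severe level downwards. The following is only a sketch of that reading: the `Severity` enum, the `LevelRuleOfThumb` class and the four boolean questions are hypothetical names invented for illustration, not part of logback or of the answer itself.

```java
// Hypothetical sketch: the rule of thumb as a decision chain.
// None of these names come from logback; they only illustrate the ordering of the questions.
enum Severity { TRACE, DEBUG, INFO, WARN, ERROR }

final class LevelRuleOfThumb {
    private LevelRuleOfThumb() {}

    static Severity choose(boolean pageOnCallAt2am,
                           boolean needsReviewSoon,
                           boolean wantedInProduction,
                           boolean tooNoisyForDevelopment) {
        if (pageOnCallAt2am) return Severity.ERROR;        // the "2AM rule"
        if (needsReviewSoon) return Severity.WARN;         // track it, but do not page anyone
        if (wantedInProduction) return Severity.INFO;      // lifecycle, session, boundary events
        if (tooNoisyForDevelopment) return Severity.TRACE; // object dumps, per-iteration state
        return Severity.DEBUG;                             // everything else that helps follow the flow
    }

    public static void main(String[] args) {
        System.out.println(choose(true, true, true, false));    // ERROR
        System.out.println(choose(false, true, true, false));   // WARN
        System.out.println(choose(false, false, true, false));  // INFO
        System.out.println(choose(false, false, false, false)); // DEBUG
        System.out.println(choose(false, false, false, true));  // TRACE
    }
}
```

In practice these questions are answered by a person while writing the log statement, not at runtime; the code only makes the priority of the questions explicit.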

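Just as important as choosing the right log level, if not more so, is making sure the logs are meaningful and carry the needed context. For example, you will almost always want to include the thread ID in the logs so that you can follow a single thread if needed. You may also want to employ a mechanism that associates business information (e.g. user ID) with the thread so that it gets logged as well. In the log message itself, include enough information to make the message actionable. A log like "FileNotFound exception caught" is not very helpful. A better message is "FileNotFound exception caught while attempting to open config file: /usr/local/app/somefile.txt. userId=12344."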
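A minimal sketch of the "associate business info with the thread" idea, using only the JDK so that it runs without a logging library. The `LogContextSketch` class and its methods are hypothetical stand-ins. In a logback setup this role is normally played by SLF4J's `MDC` together with the `%thread` and `%X{key}` pattern converters, which this sketch does not attempt to reproduce.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a per-thread diagnostic context.
final class LogContextSketch {
    // Each thread gets its own map, so values set for one request do not leak into another thread.
    private static final ThreadLocal<Map<String, String>> CONTEXT =
            ThreadLocal.withInitial(TreeMap::new);

    private LogContextSketch() {}

    static void put(String key, String value) {
        CONTEXT.get().put(key, value);
    }

    static void clear() {
        CONTEXT.remove();
    }

    // Builds "LEVEL [thread-name] message key=value ..." for the current thread.
    static String format(String level, String message) {
        StringBuilder line = new StringBuilder();
        line.append(level)
            .append(" [").append(Thread.currentThread().getName()).append("] ")
            .append(message);
        for (Map.Entry<String, String> entry : CONTEXT.get().entrySet()) {
            line.append(' ').append(entry.getKey()).append('=').append(entry.getValue());
        }
        return line.toString();
    }

    public static void main(String[] args) {
        put("userId", "12344");
        try {
            System.out.println(format("ERROR",
                    "FileNotFound exception caught while attempting to open config file: "
                            + "/usr/local/app/somefile.txt."));
        } finally {
            clear(); // always clear, otherwise a pooled thread carries stale context
        }
    }
}
```

Clearing the context in a `finally` block matters when threads are reused from a pool; otherwise the next unit of work on that thread would be logged with the previous user's ID.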

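There are also a number of good logging guides out there. For example, here is an edited snippet from JCL (Jakarta Commons Logging):

error - Other runtime errors or unexpected conditions. Expect these to be immediately visible on a status console.
warn - Use of deprecated APIs, poor use of API, 'almost' errors, other runtime situations that are undesirable or unexpected, but not necessarily "wrong". Expect these to be immediately visible on a status console.
info - Interesting runtime events (startup/shutdown). Expect these to be immediately visible on a console, so be conservative and keep to a minimum.
debug - Detailed information on the flow through the system. Expect these to be written to logs only.
trace - More detailed information. Expect these to be written to logs only.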



难以启齿的温柔 2024-12-17 18:41:26

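I approach this more from a development than an operations point of view. My approach is:

Error means that the execution of some task could not be completed: an email couldn't be sent, a page couldn't be rendered, some data couldn't be stored to a database, and so on. Something has definitely gone wrong.
Warning means that something unexpected happened, but execution can continue, perhaps in a degraded mode: a configuration file was missing but defaults were used, a price was calculated as negative so it was clamped to zero, etc. Something is not right, but it hasn't gone properly wrong yet. Warnings are often a sign that there will be an error very soon.
Info means that something normal but significant happened: the system started, the system stopped, the daily inventory update job ran, etc. There shouldn't be a continual torrent of these, otherwise there is simply too much to read.
Debug means that something normal and insignificant happened: a new user came to the site, a page was rendered, an order was taken, a price was updated. This is the stuff excluded from info because there would be too much of it.
Trace is something I have never actually used.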
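The warning/error distinction above (execution continues in a degraded way versus a task that could not be completed) can be sketched as follows. `PriceClampSketch`, its methods, and the in-memory `LOG` list are hypothetical; the list stands in for a real logger so the example runs on the JDK alone.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of "warn = continued in degraded mode, error = task not completed".
final class PriceClampSketch {
    // In-memory stand-in for a logger; each entry starts with the chosen level.
    static final List<String> LOG = new ArrayList<>();

    private PriceClampSketch() {}

    static long clampPriceCents(long computedCents) {
        if (computedCents < 0) {
            // Unexpected, but we can carry on with a corrected value: warning.
            LOG.add("WARN computed price was negative (" + computedCents + " cents), clamped to 0");
            return 0;
        }
        // Normal and insignificant: debug.
        LOG.add("DEBUG price updated to " + computedCents + " cents");
        return computedCents;
    }

    static boolean storeOrder(boolean databaseAvailable) {
        if (!databaseAvailable) {
            // The task itself could not be completed: error.
            LOG.add("ERROR could not store order: database unavailable");
            return false;
        }
        LOG.add("DEBUG order stored");
        return true;
    }

    public static void main(String[] args) {
        clampPriceCents(-250);
        clampPriceCents(1999);
        storeOrder(false);
        LOG.forEach(System.out::println);
    }
}
```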


背叛残局 2024-12-17 18:41:26

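This may also help tangentially: understanding whether a logging request (from the code) at a certain level will actually be logged, given the effective logging level that a deployment is configured with. Decide from the other answers here which effective level you want to configure your deployment with, then refer to this to see whether a particular logging request in your code will actually be logged.

For example:

"Will a logging code line that logs at WARN actually get logged on a deployment configured with ERROR?" The table says NO.
"Will a logging code line that logs at WARN actually get logged on a deployment configured with DEBUG?" The table says YES.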

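From the logback documentation:

In a more graphic way, here is how the selection rule works. In the following table, the vertical header shows the level of the logging request, designated by p, while the horizontal header shows the effective level of the logger, designated by q. The intersection of a row (level request) and a column (effective level) is the boolean resulting from the basic selection rule.

p \ q    TRACE  DEBUG  INFO  WARN  ERROR  OFF
TRACE    YES    NO     NO    NO    NO     NO
DEBUG    YES    YES    NO    NO    NO     NO
INFO     YES    YES    YES   NO    NO     NO
WARN     YES    YES    YES   YES   NO     NO
ERROR    YES    YES    YES   YES   YES    NO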

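So a line of code that requests logging will only actually be logged if the effective logging level of its deployment is less than or equal to the severity level requested by that line.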
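The basic selection rule (a request at level p is enabled when p >= q, with TRACE < DEBUG < INFO < WARN < ERROR) can be sketched in a few lines. `LogLevel` and `SelectionRuleSketch` are hypothetical names for illustration and are not logback classes; OFF is included only because it is meaningful as an effective level.

```java
// Hypothetical sketch of the basic selection rule; not logback's own implementation.
// The constants are declared in ascending order, so compareTo() reflects severity.
enum LogLevel { TRACE, DEBUG, INFO, WARN, ERROR, OFF }

final class SelectionRuleSketch {
    private SelectionRuleSketch() {}

    // A request at level p is enabled for effective level q when p >= q.
    static boolean isEnabled(LogLevel request, LogLevel effective) {
        return request.compareTo(effective) >= 0;
    }

    public static void main(String[] args) {
        // WARN request on a deployment configured with ERROR
        System.out.println(isEnabled(LogLevel.WARN, LogLevel.ERROR)); // false
        // WARN request on a deployment configured with DEBUG
        System.out.println(isEnabled(LogLevel.WARN, LogLevel.DEBUG)); // true
    }
}
```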


秋意浓 2024-12-17 18:41:26

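I am answering this from the perspective of a component-based architecture, where an organisation may be running many components that rely on each other. During a propagating failure, logging levels should help identify both which components are affected and which one is the root cause.

ERROR - This component has had a failure and the cause is believed to be internal (any internal, unhandled exception, or a failure of an encapsulated dependency such as a database; a REST example would be receiving a 4xx error from a dependency). Get me (the maintainer of this component) out of bed.
WARN - This component has had a failure that is believed to be caused by a dependent component (a REST example would be a 5xx status from a dependency). Get the maintainers of THAT component out of bed.
INFO - Anything else that we want to get to an operator. If you decide to log happy paths, I recommend limiting it to one log message per significant operation (e.g. per incoming HTTP request).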
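The REST examples above translate into a small classification of a dependency's HTTP status. This is a sketch under assumptions: `ComponentLevel` and `DependencyFailureSketch` are hypothetical names, and mapping every other status to INFO is my reading of "anything else", not something the answer states.

```java
// Hypothetical sketch: choosing a level from the HTTP status returned by a dependency.
enum ComponentLevel { INFO, WARN, ERROR }

final class DependencyFailureSketch {
    private DependencyFailureSketch() {}

    static ComponentLevel levelFor(int httpStatus) {
        if (httpStatus >= 500 && httpStatus <= 599) {
            return ComponentLevel.WARN;  // the dependency failed: its maintainers should look
        }
        if (httpStatus >= 400 && httpStatus <= 499) {
            return ComponentLevel.ERROR; // we sent a bad request: the cause is internal to this component
        }
        return ComponentLevel.INFO;      // assumption: everything else is at most operator information
    }

    public static void main(String[] args) {
        System.out.println(levelFor(404)); // ERROR
        System.out.println(levelFor(503)); // WARN
        System.out.println(levelFor(200)); // INFO
    }
}
```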

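For all log messages, be sure to log useful context (and prioritise making messages human-readable and useful over packing them with reams of "error codes").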