When to use the different log levels

Posted on 2024-08-16 23:00:06

There are different levels at which to log messages, in decreasing order of severity:

  1. FATAL

  2. ERROR

  3. WARN

  4. INFO

  5. DEBUG

  6. TRACE

How do I decide when to use which?

What's a good heuristic to use?

Comments (23)

一刻暧昧 2024-08-23 23:00:06

I generally subscribe to the following convention:

  • Trace - Only when I would be "tracing" the code and trying to find one part of a function specifically.
  • Debug - Information that is diagnostically helpful to people more than just developers (IT, sysadmins, etc.).
  • Info - Generally useful information to log (service start/stop, configuration assumptions, etc). Info I want to always have available but usually don't care about under normal circumstances. This is my out-of-the-box config level.
  • Warn - Anything that can potentially cause application oddities, but for which I am automatically recovering. (Such as switching from a primary to backup server, retrying an operation, missing secondary data, etc.)
  • Error - Any error which is fatal to the operation, but not the service or application (can't open a required file, missing data, etc.). These errors will force user (administrator, or direct user) intervention. These are usually reserved (in my apps) for incorrect connection strings, missing services, etc.
  • Fatal - Any error that is forcing a shutdown of the service or application to prevent data loss (or further data loss). I reserve these only for the most heinous errors and situations where there is guaranteed to have been data corruption or loss.
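As a rough illustration of the Warn/Error split in this convention, here is a minimal Python sketch. Python's standard `logging` module has no TRACE level, and CRITICAL plays the role of Fatal. The names `fetch_with_failover`, `primary`, `backup` and the logger name are hypothetical, invented only for this example.

```python
import logging

log = logging.getLogger("orders")  # hypothetical logger name


def fetch_with_failover(primary, backup):
    """Call primary(); fall back to backup() if it fails.

    Both arguments are hypothetical zero-argument callables.
    """
    log.debug("fetching from primary source")
    try:
        return primary()
    except ConnectionError as exc:
        # Recovered automatically, so this is a warning, not an error.
        log.warning("primary failed (%s); switching to backup", exc)
    try:
        return backup()
    except ConnectionError as exc:
        # The operation failed, but the application itself keeps running.
        log.error("backup failed too (%s); giving up on this request", exc)
        return None
```

The switch to the backup is logged as a warning because the code recovers by itself; only when both sources fail does the operation become an error.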
若沐 2024-08-23 23:00:06

Would you want the message to get a system administrator out of bed in the middle of the night?

  • yes -> error
  • no -> warn
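One way to encode this heuristic is to attach the paging channel only to ERROR and above. The sketch below assumes Python's standard `logging` module. `PagerHandler` is a hypothetical stand-in that records messages in a list instead of calling a real paging service.

```python
import logging


class PagerHandler(logging.Handler):
    """Hypothetical stand-in for a paging integration: it only records messages."""

    def __init__(self):
        super().__init__(level=logging.ERROR)  # ignore anything below ERROR
        self.pages = []

    def emit(self, record):
        self.pages.append(record.getMessage())


log = logging.getLogger("nightly")  # hypothetical logger name
log.setLevel(logging.INFO)
pager = PagerHandler()
log.addHandler(pager)

log.warning("disk almost full")       # can wait until morning
log.error("payment gateway is down")  # wakes someone up
```

With this setup, choosing between `warning` and `error` at the call site is the same decision as choosing whether someone gets paged.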
ま柒月 2024-08-23 23:00:06

It's an old topic, but still relevant. This week, I wrote a small article about it, for my colleagues. For that purpose, I also created this cheat sheet, because I couldn't find any online.

[Image: cheat sheet, "which log level should I use"]

一指流沙 2024-08-23 23:00:06

I find it more helpful to think about severities from the perspective of viewing the log file.

Fatal/Critical: Overall application or system failure that should be investigated immediately. Yes, wake up the SysAdmin. Since we prefer our SysAdmins alert and well-rested, this severity should be used very infrequently. If it's happening daily and that's not a BFD, it has lost its meaning. Typically, a Fatal error only occurs once in the process lifetime, so if the log file is tied to the process, this is typically the last message in the log.

Error: Definitely a problem that should be investigated. SysAdmin should be notified automatically, but doesn't need to be dragged out of bed. By filtering a log to look at errors and above you get an overview of error frequency and can quickly identify the initiating failure that might have resulted in a cascade of additional errors. Tracking error rates versus application usage can yield useful quality metrics such as MTBF which can be used to assess overall quality. For example, this metric might help inform decisions about whether or not another beta testing cycle is needed before a release.

Warning: This MIGHT be a problem, or might not. For example, expected transient environmental conditions such as short loss of network or database connectivity should be logged as Warnings, not Errors. Viewing a log filtered to show only warnings and errors may give quick insight into early hints at the root cause of a subsequent error. Warnings should be used sparingly so that they don't become meaningless. For example, loss of network access should be a warning or even an error in a server application, but might be just an Info in a desktop app designed for occasionally disconnected laptop users.

Info: This is important information that should be logged under normal conditions such as successful initialization, services starting and stopping or successful completion of significant transactions. Viewing a log showing Info and above should give a quick overview of major state changes in the process providing top-level context for understanding any warnings or errors that also occur. Don't have too many Info messages. We typically have < 5% Info messages relative to Trace.

Trace: Trace is by far the most commonly used severity and should provide context to understand the steps leading up to errors and warnings. Having the right density of Trace messages makes software much more maintainable but requires some diligence because the value of individual Trace statements may change over time as programs evolve. The best way to achieve this is by getting the dev team in the habit of regularly reviewing logs as a standard part of troubleshooting customer reported issues. Encourage the team to prune out Trace messages that no longer provide useful context and to add messages where needed to understand the context of subsequent messages. For example, it is often helpful to log user input such as changing displays or tabs.

Debug: We consider Debug < Trace. The distinction being that Debug messages are compiled out of Release builds. That said, we discourage use of Debug messages. Allowing Debug messages tends to lead to more and more Debug messages being added and none ever removed. In time, this makes log files almost useless because it's too hard to filter signal from noise. That causes devs to not use the logs which continues the death spiral. In contrast, constantly pruning Trace messages encourages devs to use them which results in a virtuous spiral. Also, this eliminates the possibility of bugs introduced because of needed side-effects in debug code that isn't included in the release build. Yeah, I know that shouldn't happen in good code, but better safe than sorry.
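To make the "view the log filtered by severity" idea concrete, here is a small Python sketch. Python has no built-in TRACE level, so the sketch registers one at the numeric value 15, between DEBUG (10) and INFO (20), to mirror the Debug < Trace ordering described above. That value, the `ListHandler` class, the `view` helper and the logger name are all assumptions made for illustration.

```python
import logging

TRACE = 15  # assumed value, chosen to sit between DEBUG (10) and INFO (20)
logging.addLevelName(TRACE, "TRACE")


class ListHandler(logging.Handler):
    """Keeps every record in a list so it can be filtered when viewing."""

    def __init__(self):
        super().__init__(level=TRACE)
        self.records = []

    def emit(self, record):
        self.records.append(record)


def view(records, minimum):
    """Return 'LEVEL message' lines at or above the given severity."""
    return [f"{r.levelname} {r.getMessage()}" for r in records if r.levelno >= minimum]


log = logging.getLogger("editor")  # hypothetical logger name
log.setLevel(TRACE)
memory = ListHandler()
log.addHandler(memory)

log.info("service started")
log.log(TRACE, "user switched to tab %s", "Settings")
log.warning("database connection lost, retrying")
log.log(TRACE, "retry attempt %d", 1)
log.error("could not save document")
```

Calling `view(memory.records, logging.INFO)` gives the overview of major state changes, while `view(memory.records, TRACE)` adds the steps leading up to the warning and the error.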

清音悠歌 2024-08-23 23:00:06

Here's a list of what "the loggers" have.


Apache log4j: §1, §2

  1. FATAL:

    [v1.2: ..] very severe error events that will presumably lead the application to abort.

    [v2.0: ..] severe error that will prevent the application from continuing.

  2. ERROR:

    [v1.2: ..] error events that might still allow the application to continue running.

    [v2.0: ..] error in the application, possibly recoverable.

  3. WARN:

    [v1.2: ..] potentially harmful situations.

    [v2.0: ..] event that might possible [sic] lead to an error.

  4. INFO:

    [v1.2: ..] informational messages that highlight the progress of the application at coarse-grained level.

    [v2.0: ..] event for informational purposes.

  5. DEBUG:

    [v1.2: ..] fine-grained informational events that are most useful to debug an application.

    [v2.0: ..] general debugging event.

  6. TRACE:

    [v1.2: ..] finer-grained informational events than the DEBUG.

    [v2.0: ..] fine-grained debug message, typically capturing the flow through the application.


Apache Httpd (as usual) likes to go for the overkill: §

  1. emerg:

    Emergencies – system is unusable.

  2. alert:

    Action must be taken immediately [but system is still usable].

  3. crit:

    Critical Conditions [but action need not be taken immediately].

    • "socket: Failed to get a socket, exiting child"
  4. error:

    Error conditions [but not critical].

    • "Premature end of script headers"
  5. warn:

    Warning conditions. [close to error, but not error]

  6. notice:

    Normal but significant [notable] condition.

    • "httpd: caught SIGBUS, attempting to dump core in ..."
  7. info:

    Informational [and unnotable].

    • ["Server has been running for x hours."]
  8. debug:

    Debug-level messages [, i.e. messages logged for the sake of de-bugging].

    • "Opening config file ..."
  9. trace1 to trace6:

    Trace messages [, i.e. messages logged for the sake of tracing].

    • "proxy: FTP: control connection complete"
    • "proxy: CONNECT: sending the CONNECT request to the remote proxy"
    • "openssl: Handshake: start"
    • "read from buffered SSL brigade, mode 0, 17 bytes"
    • "map lookup FAILED: map=rewritemap key=keyname"
    • "cache lookup FAILED, forcing new map lookup"
  10. trace7 to trace8:

    Trace messages, dumping large amounts of data

    • "| 0000: 02 23 44 30 13 40 ac 34 df 3d bf 9a 19 49 39 15 |"
    • "| 0000: 02 23 44 30 13 40 ac 34 df 3d bf 9a 19 49 39 15 |"

Apache commons-logging: §

  1. fatal:

    Severe errors that cause premature termination. Expect these to be immediately visible on a status console.

  2. error:

    Other runtime errors or unexpected conditions. Expect these to be immediately visible on a status console.

  3. warn:

    Use of deprecated APIs, poor use of API, 'almost' errors, other runtime situations that are undesirable or unexpected, but not necessarily "wrong". Expect these to be immediately visible on a status console.

  4. info:

    Interesting runtime events (startup/shutdown). Expect these to be immediately visible on a console, so be conservative and keep to a minimum.

  5. debug:

    detailed information on the flow through the system. Expect these to be written to logs only.

  6. trace:

    more detailed information. Expect these to be written to logs only.

Apache commons-logging "best practices" for enterprise usage makes a distinction between debug and info based on what kind of boundaries they cross.

Boundaries include:

  • External Boundaries - Expected Exceptions.

  • External Boundaries - Unexpected Exceptions.

  • Internal Boundaries.

  • Significant Internal Boundaries.

(See commons-logging guide for more info on this.)

草莓酥 2024-08-23 23:00:06

I'd recommend adopting Syslog severity levels: DEBUG, INFO, NOTICE, WARNING, ERROR, CRITICAL, ALERT, EMERGENCY.
See http://en.wikipedia.org/wiki/Syslog#Severity_levels

They should provide enough fine-grained severity levels for most use-cases and are recognized by existing log-parsers. You are of course free to implement only a subset, e.g. DEBUG, ERROR, EMERGENCY, depending on your app's requirements.

Let's standardize on something that's been around for ages instead of coming up with our own standard for every different app we make. Once you start aggregating logs and are trying to detect patterns across different ones it really helps.

神妖 2024-08-23 23:00:06

If you can recover from the problem then it's a warning. If it prevents continuing execution then it's an error.

南烟 2024-08-23 23:00:06

Warnings you can recover from. Errors you can't. That's my heuristic, others may have other ideas.

For example, let's say you enter/import the name "Angela Müller" into your application (note the umlaut over the u). Your code/database may be English only (though it probably shouldn't be in this day and age) and could therefore warn that all "unusual" characters had been converted to regular English characters.

Contrast that with trying to write that information to the database and getting back a network down message for 60 seconds straight. That's more of an error than a warning.
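The two cases can be sketched in Python roughly as follows. This is only an illustration: `to_ascii`, `save`, the `write` callable and the logger name are hypothetical. The sketch also uses a fixed number of attempts as a simplification of the "60 seconds straight" in the example.

```python
import logging
import unicodedata

log = logging.getLogger("import")  # hypothetical logger name


def to_ascii(name):
    """Strip diacritics so the name fits a hypothetical ASCII-only column."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_name = decomposed.encode("ascii", "ignore").decode("ascii")
    if ascii_name != name:
        # Data was altered but the import carries on: a warning.
        log.warning("converted %r to %r", name, ascii_name)
    return ascii_name


def save(name, write, attempts=3):
    """Try write(name) a few times; write is a hypothetical callable."""
    for attempt in range(1, attempts + 1):
        try:
            write(name)
            return True
        except ConnectionError:
            log.warning("write failed (attempt %d of %d)", attempt, attempts)
    # Nothing left to try, so the operation has failed: an error.
    log.error("could not save %r after %d attempts", name, attempts)
    return False
```

Each failed attempt is only a warning because a retry may still succeed; the error is logged once, at the point where recovery is no longer possible.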

春风十里 2024-08-23 23:00:06

From RFC 5424, the Syslog Protocol (IETF) - Page 10:

Each message Priority also has a decimal Severity level indicator.
These are described in the following table along with their numerical
values. Severity values MUST be in the range of 0 to 7 inclusive.

       Numerical         Severity
         Code

          0       Emergency: system is unusable
          1       Alert: action must be taken immediately
          2       Critical: critical conditions
          3       Error: error conditions
          4       Warning: warning conditions
          5       Notice: normal but significant condition
          6       Informational: informational messages
          7       Debug: debug-level messages

          Table 2. Syslog Message Severities
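For reference, the table can be written down as a Python enum. Note that lower numbers are more severe here, the opposite of Python's own `logging` levels. The `FROM_PYTHON` mapping is only one plausible choice and is an assumption of this sketch. The `priority` helper follows the PRI formula given in the same RFC (facility multiplied by 8, plus severity).

```python
import logging
from enum import IntEnum


class Severity(IntEnum):
    """Severity codes from RFC 5424, Table 2."""
    EMERGENCY = 0
    ALERT = 1
    CRITICAL = 2
    ERROR = 3
    WARNING = 4
    NOTICE = 5
    INFORMATIONAL = 6
    DEBUG = 7


# One possible mapping from Python's levels; NOTICE, ALERT and EMERGENCY
# have no direct counterpart in the standard logging module.
FROM_PYTHON = {
    logging.DEBUG: Severity.DEBUG,
    logging.INFO: Severity.INFORMATIONAL,
    logging.WARNING: Severity.WARNING,
    logging.ERROR: Severity.ERROR,
    logging.CRITICAL: Severity.CRITICAL,
}


def priority(facility, severity):
    """PRI value as defined by RFC 5424: facility * 8 + severity."""
    return facility * 8 + int(severity)
```

For example, a message from facility 20 (local4) with severity Notice gets `priority(20, Severity.NOTICE)`.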
鸢与 2024-08-23 23:00:06

Taco Jan Osinga's answer is very good, and very practical, to boot.

I am in partial agreement with him, though with some variations.

In Python, there are only 5 "named" logging levels, so this is how I use them:

  • DEBUG -- information important for troubleshooting, and usually suppressed in normal day-to-day operation
  • INFO -- day-to-day operation as "proof" that the program is performing its function as designed
  • WARN -- out-of-nominal but recoverable situation, *or* coming upon something that may result in future problems
  • ERROR -- something happened that requires the program to do recovery, but recovery is successful. The program is likely not in the originally expected state, though, so the user of the program will need to adapt
  • CRITICAL -- something happened that cannot be recovered from, and the program likely needs to terminate lest everyone will be living in a state of sin
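For readers less familiar with Python, the short sketch below prints the five named levels with their numeric values and shows how a logger set to INFO suppresses DEBUG output. Python spells the level `WARNING`; the logger name is hypothetical.

```python
import logging

# The five named levels and their numeric values in the standard library.
for name in ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"):
    print(name, getattr(logging, name))

log = logging.getLogger("billing")  # hypothetical logger name
log.setLevel(logging.INFO)          # a typical day-to-day setting

print(log.isEnabledFor(logging.DEBUG))    # DEBUG is suppressed
print(log.isEnabledFor(logging.WARNING))  # WARNING and above get through
```

Because the levels are plain integers, raising or lowering the threshold is a one-line configuration change rather than a code change.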
青芜 2024-08-23 23:00:06

It's interesting how Microsoft defines the different LogLevel values in their new "quasi-standard" Microsoft.Extensions.Logging (emphasis mine):

Critical

Logs that describe an unrecoverable application or system crash, or a
catastrophic failure that requires immediate attention.

Error

Logs that highlight when the current flow of execution is stopped due
to a failure. These should indicate a failure in the current activity,
not an application-wide failure.

Warning

Logs that highlight an abnormal or unexpected event in the application
flow, but do not otherwise cause the application execution to stop.

Information

Logs that track the general flow of the application. These logs should
have long-term value.

Debug

Logs that are used for interactive investigation during development.
These logs should primarily contain information useful for debugging
and have no long-term value.

Trace

Logs that contain the most detailed messages. These messages may
contain sensitive application data. These messages are disabled by
default and should never be enabled in a production environment.

允世 2024-08-23 23:00:06

From https://sematext.com/blog/slf4j-tutorial/:

  • TRACE – log events with this level are the most fine-grained and are usually not needed unless you need to have the full visibility of what is happening in your application and inside the third-party libraries that you use. You can expect the TRACE logging level to be very verbose.
  • DEBUG – less granular compared to the TRACE level, but still more than you will need in everyday use. The DEBUG log level should be used for information that may be needed for deeper diagnostics and troubleshooting.
  • INFO – the standard log level indicating that something happened, application processed a request, etc. The information logged using the INFO log level should be purely informative and not looking into them on a regular basis shouldn’t result in missing any important information.
  • WARN – the log level that indicates that something unexpected happened in the application. For example a problem, or a situation that might disturb one of the processes, but the whole application is still working.
  • ERROR – the log level that should be used when the application hits an issue preventing one or more functionalities from properly functioning. The ERROR log level can be used when one of the payment systems is not available, but there is still the option to check out the basket in the e-commerce application or when your social media logging option is not working for some reason. You can also see the ERROR log level associated with exceptions.
当梦初醒 2024-08-23 23:00:06

As others have said, errors are problems; warnings are potential problems.

In development, I frequently use warnings where I might put the equivalent of an assertion failure but the application can continue working; this enables me to find out if that case ever actually happens, or if it's my imagination.

But yes, it gets down to the recoverability and actuality aspects. If you can recover, it's probably a warning; if it causes something to actually fail, it's an error.
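The "warning in place of an assertion" idea might look like this in Python. The helper `expect`, the function `area` and the logger name are hypothetical, made up for this sketch.

```python
import logging

log = logging.getLogger("geometry")  # hypothetical logger name


def expect(condition, message):
    """Log a warning where an assert would otherwise stop the program."""
    if not condition:
        log.warning("expectation failed: %s", message)
    return bool(condition)


def area(width, height):
    # Suspicious input, but the computation can still proceed.
    expect(width >= 0 and height >= 0, f"negative size {width}x{height}")
    return abs(width) * abs(height)
```

If the warning never shows up in the logs, the case was probably imaginary; if it does, there is now evidence of how often it happens.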

吹泡泡o 2024-08-23 23:00:06

I totally agree with the others, and think that GrayWizardx said it best.

All that I can add is that these levels generally correspond to their dictionary definitions, so it can't be that hard. If in doubt, treat it like a puzzle. For your particular project, think of everything that you might want to log.

Now, can you figure out what might be fatal? You know what fatal means, don't you? So, which items on your list are fatal?

Ok, that's fatal dealt with, now let's look at errors ... rinse and repeat.

Below Fatal, or maybe Error, I would suggest that more information is always better than less, so err "upwards". Not sure if it's Info or Warning? Then make it a warning.

I do think that Fatal and error ought to be clear to all of us. The others might be fuzzier, but it is arguably less vital to get them right.

Here are some examples:

Fatal - can't allocate memory, database, etc - can't continue.

Error - no reply to message, transaction aborted, can't save file, etc.

Warning - resource allocation reaches X% (say 80%) - that is a sign that you might want to re-dimension your.

Info - user logged in/out, new transaction, file created, new d/b field, or field deleted.

Debug - dump of internal data structure, anything Trace level with file name & line number.

Trace - action succeeded/failed, d/b updated.
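The resource-allocation example can be sketched as a threshold check in Python. The function `check_pool`, the constant `WARN_AT` and the logger name are hypothetical. Treating a completely exhausted pool as an error is an assumption of this sketch and is not part of the list above.

```python
import logging

log = logging.getLogger("capacity")  # hypothetical logger name

WARN_AT = 0.80  # the "say 80%" threshold from the list above


def check_pool(in_use, size):
    """Log pool usage at a level matching its severity and return that level."""
    if in_use >= size:
        level = logging.ERROR    # assumption: nothing left to hand out
    elif in_use / size >= WARN_AT:
        level = logging.WARNING  # time to think about resizing
    else:
        level = logging.DEBUG
    log.log(level, "pool usage %d/%d", in_use, size)
    return level
```

Keeping the threshold in one named constant makes it easy to tune when the warning turns out to be too noisy or too late.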

一萌ing 2024-08-23 23:00:06

I think that SYSLOG levels NOTICE and ALERT/EMERGENCY are largely superfluous for application-level logging - while CRITICAL/ALERT/EMERGENCY may be useful alert levels for an operator that may trigger different actions and notifications, to an application admin it's all the same as FATAL. And I just cannot sufficiently distinguish between being given a notice or some information. If the information is not noteworthy, it's not really information :)

I like Jay Cincotta's interpretation best - tracing your code's execution is something very useful in tech support, and putting trace statements into the code liberally should be encouraged - especially in combination with a dynamic filtering mechanism for logging the trace messages from specific application components. However DEBUG level to me indicates that we're still in the process of figuring out what's going on - I see DEBUG level output as a development-only option, not as something that should ever show up in a production log.

There is however a logging level that I like to see in my error logs when wearing the hat of a sysadmin as much as that of tech support, or even developer: OPER, for OPERATIONAL messages. This I use for logging a timestamp, the type of operation invoked, the arguments supplied, possibly a (unique) task identifier, and task completion. It's used when e.g. a standalone task is fired off, something that is a true invocation from within the larger long-running app. It's the sort of thing I want always logged, no matter whether anything goes wrong or not, so I consider the level of OPER to be higher than FATAL, so you can only turn it off by going to totally silent mode. And it's much more than mere INFO log data - a log level often abused for spamming logs with minor operational messages of no historical value whatsoever.

As the case dictates this information may be directed to a separate invocation log, or may be obtained by filtering it out of a large log recording more information. But it's always needed, as historical info, to know what was being done - without descending to the level of AUDIT, another totally separate log level that has nothing to do with malfunctions or system operation, doesn't really fit within the above levels (as it needs its own control switch, not a severity classification) and which definitely needs its own separate log file.
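In Python, an OPER level above FATAL could be approximated with a custom level. This is a sketch under assumptions: the numeric value 60 (above `logging.CRITICAL`, which is 50), the `run_task` helper and the logger name are invented for illustration. The timestamp would normally come from the handler's format string.

```python
import logging

OPER = 60  # assumed value, above logging.CRITICAL (50)
logging.addLevelName(OPER, "OPER")

log = logging.getLogger("scheduler")  # hypothetical logger name
log.setLevel(logging.CRITICAL)        # even at the quietest normal setting...


def run_task(task_id, operation, *args):
    """Record the invocation and completion of a hypothetical task."""
    log.log(OPER, "task=%s op=%s args=%r started", task_id, operation.__name__, args)
    result = operation(*args)
    log.log(OPER, "task=%s completed", task_id)
    return result
```

Because OPER outranks every built-in level, these records survive any threshold short of disabling logging altogether.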

谈情不如逗狗 2024-08-23 23:00:06

G'day,

As a corollary to this question, communicate your interpretations of the log levels and make sure that all people on a project are aligned in their interpretation of the levels.

It's painful to see a vast variety of log messages where the severities and the selected log levels are inconsistent.

Provide examples if possible of the different logging levels. And be consistent in the info to be logged in a message.

HTH

空名 2024-08-23 23:00:06

My two cents about FATAL and TRACE error log levels.

ERROR is when some FAULT (exception) occurs.

FATAL is actually a DOUBLE FAULT: when an exception occurs while handling an exception.

It's easy to understand for a web service.

  1. A request comes in. The event is logged as INFO.
  2. The system detects low disk space. The event is logged as WARN.
  3. Some function is called to handle the request.
    While processing, a division by zero occurs. The event is logged as ERROR.
  4. The web service's exception handler is called to handle the division by zero.
    The web service/framework tries to send an email, but it cannot because the mailing service is offline. This second exception cannot be handled normally, because the web service's exception handler cannot process the exception.
  5. A different exception handler is called. The event is logged as FATAL.

TRACE is when we can trace function entry/exit. This is not about logging, because this message can be generated by some debugger while your code has no call to log at all. So messages that are not from your application are marked as TRACE level. For example, when you run your application with strace.

So generally in your program you do DEBUG, INFO and WARN logging. And only if you are writing some web service/framework you will use FATAL. And when you are debugging application you will get TRACE logging from this type of software.
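The "double fault" reading of FATAL can be sketched in Python with nested exception handling. The names `handle`, `process` and `notify` and the logger name are hypothetical. Python's CRITICAL level is used here as the counterpart of FATAL.

```python
import logging

log = logging.getLogger("web")  # hypothetical logger name


def handle(request, process, notify):
    """process and notify are hypothetical callables supplied by the caller."""
    log.info("request received: %s", request)
    try:
        return process(request)
    except Exception:
        # First fault: the request failed, the service is still fine.
        log.error("request failed", exc_info=True)
        try:
            notify(request)
        except Exception:
            # Second fault, raised while handling the first one.
            log.critical("error handler failed too", exc_info=True)
            return "fatal"
        return "error"
```

The return values `"error"` and `"fatal"` exist only to make the two paths visible; a real service would more likely return an error response or re-raise.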

ぇ气 2024-08-23 23:00:06

An error is something that is wrong, plain wrong, no way around it, it needs to be fixed.

A warning is a sign of a pattern that might be wrong, but then also might not be.

Having said that, I cannot come up with a good example of a warning that isn't also an error. What I mean by that is that if you go to the trouble of logging a warning, you might as well fix the underlying issue.

However, things like "sql execution takes too long" might be a warning, while "sql execution deadlocks" is an error, so perhaps there's some cases after all.

污味仙女 2024-08-23 23:00:06

I've always considered warning the first log level that for sure means there is a problem (for example, perhaps a config file isn't where it should be and we're going to have to run with default settings). An error implies, to me, something that means the main goal of the software is now impossible and we're going to try to shut down cleanly.

把人绕傻吧 2024-08-23 23:00:06

Btw, I am a great fan of capturing everything and filtering the information later.

What would happen if you were capturing at Warning level and want some Debug info related to the warning, but were unable to recreate the warning?

Capture everything and filter later!

This holds true even for embedded software unless you find that your processor can't keep up, in which case you might want to re-design your tracing to make it more efficient, or the tracing is interfering with timing (you might consider debugging on a more powerful processor, but that opens up a whole nother can of worms).

Capture everything and filter later!!

(Btw, capturing everything is also good because it lets you develop tools to do more than just show debug trace. I draw Message Sequence Charts from mine, and histograms of memory usage. It also gives you a basis for comparison if something goes wrong in future: keep all logs, whether pass or fail, and be sure to include the build number in the log file.)
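A minimal Python sketch of "capture everything, filter later" might write one JSON object per record and do the filtering at read time. The `JsonLines` formatter, the `filter_log` helper, the logger name and the sample messages (including the build number) are all hypothetical. An in-memory buffer stands in for the log file.

```python
import io
import json
import logging


class JsonLines(logging.Formatter):
    """Write one JSON object per record so logs are easy to filter afterwards."""

    def format(self, record):
        return json.dumps({"level": record.levelname, "msg": record.getMessage()})


def filter_log(text, levels):
    """Return the messages whose level is in the given set."""
    entries = [json.loads(line) for line in text.splitlines() if line]
    return [e["msg"] for e in entries if e["level"] in levels]


buffer = io.StringIO()  # stands in for a log file
handler = logging.StreamHandler(buffer)
handler.setFormatter(JsonLines())

log = logging.getLogger("device")  # hypothetical logger name
log.setLevel(logging.DEBUG)        # capture everything
log.addHandler(handler)

log.debug("build 1234")            # hypothetical build number
log.debug("queue depth 17")
log.warning("sensor timeout")
```

Structured records are what make the later tooling (charts, histograms, comparisons between runs) practical, since nothing has to be parsed back out of free text.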

乜一 2024-08-23 23:00:06

I've built systems before that use the following:

  1. ERROR - means something is seriously wrong and that particular thread/process/sequence can't carry on. Some user/admin intervention is required
  2. WARNING - something is not right, but the process can carry on as before (e.g. one job in a set of 100 has failed, but the remainder can be processed)

In the systems I've built admins were under instruction to react to ERRORs. On the other hand we would watch for WARNINGS and determine for each case whether any system changes, reconfigurations etc. were required.
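The batch example could be sketched in Python like this. The `run_batch` function and the logger name are hypothetical. Treating an empty job list as the "cannot carry on" case is an assumption made only to show both levels in one function.

```python
import logging

log = logging.getLogger("batch")  # hypothetical logger name


def run_batch(jobs):
    """Run hypothetical zero-argument jobs; return how many succeeded."""
    if not jobs:
        # The whole sequence cannot proceed: someone needs to step in.
        log.error("no jobs supplied; nothing to run")
        return 0
    done = 0
    for index, job in enumerate(jobs):
        try:
            job()
            done += 1
        except Exception as exc:
            # One job failed, the remainder can still be processed.
            log.warning("job %d failed: %s", index, exc)
    return done
```

Admins would then react to the error immediately, while the warnings are reviewed later to decide whether a configuration change is needed.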

轻拂→两袖风尘 2024-08-23 23:00:06

I suggest using only three levels

  1. Fatal - Which would break the application.
  2. Info - Info
  3. Debug - Less important info