When to use the different log levels
There are different ways to log messages, in order of fatality:
FATAL
ERROR
WARN
INFO
DEBUG
TRACE
How do I decide when to use which?
What's a good heuristic to use?
I generally subscribe to the following convention:
Would you want the message to get a system administrator out of bed in the middle of the night?
It's an old topic, but still relevant. This week, I wrote a small article about it, for my colleagues. For that purpose, I also created this cheat sheet, because I couldn't find any online.
I find it more helpful to think about severities from the perspective of viewing the log file.
Fatal/Critical: Overall application or system failure that should be investigated immediately. Yes, wake up the SysAdmin. Since we prefer our SysAdmins alert and well-rested, this severity should be used very infrequently. If it's happening daily and that's not a BFD, it has lost its meaning. Typically, a Fatal error only occurs once in the process lifetime, so if the log file is tied to the process, this is typically the last message in the log.
Error: Definitely a problem that should be investigated. SysAdmin should be notified automatically, but doesn't need to be dragged out of bed. By filtering a log to look at errors and above you get an overview of error frequency and can quickly identify the initiating failure that might have resulted in a cascade of additional errors. Tracking error rates versus application usage can yield useful quality metrics such as MTBF which can be used to assess overall quality. For example, this metric might help inform decisions about whether or not another beta testing cycle is needed before a release.
Warning: This MIGHT be a problem, or might not. For example, expected transient environmental conditions such as short loss of network or database connectivity should be logged as Warnings, not Errors. Viewing a log filtered to show only warnings and errors may give quick insight into early hints at the root cause of a subsequent error. Warnings should be used sparingly so that they don't become meaningless. For example, loss of network access should be a warning or even an error in a server application, but might be just an Info in a desktop app designed for occasionally disconnected laptop users.
Info: This is important information that should be logged under normal conditions such as successful initialization, services starting and stopping or successful completion of significant transactions. Viewing a log showing Info and above should give a quick overview of major state changes in the process providing top-level context for understanding any warnings or errors that also occur. Don't have too many Info messages. We typically have < 5% Info messages relative to Trace.
Trace: Trace is by far the most commonly used severity and should provide context to understand the steps leading up to errors and warnings. Having the right density of Trace messages makes software much more maintainable but requires some diligence because the value of individual Trace statements may change over time as programs evolve. The best way to achieve this is by getting the dev team in the habit of regularly reviewing logs as a standard part of troubleshooting customer reported issues. Encourage the team to prune out Trace messages that no longer provide useful context and to add messages where needed to understand the context of subsequent messages. For example, it is often helpful to log user input such as changing displays or tabs.
Debug: We consider Debug < Trace. The distinction being that Debug messages are compiled out of Release builds. That said, we discourage use of Debug messages. Allowing Debug messages tends to lead to more and more Debug messages being added and none ever removed. In time, this makes log files almost useless because it's too hard to filter signal from noise. That causes devs to not use the logs which continues the death spiral. In contrast, constantly pruning Trace messages encourages devs to use them which results in a virtuous spiral. Also, this eliminates the possibility of bugs introduced because of needed side-effects in debug code that isn't included in the release build. Yeah, I know that shouldn't happen in good code, but better safe than sorry.
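To make the "view the log filtered at a severity" idea concrete, here is a minimal Python sketch. It is only an illustration, not the setup the answer describes: Python's standard logging module has no TRACE level, so the sketch registers one, and the value 15 (above DEBUG, below INFO, matching "Debug < Trace"), the `ListHandler` and `view` helpers, the logger name, and the messages are all assumptions made for the example.

```python
import logging

# Assumption: the standard library has no TRACE level, so we register one.
# 15 sits between DEBUG (10) and INFO (20); the exact value is arbitrary.
TRACE = 15
logging.addLevelName(TRACE, "TRACE")


class ListHandler(logging.Handler):
    """Collects records in memory so the log can be filtered afterwards."""

    def __init__(self):
        super().__init__(level=logging.NOTSET)
        self.records = []

    def emit(self, record):
        self.records.append(record)


def view(records, minimum):
    """Return 'LEVEL: message' lines at or above the given severity."""
    return [f"{r.levelname}: {r.getMessage()}" for r in records if r.levelno >= minimum]


logger = logging.getLogger("severity_demo")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)

# Trace messages give the steps leading up to warnings and errors.
logger.info("service started")
logger.log(TRACE, "user switched to the Reports tab")
logger.warning("database connection lost, retrying")
logger.log(TRACE, "retry 1 of 3")
logger.error("could not save report")

info_view = view(handler.records, logging.INFO)    # major state changes and problems
error_view = view(handler.records, logging.ERROR)  # only what needs investigating
```

The Info-and-above view keeps the start-up message, the warning and the error, while the Trace lines stay available for whoever needs the detailed context.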
Here's a list of what "the loggers" have.
Apache log4j: §1, §2 (https://logging.apache.org/log4j/2.0/log4j-api/apidocs/org/apache/logging/log4j/Level.html)
FATAL:
[v1.2: ..] very severe error events that will presumably lead the application to abort.
[v2.0: ..] severe error that will prevent the application from continuing.
ERROR:
[v1.2: ..] error events that might still allow the application to continue running.
[v2.0: ..] error in the application, possibly recoverable.
WARN:
[v1.2: ..] potentially harmful situations.
[v2.0: ..] event that might possible [sic] lead to an error.
INFO:
[v1.2: ..] informational messages that highlight the progress of the application at coarse-grained level.
[v2.0: ..] event for informational purposes.
DEBUG:
[v1.2: ..] fine-grained informational events that are most useful to debug an application.
[v2.0: ..] general debugging event.
TRACE:
[v1.2: ..] finer-grained informational events than the DEBUG.
[v2.0: ..] fine-grained debug message, typically capturing the flow through the application.
Apache Httpd (as usual) likes to go for the overkill: §
emerg:
Emergencies - system is unusable.
alert:
Action must be taken immediately [but system is still usable].
crit:
Critical conditions [but action need not be taken immediately].
error:
Error conditions [but not critical].
warn:
Warning conditions. [close to error, but not error]
notice:
Normal but significant [notable] condition.
"... SIGBUS, attempting to dump core in ..."
info:
Informational [and unnotable].
debug:
Debug-level messages [, i.e. messages logged for the sake of de-bugging].
trace1 → trace6:
Trace messages [, i.e. messages logged for the sake of tracing].
"... map=rewritemap key=keyname"
trace7 → trace8:
Trace messages, dumping large amounts of data.
"| 0000: 02 23 44 30 13 40 ac 34 df 3d bf 9a 19 49 39 15 |"
Apache commons-logging: §
fatal:
Severe errors that cause premature termination. Expect these to be immediately visible on a status console.
error:
Other runtime errors or unexpected conditions. Expect these to be immediately visible on a status console.
warn:
Use of deprecated APIs, poor use of API, "almost" errors, other runtime situations that are undesirable or unexpected, but not necessarily "wrong". Expect these to be immediately visible on a status console.
info:
Interesting runtime events (startup/shutdown). Expect these to be immediately visible on a console, so be conservative and keep to a minimum.
debug:
Detailed information on the flow through the system. Expect these to be written to logs only.
trace:
More detailed information. Expect these to be written to logs only.
Apache commons-logging "best practices" for enterprise usage make a distinction between debug and info based on what kind of boundaries they cross.
Boundaries include:
External Boundaries - Expected Exceptions.
External Boundaries - Unexpected Exceptions.
Internal Boundaries.
Significant Internal Boundaries.
(See commons-logging guide for more info on this.)
I'd recommend adopting Syslog severity levels: DEBUG, INFO, NOTICE, WARNING, ERROR, CRITICAL, ALERT, EMERGENCY. See http://en.wikipedia.org/wiki/Syslog#Severity_levels
They should provide enough fine-grained severity levels for most use-cases and are recognized by existing log-parsers. You of course have the freedom to implement only a subset, e.g. DEBUG, ERROR, EMERGENCY, depending on your app's requirements.
Let's standardize on something that's been around for ages instead of coming up with our own standard for every different app we make. Once you start aggregating logs and are trying to detect patterns across different ones it really helps.
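For anyone who wants to try the Syslog scale from application code, the sketch below shows one possible shape in Python. The numeric codes are the ones RFC 5424 assigns (0 is the most severe); the `SyslogSeverity` enum, the `PYTHON_TO_SYSLOG` mapping and the helper functions are names invented for this example, and the mapping itself is just one plausible choice rather than an official one.

```python
import logging
from enum import IntEnum


class SyslogSeverity(IntEnum):
    """Severity codes as numbered in RFC 5424 (lower number = more severe)."""
    EMERGENCY = 0
    ALERT = 1
    CRITICAL = 2
    ERROR = 3
    WARNING = 4
    NOTICE = 5
    INFO = 6
    DEBUG = 7


# Hypothetical mapping from Python's five named levels to syslog severities.
# NOTICE, ALERT and EMERGENCY have no direct Python counterpart.
PYTHON_TO_SYSLOG = {
    logging.DEBUG: SyslogSeverity.DEBUG,
    logging.INFO: SyslogSeverity.INFO,
    logging.WARNING: SyslogSeverity.WARNING,
    logging.ERROR: SyslogSeverity.ERROR,
    logging.CRITICAL: SyslogSeverity.CRITICAL,
}


def to_syslog(levelno):
    """Translate a Python logging level number into a syslog severity."""
    return PYTHON_TO_SYSLOG[levelno]


def at_least(severity, threshold):
    """True if `severity` is as severe as `threshold` or worse (numerically <=)."""
    return severity <= threshold


# An application may implement only a subset, as the answer suggests.
SUBSET = {SyslogSeverity.DEBUG, SyslogSeverity.ERROR, SyslogSeverity.EMERGENCY}
```

Note that the syslog numbering runs in the opposite direction to Python's, so "more severe" means a smaller number here.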
If you can recover from the problem then it's a warning. If it prevents continuing execution then it's an error.
Warnings you can recover from. Errors you can't. That's my heuristic, others may have other ideas.
For example, let's say you enter/import the name "Angela Müller" into your application (note the umlaut over the u). Your code/database may be English only (though it probably shouldn't be in this day and age) and could therefore warn that all "unusual" characters had been converted to regular English characters.
Contrast that with trying to write that information to the database and getting back a network down message for 60 seconds straight. That's more of an error than a warning.
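Here is a rough Python sketch of that contrast, under a few assumptions: `to_ascii` and `save` are made-up helper names, the `write` callback stands in for whatever database layer is in use, and a `ConnectionError` stands in for the "network down" failure. It is meant to show where the warning and the error would go, nothing more.

```python
import logging
import unicodedata

logger = logging.getLogger("import_demo")  # hypothetical logger name


def to_ascii(name):
    """Strip diacritics; a warning, because the input was altered but we carried on."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_name = decomposed.encode("ascii", "ignore").decode("ascii")
    if ascii_name != name:
        logger.warning("converted %r to %r", name, ascii_name)
    return ascii_name


def save(name, write):
    """Try to store the name; a failure we cannot recover from is an error."""
    try:
        write(name)
        return True
    except ConnectionError as exc:
        logger.error("could not write %r: %s", name, exc)
        return False


def network_down(_name):
    """Stand-in for a database call while the network is down (assumption)."""
    raise ConnectionError("network is down")


clean = to_ascii("Angela Müller")   # recoverable: logged as a warning
stored = save(clean, network_down)  # not recoverable here: logged as an error
```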
From RFC 5424, the Syslog Protocol (IETF) - Page 10:
Taco Jan Osinga's answer is very good, and very practical, to boot.
I am in partial agreement with him, though with some variations.
On Python, there are only 5 "named" logging levels, so this is how I use them:
DEBUG -- information important for troubleshooting, and usually suppressed in normal day-to-day operation
INFO -- day-to-day operation as "proof" that the program is performing its function as designed
WARN -- out-of-nominal but recoverable situation, *or* coming upon something that may result in future problems
ERROR -- something happened that necessitates the program to do recovery, but recovery is successful. The program is likely not in the originally expected state, though, so the user of the program will need to adapt
CRITICAL -- something happened that cannot be recovered from, and the program likely needs to terminate lest everyone will be living in a state of sin
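For reference, the five named levels in Python's standard logging module can be inspected as below. The logger name and the choice of INFO as the "day-to-day" threshold are assumptions for the sake of the example; the level constants and aliases are the standard library's own.

```python
import logging

# The five named levels, from least to most severe.
NAMED_LEVELS = [
    logging.DEBUG,
    logging.INFO,
    logging.WARNING,
    logging.ERROR,
    logging.CRITICAL,
]

level_names = [logging.getLevelName(level) for level in NAMED_LEVELS]

logger = logging.getLogger("five_levels_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)  # assumed day-to-day setting: DEBUG is suppressed

# Which of the named levels would actually be emitted at this setting?
enabled = [logging.getLevelName(level) for level in NAMED_LEVELS if logger.isEnabledFor(level)]

# WARN and FATAL exist only as aliases of WARNING and CRITICAL.
warn_is_alias = logging.WARN == logging.WARNING
fatal_is_alias = logging.FATAL == logging.CRITICAL
```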
It's interesting how Microsoft defines the different LogLevel values in their new "quasi-standard" Microsoft.Extensions.Logging (emphasis mine):
From https://sematext.com/blog/slf4j-tutorial/:
As others have said, errors are problems; warnings are potential problems.
In development, I frequently use warnings where I might put the equivalent of an assertion failure but the application can continue working; this enables me to find out if that case ever actually happens, or if it's my imagination.
But yes, it gets down to the recoverability and actuality aspects. If you can recover, it's probably a warning; if it causes something to actually fail, it's an error.
I totally agree with the others, and think that GrayWizardx said it best.
All that I can add is that these levels generally correspond to their dictionary definitions, so it can't be that hard. If in doubt, treat it like a puzzle. For your particular project, think of everything that you might want to log.
Now, can you figure out what might be fatal? You know what fatal means, don't you? So, which items on your list are fatal?
Ok, that's fatal dealt with, now let's look at errors ... rinse and repeat.
Below Fatal, or maybe Error, I would suggest that more information is always better than less, so err "upwards". Not sure if it's Info or Warning? Then make it a warning.
I do think that Fatal and error ought to be clear to all of us. The others might be fuzzier, but it is arguably less vital to get them right.
Fatal - can't allocate memory, database, etc - can't continue.
Error - no reply to message, transaction aborted, can't save file, etc.
Warning - resource allocation reaches X% (say 80%) - that is a sign that you might want to re-dimension your.
Info - user logged in/out, new transaction, file created, new d/b field, or field deleted.
Debug - dump of internal data structure, Anything Trace level with file name & line number.
Trace - action succeeded/failed, d/b updated.
I think that SYSLOG levels NOTICE and ALERT/EMERGENCY are largely superfluous for application-level logging - while CRITICAL/ALERT/EMERGENCY may be useful alert levels for an operator that may trigger different actions and notifications, to an application admin it's all the same as FATAL. And I just cannot sufficiently distinguish between being given a notice or some information. If the information is not noteworthy, it's not really information :)
I like Jay Cincotta's interpretation best - tracing your code's execution is something very useful in tech support, and putting trace statements into the code liberally should be encouraged - especially in combination with a dynamic filtering mechanism for logging the trace messages from specific application components. However DEBUG level to me indicates that we're still in the process of figuring out what's going on - I see DEBUG level output as a development-only option, not as something that should ever show up in a production log.
There is however a logging level that I like to see in my error logs when wearing the hat of a sysadmin as much as that of tech support, or even developer: OPER, for OPERATIONAL messages. This I use for logging a timestamp, the type of operation invoked, the arguments supplied, possibly a (unique) task identifier, and task completion. It's used when e.g. a standalone task is fired off, something that is a true invocation from within the larger long-running app. It's the sort of thing I want always logged, no matter whether anything goes wrong or not, so I consider the level of OPER to be higher than FATAL, so you can only turn it off by going to totally silent mode. And it's much more than mere INFO log data - a log level often abused for spamming logs with minor operational messages of no historical value whatsoever.
As the case dictates this information may be directed to a separate invocation log, or may be obtained by filtering it out of a large log recording more information. But it's always needed, as historical info, to know what was being done - without descending to the level of AUDIT, another totally separate log level that has nothing to do with malfunctions or system operation, doesn't really fit within the above levels (as it needs its own control switch, not a severity classification) and which definitely needs its own separate log file.
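One way an OPER level "higher than FATAL" could be approximated in Python is sketched below. This is an assumption-laden illustration rather than the setup described above: the numeric value 60, the `ListHandler` helper, the logger name and the message layout are all invented for the example.

```python
import logging

# Assumption: place OPER above CRITICAL (50) so that only disabling
# logging entirely would silence it. The value 60 is arbitrary.
OPER = logging.CRITICAL + 10
logging.addLevelName(OPER, "OPER")


class ListHandler(logging.Handler):
    """Keeps records in memory so the example can inspect them."""

    def __init__(self):
        super().__init__(level=logging.NOTSET)
        self.records = []

    def emit(self, record):
        self.records.append(record)


logger = logging.getLogger("oper_demo")  # hypothetical logger name
logger.setLevel(logging.CRITICAL)        # quietest normal setting
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)

logger.info("minor operational chatter")  # dropped at this setting
logger.error("something failed")          # dropped at this setting
# Operational records: operation type, arguments, task id, completion.
logger.log(OPER, "task=%s op=%s args=%s status=%s", "42", "export", "--all", "started")
logger.log(OPER, "task=%s op=%s args=%s status=%s", "42", "export", "--all", "completed")

oper_lines = [f"{r.levelname}: {r.getMessage()}" for r in handler.records]
```

Even with the logger turned down to CRITICAL, the two OPER records still get through, which is the behaviour the answer asks for.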
G'day,
As a corollary to this question, communicate your interpretations of the log levels and make sure that all people on a project are aligned in their interpretation of the levels.
It's painful to see a vast variety of log messages where the severities and the selected log levels are inconsistent.
Provide examples if possible of the different logging levels. And be consistent in the info to be logged in a message.
HTH
My two cents about FATAL and TRACE error log levels.
ERROR is when some FAULT (exception) occurs.
FATAL is actually a DOUBLE FAULT: when an exception occurs while handling an exception.
It's easy to understand for a web service.
INFO
WARN
While processing, a division by zero occurs. The event is logged as ERROR.
The web service/framework is going to send an email, but it cannot because the mailing service is offline now. This second exception cannot be handled normally, because the web service's exception handler cannot process the exception. FATAL
TRACE is when we can trace function entry/exit. This is not about logging, because this message can be generated by some debugger and your code has no call to log at all. So messages that are not from your application are marked as TRACE level. For example, you run your application with strace.
So generally in your program you do DEBUG, INFO and WARN logging. And only if you are writing some web service/framework will you use FATAL. And when you are debugging an application you will get TRACE logging from this type of software.
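The "double fault" idea can be sketched in Python roughly as follows. Everything here is hypothetical: `handle_request`, `divide_by_zero`, `mail_offline` and `ListHandler` are names made up for the example, and since Python's logging module treats FATAL as an alias of CRITICAL, the second fault shows up under the name CRITICAL.

```python
import logging


class ListHandler(logging.Handler):
    """Keeps records in memory so the example can inspect them."""

    def __init__(self):
        super().__init__(level=logging.NOTSET)
        self.records = []

    def emit(self, record):
        self.records.append(record)


logger = logging.getLogger("double_fault_demo")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)


def handle_request(work, notify):
    """Run `work`; on a fault log ERROR, and if the handler itself fails log FATAL."""
    try:
        return work()
    except Exception as fault:
        logger.error("request failed: %s", fault)
        try:
            notify(fault)  # e.g. send an email about the fault
        except Exception as second_fault:
            # Fault while handling a fault: the "double fault" case.
            logger.critical("error handler failed: %s", second_fault)
        return None


def divide_by_zero():
    return 1 / 0


def mail_offline(fault):
    raise ConnectionError("mailing service is offline")


result = handle_request(divide_by_zero, mail_offline)
levels = [r.levelname for r in handler.records]
```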
An error is something that is wrong, plain wrong, no way around it, it needs to be fixed.
A warning is a sign of a pattern that might be wrong, but then also might not be.
Having said that, I cannot come up with a good example of a warning that isn't also an error. What I mean by that is that if you go to the trouble of logging a warning, you might as well fix the underlying issue.
However, things like "sql execution takes too long" might be a warning, while "sql execution deadlocks" is an error, so perhaps there's some cases after all.
I've always considered warning the first log level that for sure means there is a problem (for example, perhaps a config file isn't where it should be and we're going to have to run with default settings). An error implies, to me, something that means the main goal of the software is now impossible and we're going to try to shut down cleanly.
Btw, I am a great fan of capturing everything and filtering the information later.
What would happen if you were capturing at Warning level and want some Debug info related to the warning, but were unable to recreate the warning?
Capture everything and filter later!
This holds true even for embedded software unless you find that your processor can't keep up, in which case you might want to re-design your tracing to make it more efficient, or the tracing is interfering with timing (you might consider debugging on a more powerful processor, but that opens up a whole nother can of worms).
Capture everything and filter later!!
(btw, capturing everything is also good because it lets you develop tools to do more than just show debug trace. I draw Message Sequence Charts from mine, and histograms of memory usage. It also gives you a basis for comparison if something goes wrong in future: keep all logs, whether pass or fail, and be sure to include the build number in the log file.)
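A small Python sketch of "capture everything and filter later", assuming two in-memory streams in place of a real log file and console. The logger name, the `BUILD` value and the message texts are placeholders; the point is only that the logger itself stays at DEBUG while each destination applies its own threshold.

```python
import io
import logging

BUILD = "1234"  # hypothetical build number, included in every captured line

full_log = io.StringIO()  # stands in for the file that keeps everything
console = io.StringIO()   # stands in for what an operator normally sees

logger = logging.getLogger("capture_demo")  # hypothetical logger name
logger.setLevel(logging.DEBUG)              # capture everything at the source
logger.propagate = False

capture_all = logging.StreamHandler(full_log)
capture_all.setLevel(logging.DEBUG)
capture_all.setFormatter(logging.Formatter(f"build={BUILD} %(levelname)s %(message)s"))
logger.addHandler(capture_all)

filtered = logging.StreamHandler(console)
filtered.setLevel(logging.WARNING)          # filter only at this destination
filtered.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
logger.addHandler(filtered)

logger.debug("cache size=17")
logger.warning("config file missing, using defaults")

# Later, the debug context around the warning is still available.
debug_lines = [line for line in full_log.getvalue().splitlines() if " DEBUG " in line]
```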
I've built systems before that use the following:
In the systems I've built admins were under instruction to react to ERRORs. On the other hand we would watch for WARNINGS and determine for each case whether any system changes, reconfigurations etc. were required.
I made this editable diagram, take a look at:
Github: https://github.com/kenllyacosta/logdiagramflow
Codepen: https://codepen.io/kenllyacosta/pen/rNRvVxY
I suggest using only three levels