What is a good, thread-safe way to pass error strings back from a C shared library?

Posted 2024-10-09 00:48:53

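I'm writing a C shared library for internal use (I'll be dlopen()'ing it into a C++ application, if that matters). The shared library loads (amongst other things) some Java code through a JNI module, which means all manner of nightmare error modes can come out of the JVM that I need to handle intelligently in the application. Additionally, this library needs to be re-entrant. Is there an idiom for passing error strings back in this case, or am I stuck mapping errors to integers and using printfs to debug things?

Thanks!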

孤星 2024-10-16 00:48:53

My approach to the problem would be a little different from everyone else's. They're not wrong, it's just that I've had to wrestle with a different aspect of this problem.

- A C API needs to provide numeric error codes, so that the code using the API can take sensible measures to recover from errors when appropriate, and pass them along when not. The errno.h codes demonstrate a good categorization of errors; in fact, if you can reuse those codes (or just pass them along, e.g. if all your errors come ultimately from system calls), do so.
  - Do not copy errno itself. If possible, return error codes directly from functions that can fail. If that is not possible, have a GetLastError() method on your state object. You have a state object, yes?
- If you have to invent your own codes (the errno.h codes don't cut it), provide a function analogous to strerror that converts these codes to human-readable strings.
  - It may or may not be appropriate to translate these strings. If they're meant to be read only by developers, don't bother. But if you need to show them to the end user, then yes, you need to translate them.
  - The untranslated versions of these strings should indeed be just string constants, so you have no allocation headaches. However, do not waste time and effort coding your own translation infrastructure. Use GNU gettext.
- If your code is layered on top of another piece of code, it is vital that you provide direct access to all the error information and relevant context information that that code produces, and that you make it easy for developers coding against your library to wrap up all that information in an error message for the end user.
  - For instance, if your library produces error codes of its own devising as a direct consequence of failing system calls, your state object needs methods that return the errno value observed immediately after the system call that failed, the name of the file involved (if any), and ideally also the name of the system call itself. People get this wrong way too often -- for instance, SQLite, otherwise a well designed API, does not expose the errno value or the name of the file, which makes it infuriatingly hard to distinguish "the file permissions on the database are wrong" from "you have a bug in your code".
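
As a rough illustration of the last point, here is a minimal sketch of a state object that records the context of a failed system call. Everything named mylib_* is hypothetical and invented for this example; the fixed 256-byte path buffer is likewise just an assumption that keeps the sketch free of allocation.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical library error codes; the names are illustrative only. */
enum mylib_status { MYLIB_OK = 0, MYLIB_E_OPEN_DB = 1 };

/* Per-handle state: each caller owns its own handle, so nothing is global. */
struct mylib_state {
    int               fd;          /* open database file, or -1 */
    enum mylib_status last_status; /* the library's own code */
    int               sys_errno;   /* errno saved right after the failing call */
    const char       *sys_call;    /* name of that call (a string constant) */
    char              path[256];   /* file involved, copied and possibly truncated */
};

void mylib_state_init(struct mylib_state *st)
{
    *st = (struct mylib_state){ .fd = -1 };
}

/* Opens the file; on failure, records the full context in the state object. */
enum mylib_status mylib_open_db(struct mylib_state *st, const char *path)
{
    assert(st != NULL && path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        st->sys_errno = errno;     /* capture before anything can clobber it */
        st->sys_call  = "open";
        snprintf(st->path, sizeof st->path, "%s", path);
        st->last_status = MYLIB_E_OPEN_DB;
        return st->last_status;
    }
    st->fd = fd;
    st->last_status = MYLIB_OK;
    return MYLIB_OK;
}

/* Read-only accessors; these cannot fail. */
int         mylib_last_errno(const struct mylib_state *st)   { return st->sys_errno; }
const char *mylib_last_syscall(const struct mylib_state *st) { return st->sys_call; }
const char *mylib_last_path(const struct mylib_state *st)    { return st->path; }
```

A caller that gets MYLIB_E_OPEN_DB back can then tell EACCES from ENOENT and name the offending file in its own message, instead of guessing.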

EDIT: Addendum: common mistakes in this area include:

- Contorting your API (e.g. with use of out-parameters) so that functions that would naturally return some other value can return an error code.
- Not exposing enough detail for callers to be able to produce an error message that allows a knowledgeable human to fix the problem. (This knowledgeable human may not be the end user. It may be that your error messages wind up in server log files or crash reports for developers' eyes only.)
- Exposing too many different fine distinctions among errors. If your callers will never plausibly do different things in response to two different error codes, they should be the same code.
- Providing more than one success code. This is asking for subtle bugs.

Also, think very carefully about which APIs ought to be allowed to fail. Here are some things that should never fail:

- Read-only data accessors, especially those that return scalar quantities, most especially those that return Booleans.
- Destructors, in the most general sense. (This is a classic mistake in the UNIX kernel API: close and munmap should not be able to fail. Thankfully, at least _exit can't.)
- There is a strong case that you should immediately call abort if malloc fails rather than trying to propagate it to your caller. (This is not true in C++ thanks to exceptions and RAII -- if you are so lucky as to be working on a C++ project that uses both of those properly.)
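
The abort-on-allocation-failure policy from the last item is commonly packaged as a small wrapper. The sketch below assumes that policy is acceptable for your library; xmalloc and copy_message are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper: returns usable memory or does not return at all. */
void *xmalloc(size_t size)
{
    void *p = malloc(size ? size : 1); /* malloc(0) may legally return NULL */
    if (p == NULL) {
        fputs("fatal: out of memory\n", stderr);
        abort();
    }
    return p;
}

/* Callers never need a NULL check, so this function has no failure path. */
char *copy_message(const char *msg)
{
    assert(msg != NULL);
    size_t n = strlen(msg) + 1;
    char *copy = xmalloc(n);
    memcpy(copy, msg, n);
    return copy;
}
```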

In closing: for an example of how to do just about everything wrong, look no further than XPCOM.

心病无药医 2024-10-16 00:48:53

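You return pointers to static const char [] objects. This is always the correct way to handle error strings. If you need them localized, you return pointers to read-only memory-mapped localization strings.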
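
A minimal sketch of what this can look like, assuming the library defines its own small set of codes. The enum, the message texts, and mylib_strerror are all made up for illustration.

```c
#include <assert.h>

/* Hypothetical error codes for the library. */
enum mylib_error {
    MYLIB_OK = 0,
    MYLIB_E_JVM_START,
    MYLIB_E_CLASS_NOT_FOUND,
    MYLIB_E_COUNT            /* number of codes; keep last */
};

static const char msg_ok[]              = "success";
static const char msg_jvm_start[]       = "could not start the JVM";
static const char msg_class_not_found[] = "Java class not found";
static const char msg_unknown[]         = "unknown error";

static const char *const messages[] = {
    [MYLIB_OK]                = msg_ok,
    [MYLIB_E_JVM_START]       = msg_jvm_start,
    [MYLIB_E_CLASS_NOT_FOUND] = msg_class_not_found,
};

/* Compile-time check that every code has a message (C11). */
static_assert(sizeof messages / sizeof messages[0] == MYLIB_E_COUNT,
              "one message per error code");

/* Returns a pointer to read-only storage: nothing to free, nothing to lock. */
const char *mylib_strerror(int code)
{
    if (code < 0 || code >= MYLIB_E_COUNT)
        return msg_unknown;
    return messages[code];
}
```

Because the strings are never written after program start, any number of threads can call mylib_strerror concurrently.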

习惯成性 2024-10-16 00:48:53

In C, if you don't have internationalization (I18N) or localization (L10N) to worry about, then pointers to constant data are a good way to supply error message strings. However, you often find that the error messages need some supporting information (such as the name of the file that could not be opened), which cannot really be handled by constant data.

With I18N/L10N to worry about, I'd recommend storing the fixed message strings for each language in an appropriately formatted file, and then using mmap() to 'read' the file into memory before you fork any threads. The area so mapped should then be treated as read-only (use PROT_READ in the call to mmap()).

This avoids complicated issues of memory management and avoids memory leaks.
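
A minimal sketch of the mapping step on a POSIX system. The msg_catalog type and function names are hypothetical, and the file format is left open: the sketch only maps the bytes read-only, and parsing them into individual messages is up to you.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical catalog handle: a read-only view of a message file. */
struct msg_catalog {
    const char *data;  /* mapped bytes; not necessarily NUL-terminated */
    size_t      size;
};

/* Maps the file read-only. Returns 0, or a negated errno value on failure. */
int msg_catalog_open(struct msg_catalog *cat, const char *path)
{
    assert(cat != NULL && path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -errno;

    struct stat sb;
    if (fstat(fd, &sb) < 0) {
        int saved = errno;
        close(fd);
        return -saved;
    }
    if (sb.st_size == 0) {          /* mmap rejects zero-length mappings */
        close(fd);
        return -EINVAL;
    }

    void *p = mmap(NULL, (size_t)sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    int saved = errno;
    close(fd);                      /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -saved;

    cat->data = p;
    cat->size = (size_t)sb.st_size;
    return 0;
}

void msg_catalog_close(struct msg_catalog *cat)
{
    munmap((void *)cat->data, cat->size);
    cat->data = NULL;
    cat->size = 0;
}
```

Call msg_catalog_open once during library initialization, before any threads exist; afterwards every thread can read cat->data without locking because the pages are never written.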

Consider whether to provide a function that can be called to get the latest error. It can have a prototype such as:

int get_error(int errnum, char *buffer, size_t buflen);

I'm assuming that the error number is returned by some other function call; the library function then consults any threadsafe memory it has about the current thread and the last error condition returned to that thread, and formats an appropriate error message (possibly truncated) into the given buffer.
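
One way this could be put together is sketched below, under several assumptions that are not in the answer: the per-thread memory is a C11 _Thread_local struct, get_error returns 0 on success and -1 when the message was truncated, and mylib_open and the MYLIB_* codes are hypothetical names. The raw errno number is printed instead of calling strerror, which is not guaranteed to be thread-safe.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical error codes. */
enum { MYLIB_OK = 0, MYLIB_E_OPEN = 1 };

/* Context of the most recent failure, one copy per thread. */
static _Thread_local struct {
    int  errnum;
    int  sys_errno;
    char path[256];
} last_error;

/* Hypothetical library call: returns an error number and records the details. */
int mylib_open(const char *path)
{
    assert(path != NULL);
    FILE *f = fopen(path, "r");
    if (f == NULL) {
        last_error.sys_errno = errno;
        last_error.errnum = MYLIB_E_OPEN;
        snprintf(last_error.path, sizeof last_error.path, "%s", path);
        return MYLIB_E_OPEN;
    }
    fclose(f);
    memset(&last_error, 0, sizeof last_error);  /* clear stale context */
    return MYLIB_OK;
}

/* Formats the message for errnum into the caller's buffer.
 * Returns 0 on success, -1 if the message had to be truncated. */
int get_error(int errnum, char *buffer, size_t buflen)
{
    assert(buffer != NULL && buflen > 0);
    int n;
    if (errnum == MYLIB_E_OPEN && last_error.errnum == errnum)
        n = snprintf(buffer, buflen, "cannot open '%s' (errno %d)",
                     last_error.path, last_error.sys_errno);
    else if (errnum == MYLIB_OK)
        n = snprintf(buffer, buflen, "no error");
    else
        n = snprintf(buffer, buflen, "unknown error %d", errnum);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

Since the caller supplies the buffer, the library never hands out memory that someone has to remember to free.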

With C++, you can return (a reference to) a standard string (std::string) from the error reporting mechanism; this means you can format the string to include the file name or other dynamic attributes. The code that collects the information will be responsible for releasing the string, which isn't (shouldn't be) a problem because of the destructors that C++ has. You might still want to use mmap() to load the format strings for the messages.

You do need to be careful about the files you load and, in particular, any strings used as format strings. (Also, if you are dealing with I18N/L10N, you need to worry about whether to use the %n$ notation to allow for argument reordering; and you have to worry about different rules for different cultures/languages about the order in which the words of a sentence are presented.)

但可醉心 2024-10-16 00:48:53

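I guess you could use PWideChars, as Windows does. That is thread-safe. What you need is for the calling application to create a PWideChar that the DLL will use to set the error. The calling application then needs to read that PWideChar and free its memory.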

拒绝两难 2024-10-16 00:48:53

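R. has a good answer (use static const char []), but if you are going to support various spoken languages, I like to use an enum to define the error codes. That is better than a bunch of #defines that each map a name to an int value.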

对你的占有欲 2024-10-16 00:48:53

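- Return integers; don't set some global variable (like errno, even if the implementation may put it in thread-local storage). Follow the Linux kernel's return -ENOENT; style.
- Have a function similar to strerror that takes such an integer and returns a pointer to a const string. This function can also do I18N transparently if needed, because strings returned by gettext remain constant for the lifetime of the translation database.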
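
A minimal sketch of that style, assuming the errno.h codes are enough for the library. mylib_read_config and mylib_strerror are hypothetical names; a real version might pass each string through gettext before returning it.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical library call: returns 0 on success or a negated errno.h code. */
int mylib_read_config(const char *path)
{
    assert(path != NULL);
    if (path[0] == '\0')
        return -EINVAL;
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -errno;              /* e.g. -ENOENT or -EACCES */
    fclose(f);
    return 0;
}

/* strerror-like lookup: takes the return value, hands back a constant string. */
const char *mylib_strerror(int rc)
{
    switch (rc) {
    case 0:       return "success";
    case -EINVAL: return "invalid argument";
    case -ENOENT: return "no such file or directory";
    case -EACCES: return "permission denied";
    default:      return "unknown error";
    }
}
```

The caller writes if ((rc = mylib_read_config(path)) < 0) and either handles a specific code or passes rc up unchanged; no shared state is touched, so the pattern is re-entrant by construction.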

一页 2024-10-16 00:48:53

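If you need to provide non-static error messages, then I recommend returning strings like this: error_code_t function(, char** err_msg). Then provide a function to free the error message: void free_error_message(char* err_msg). This way you hide how the error strings are allocated and freed. This is of course only worth implementing if your error strings are dynamic by nature, meaning that they convey more than just a translation of error codes.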
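
A minimal sketch of this pattern follows. The error_code_t values, load_plugin, and the format_message helper are assumptions made up for the example; only the err_msg parameter and free_error_message come from the answer.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

typedef enum { ERR_OK = 0, ERR_OPEN_FAILED } error_code_t;

/* Builds a heap-allocated printf-style message; returns NULL if that fails. */
static char *format_message(const char *fmt, ...)
{
    va_list ap, ap2;
    va_start(ap, fmt);
    va_copy(ap2, ap);
    int n = vsnprintf(NULL, 0, fmt, ap);        /* measure first */
    va_end(ap);
    char *msg = (n < 0) ? NULL : malloc((size_t)n + 1);
    if (msg != NULL)
        vsnprintf(msg, (size_t)n + 1, fmt, ap2); /* then format */
    va_end(ap2);
    return msg;
}

/* Hypothetical API call: on failure, *err_msg receives a message that the
 * caller must release with free_error_message(). It may be NULL if the
 * message itself could not be allocated; the error code is still valid. */
error_code_t load_plugin(const char *path, char **err_msg)
{
    assert(path != NULL && err_msg != NULL);
    *err_msg = NULL;
    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        *err_msg = format_message("cannot open plugin '%s'", path);
        return ERR_OPEN_FAILED;
    }
    fclose(f);
    return ERR_OK;
}

/* The only supported way to release a message; passing NULL is allowed. */
void free_error_message(char *err_msg)
{
    free(err_msg);
}
```

Routing the release through free_error_message means the message is always freed by the same allocator that created it, which matters when the library and the application are linked against different C runtimes.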

Please excuse my formatting. I'm writing this on a phone...
