String literals not allowed as non-type template parameters

Posted on 2024-10-29 12:09:03

The following quote is from C++ Templates by Addison Wesley. Could someone please help me understand in plain English/layman's terms its gist?

Because string literals are objects with internal linkage (two string literals with the same value but in different modules are different objects), you can't use them as template arguments either:

Comments (6)

少年亿悲伤 2024-11-05 12:09:03

Your compiler ultimately operates on things called translation units, informally called source files. Within these translation units, you identify different entities: objects, functions, etc. The linker's job is to connect these units together, and part of that process is merging identities.

Identifiers have linkage†: internal linkage means that the entity named in that translation unit is visible only to that translation unit, while external linkage means that the entity is visible to other units as well.

When an entity is marked static, it is given internal linkage. So given these two translation units:

// a.cpp
static void foo() { /* in a */ } 

// b.cpp
static void foo() { /* in b */ }

Each foo refers to an entity (a function in this case) that is visible only to its own translation unit; that is, each translation unit has its own foo.

Here's the catch, then: a string literal behaves like an unnamed static const char[N] array. That is:

// str.cpp
#include <iostream>

// this code:

void bar()
{
    std::cout << "abc" << std::endl;
}

// is conceptually equivalent to:

static const char __literal0[4] = {'a', 'b', 'c', 0};

void bar()
{
    std::cout << __literal0 << std::endl;
}

And as you can see, the literal's value is internal to that translation unit. So if you use "abc" in multiple translation units, for example, they all end up being different entities‡.

Overall, that means this is conceptually meaningless:

template <const char* String>
struct baz {};

typedef baz<"abc"> incoherent;

This is because "abc" is a different object in each translation unit. Each translation unit would be given a different class, because each "abc" is a different entity, even though they all supplied the "same" argument.
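To make the "different entity" point concrete inside a single file, here is a minimal sketch, assuming C++11 or later (which accepts named arrays with internal linkage as template arguments). The names first_abc, second_abc and same_specialization are made up for this illustration. Two arrays with identical contents still select two different specializations, because the template is keyed on the object's address, not on its characters:

```cpp
#include <type_traits>

// Same shape as the baz template discussed above.
template <const char* String>
struct baz {};

// Two distinct named arrays that happen to hold the same characters.
// (Hypothetical names, used only for this illustration.)
constexpr char first_abc[]  = "abc";
constexpr char second_abc[] = "abc";

// True only if both arrays select the same specialization of baz.
constexpr bool same_specialization() {
    return std::is_same<baz<first_abc>, baz<second_abc>>::value;
}

// Identity, not contents, decides which specialization is used.
static_assert(!same_specialization(),
              "equal contents do not give the same type");
```

If literals were allowed as arguments, every translation unit would be in the position of second_abc here: same text, different object, different type.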

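On the language level, this is imposed by requiring that a pointer used as a non-type template argument point to an entity with external linkage; that is, something that does refer to the same entity across translation units. (C++11 later relaxed this to any named object with linkage, internal or external; a string literal still does not qualify.)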

So this is fine:

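// good.hpp
extern const char my_string[]; // an array: its address is a constant the template can use

// good.cpp
#include "good.hpp"
const char my_string[] = "any string";

// anything.cpp
#include "good.hpp"
typedef baz<my_string> coherent; // okay; all instantiations use the same entity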

†Not all identifiers have linkage; some have none, such as function parameters.

‡ An optimizing compiler may store identical literals at the same address to save space, but that is a quality-of-implementation detail, not a guarantee.

何必那么矫情 2024-11-05 12:09:03

It means you can't do this...

#include <iostream>

template <const char* P>
void f() { std::cout << P << '\n'; }

int main()
{
    f<"hello there">();
}

...because "hello there" isn't 100% guaranteed to resolve to a single address that can be used to instantiate the template exactly once (though most good linkers will attempt to fold all usages across linked objects and produce a new object with a single copy of the string).

You can, however, use extern character arrays/pointers:

...
extern const char p[];
const char p[] = "hello";
...
    f<p>();
...
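For completeness, a filled-in version of the elided snippet might look like the sketch below. The names greeting and text are placeholders of mine, and the declaration would normally sit in a shared header rather than next to the definition:

```cpp
// In a real project this declaration would live in a shared header.
extern const char greeting[];

// Exactly one translation unit provides the definition.
const char greeting[] = "hello";

// The template parameter is the address of a named object with linkage,
// so every translation unit that names greeting gets the same specialization.
template <const char* P>
constexpr const char* text() { return P; }

// The pointer handed back by the template is the array's own address.
static_assert(text<greeting>() == greeting,
              "the specialization is keyed on this one object");
```

Calling text<greeting>() from any file that sees the declaration refers to the same array, which is exactly what a bare literal cannot promise.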

且行且努力 2024-11-05 12:09:03

Obviously, string literals like "foobar" are not like other literal built-in types (like int or float). They need to have an address (const char*). The address is really the constant value that the compiler substitutes in place of where the literal appears. That address points to somewhere, fixed at compile-time, in the program's memory.

Because of that, it has to have internal linkage. Internal linkage just means that it cannot be linked across translation units (compiled cpp files). The compiler could try to do this, but is not required to. In other words, internal linkage means that if you took the address of two identical string literals (i.e. the value of the const char* they translate to) in different cpp files, they would not, in general, be the same.

You can't use them as template arguments because they would require a strcmp() to check that they are the same. If you used ==, you would just be comparing the addresses, which wouldn't be the same when templates are instantiated with the same string literal in different translation units.
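The difference between the two kinds of comparison can be sketched as follows. This is only an illustration, assuming C++11 or later; same_text, text_a and text_b are names invented here, and same_text stands in for what strcmp() would do at run time:

```cpp
// Character-by-character comparison, the compile-time analogue of strcmp() == 0.
// Written recursively so that it is a valid constexpr function even in C++11.
constexpr bool same_text(const char* a, const char* b) {
    return *a == *b && (*a == '\0' || same_text(a + 1, b + 1));
}

// Two separate objects with identical contents (hypothetical names).
constexpr char text_a[] = "foobar";
constexpr char text_b[] = "foobar";

// Comparing the contents says "equal" ...
static_assert(same_text(text_a, text_b), "the characters match");

// ... while comparing with == only looks at the addresses, which differ.
static_assert(&text_a[0] != &text_b[0], "distinct objects, distinct addresses");
```

Template arguments of pointer type are matched the second way, by address.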

Literals of other, simpler built-in types likewise have no identifier and can't be linked together across translation units. However, comparing them is trivial, because it is done by value. So they can be used as template arguments.

悲念泪 2024-11-05 12:09:03

As mentioned in other answers, a string literal cannot be used as a template argument.
There is, however, a workaround which has a similar effect, but the "string" is limited to four characters. It relies on multi-character constants which, as discussed in the link, are probably rather unportable, but worked for my debug purposes.

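#include <stdint.h>  // int32_t, uint32_t
#include <sstream>
#include <string>

template<int32_t nFourCharName>
class NamedClass
{
public:  // GetName must be public to be callable from outside the class
    std::string GetName(void) const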
    {
        // Evil code to extract the four-character name:
        const char cNamePart1 = static_cast<char>(static_cast<uint32_t>(nFourCharName >> 8*3) & 0xFF);
        const char cNamePart2 = static_cast<char>(static_cast<uint32_t>(nFourCharName >> 8*2) & 0xFF);
        const char cNamePart3 = static_cast<char>(static_cast<uint32_t>(nFourCharName >> 8*1) & 0xFF);
        const char cNamePart4 = static_cast<char>(static_cast<uint32_t>(nFourCharName       ) & 0xFF);

        std::ostringstream ossName;
        ossName << cNamePart1 << cNamePart2 << cNamePart3 << cNamePart4;
        return ossName.str();
    }
};

Can be used with:

NamedClass<'Greg'> greg;
NamedClass<'Fred'> fred;
std::cout << greg.GetName() << std::endl;  // "Greg"
std::cout << fred.GetName() << std::endl;  // "Fred"
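If the implementation-defined value of 'Greg' is a concern, one possible variation (my own sketch, not part of the answer above) is to build the integer with a constexpr helper, so the byte order is spelled out in code. The names four_cc, four_cc_char and Tagged are hypothetical, and the sketch uses an unsigned 32-bit parameter instead of int32_t:

```cpp
#include <cstdint>

// Packs four characters into one 32-bit value, first character in the top byte.
constexpr std::uint32_t four_cc(char a, char b, char c, char d) {
    return (static_cast<std::uint32_t>(static_cast<unsigned char>(a)) << 24) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(b)) << 16) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(c)) << 8)  |
            static_cast<std::uint32_t>(static_cast<unsigned char>(d));
}

// Extracts character number index (0 = first) from a packed value.
constexpr char four_cc_char(std::uint32_t packed, int index) {
    return static_cast<char>((packed >> (8 * (3 - index))) & 0xFFu);
}

// A template keyed on the packed name, analogous to NamedClass.
template <std::uint32_t Name>
struct Tagged {
    static constexpr std::uint32_t code = Name;
};

// Different names give different types; the characters can be recovered.
using Greg = Tagged<four_cc('G', 'r', 'e', 'g')>;
using Fred = Tagged<four_cc('F', 'r', 'e', 'd')>;
static_assert(four_cc_char(Greg::code, 0) == 'G', "first character round-trips");
```

The four-character limit stays, but the result no longer depends on how the compiler encodes multi-character constants.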

As I said, this is a workaround. I don't pretend this is good, clean, portable code, but others may find it useful.
Another workaround could involve multiple char template arguments, as in this answer.
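The linked answer is not reproduced here, so the following is only a guess at the general idea: pass each character as its own char template argument and rebuild the string inside the class. CharName and its members are names I made up for the sketch, which assumes C++11 or later:

```cpp
#include <cstddef>

// Each character is a separate non-type template argument.
template <char... Chars>
struct CharName {
    // Number of characters, not counting the terminator.
    static constexpr std::size_t size = sizeof...(Chars);

    // The characters as a null-terminated array, usable like a C string.
    static constexpr char value[sizeof...(Chars) + 1] = {Chars..., '\0'};
};

// Out-of-class definition, needed before C++17 if value is used at run time
// (for example when printing it); redundant but still accepted in C++17.
template <char... Chars>
constexpr char CharName<Chars...>::value[sizeof...(Chars) + 1];

using GregName = CharName<'G', 'r', 'e', 'g'>;
static_assert(GregName::size == 4, "four characters were supplied");
```

This removes the four-character limit, at the cost of writing 'G', 'r', 'e', 'g' instead of a quoted string.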

生活了然无味 2024-11-05 12:09:03

The idea behind the C++ standard allowing only certain kinds of template arguments is that an argument should be constant and known at compile time, so that the "specialized class" code can be generated.

For this specific case:
When you create string literals, their addresses are unknown until link time (linking happens after compilation), because two string literals in different translation units are two different objects (as explained brilliantly by the accepted answer). When compilation happens, we don't know which string literal's address to use to generate the specialized class code from the class template.

别低头,皇冠会掉 2024-11-05 12:09:03

In C++17 and later (so also in C++20) you can do better using an inline variable with external linkage:

Demo

#include <cstdio>

inline const char myliteral[] = "Hello";

template <const char* Literal>
struct mystruct
{
    auto print() {
        std::printf("%s", Literal);  // never pass the text itself as the format string
    }
};

int main(){
    mystruct<myliteral> obj;
    
    obj.print();
}

Essentially, this leaves it to the linker to figure out which definition is eventually used.
