Ordered static initialization of classes with thread safety

Posted on 2024-12-26 07:52:19

This post may seem overly long for just the short question at the end of it. But I also need to describe a design pattern I just came up with. Maybe it's commonly used, but I've never seen it (or maybe it just doesn't work :).

First, here's some code that (to my understanding) has undefined behavior due to the "static initialization order fiasco". The problem is that the initialization of Spanish::s_englishToSpanish depends on English::s_numberToStr. Both are statically initialized and live in different files, so the relative order of those initializations is not specified:

File: English.h

#pragma once

#include <vector>
#include <string>

using namespace std;

struct English {
    static vector<string>* s_numberToStr;
    string m_str;

    explicit English(int number)
    {
        m_str = (*s_numberToStr)[number];
    }
};

File: English.cpp

#include "English.h"

vector<string>* English::s_numberToStr = new vector<string>( /*split*/
[]() -> vector<string>
{
    vector<string> numberToStr;
    numberToStr.push_back("zero");
    numberToStr.push_back("one");
    numberToStr.push_back("two");
    return numberToStr;
}());

File: Spanish.h

#pragma once

#include <map>
#include <string>

#include "English.h"

using namespace std;

typedef map<string, string> MapType;

struct Spanish {
    static MapType* s_englishToSpanish;
    string m_str;

    explicit Spanish(const English& english)
    {
        m_str = (*s_englishToSpanish)[english.m_str];
    }
};

File: Spanish.cpp

#include "Spanish.h"

MapType* Spanish::s_englishToSpanish = new MapType( /*split*/
[]() -> MapType
{
    MapType englishToSpanish;
    englishToSpanish[ English(0).m_str ] = "cero";
    englishToSpanish[ English(1).m_str ] = "uno";
    englishToSpanish[ English(2).m_str ] = "dos";
    return englishToSpanish;
}());

File: StaticFiasco.cpp

#include <stdio.h>
#include <tchar.h>
#include <conio.h>

#include "Spanish.h"

int _tmain(int argc, _TCHAR* argv[])
{
    _cprintf( "%s", Spanish(English(1)).m_str.c_str() ); // may print "uno" or crash

    _getch();
    return 0;
}

To solve the static initialization order problem, we use the construct-on-first-use idiom, and make those static initializations function-local like so:

File: English.h

#pragma once

#include <vector>
#include <string>

using namespace std;

struct English {
    string m_str;

    explicit English(int number)
    {
        static vector<string>* numberToStr = new vector<string>( /*split*/
        []() -> vector<string>
        {
            vector<string> numberToStr_;
            numberToStr_.push_back("zero");
            numberToStr_.push_back("one");
            numberToStr_.push_back("two");
            return numberToStr_;
        }());

        m_str = (*numberToStr)[number];
    }
};

File: Spanish.h

#pragma once

#include <map>
#include <string>

#include "English.h"

using namespace std;

struct Spanish {
    string m_str;

    explicit Spanish(const English& english)
    {
        typedef map<string, string> MapT;

        static MapT* englishToSpanish = new MapT( /*split*/
        []() -> MapT
        {
            MapT englishToSpanish_;
            englishToSpanish_[ English(0).m_str ] = "cero";
            englishToSpanish_[ English(1).m_str ] = "uno";
            englishToSpanish_[ English(2).m_str ] = "dos";
            return englishToSpanish_;
        }());

        m_str = (*englishToSpanish)[english.m_str];
    }
};

But now we have another problem. Because of the function-local static data, neither of those classes is thread-safe. To solve this, we add to both classes a static member variable and an initialization function for it. Inside this function we force the initialization of all the function-local static data by calling, once, each function that has function-local static data. Thus we're effectively initializing everything at the start of the program, while still controlling the order of initialization. So now our classes should be thread-safe:

File: English.h

#pragma once

#include <vector>
#include <string>

using namespace std;

struct English {
    static bool s_areStaticsInitialized;
    string m_str;

    explicit English(int number)
    {
        static vector<string>* numberToStr = new vector<string>( /*split*/
        []() -> vector<string>
        {
            vector<string> numberToStr_;
            numberToStr_.push_back("zero");
            numberToStr_.push_back("one");
            numberToStr_.push_back("two");
            return numberToStr_;
        }());

        m_str = (*numberToStr)[number];
    }

    static bool initializeStatics()
    {
        // Call every member function that has local static data in it:
        English english(0); // Could the compiler ignore this line?
        return true;
    }
};
bool English::s_areStaticsInitialized = initializeStatics();

File: Spanish.h

#pragma once

#include <map>
#include <string>

#include "English.h"

using namespace std;

struct Spanish {
    static bool s_areStaticsInitialized;
    string m_str;

    explicit Spanish(const English& english)
    {
        typedef map<string, string> MapT;

        static MapT* englishToSpanish = new MapT( /*split*/
        []() -> MapT
        {
            MapT englishToSpanish_;
            englishToSpanish_[ English(0).m_str ] = "cero";
            englishToSpanish_[ English(1).m_str ] = "uno";
            englishToSpanish_[ English(2).m_str ] = "dos";
            return englishToSpanish_;
        }());

        m_str = (*englishToSpanish)[english.m_str];
    }

    static bool initializeStatics()
    {
        // Call every member function that has local static data in it:
        Spanish spanish( English(0) ); // Could the compiler ignore this line?
        return true;
    }
};

bool Spanish::s_areStaticsInitialized = initializeStatics();

And here's the question: Is it possible that some compiler might optimize away those calls to functions (constructors in this case) which have local static data? So the question is what exactly amounts to "having side-effects", which to my understanding means the compiler isn't allowed to optimize it away. Is having function-local static data enough to make the compiler think the function call can't be ignored?
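
For comparison, here is a minimal sketch of the construct-on-first-use idiom written with accessor functions instead of statics buried in constructors. It assumes a compiler that implements the C++11 rule making the initialization of a function-local static thread-safe, which the compiler used in the question may not do. The function names numberToStr() and englishToSpanish() are made up for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical accessor: the table is built the first time this is called.
// Under C++11 rules, concurrent first calls wait for the initialization.
inline const std::vector<std::string>& numberToStr()
{
    static const std::vector<std::string> table = {"zero", "one", "two"};
    return table;
}

// Depends on numberToStr(). The dependency is always ready because it is
// reached through a function call, not through a namespace-scope object.
inline const std::map<std::string, std::string>& englishToSpanish()
{
    static const std::map<std::string, std::string> table = {
        {numberToStr()[0], "cero"},
        {numberToStr()[1], "uno"},
        {numberToStr()[2], "dos"},
    };
    return table;
}

struct English {
    std::string m_str;

    explicit English(int number)
    {
        assert(number >= 0 &&
               static_cast<std::size_t>(number) < numberToStr().size());
        m_str = numberToStr()[number];
    }
};

struct Spanish {
    std::string m_str;

    explicit Spanish(const English& english)
        : m_str(englishToSpanish().at(english.m_str))
    {
    }
};
```

With this, Spanish(English(1)).m_str is "uno" regardless of which translation unit is initialized first. Unlike the question's code, which deliberately leaks the tables with new, these tables are destroyed at program exit, so they should not be touched from the destructors of other static objects.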


Comments (4)

知足的幸福 2025-01-02 07:52:19

Section 1.9 "Program execution" [intro.execution] of the C++11 standard says that

1 The semantic descriptions in this International Standard define a parameterized nondeterministic abstract machine. ... conforming implementations are required to emulate (only) the observable behavior of the abstract machine as explained below.
...

5 A conforming implementation executing a well-formed program shall produce the same observable behavior as one of the possible executions of the corresponding instance of the abstract machine with the same program and the same input.
...

8 The least requirements on a conforming implementation are:
— Access to volatile objects are evaluated strictly according to the rules of the abstract machine.
— At program termination, all data written into files shall be identical to one of the possible results that execution of the program according to the abstract semantics would have produced.
— The input and output dynamics of interactive devices shall take place in such a fashion that prompting output is actually delivered before a program waits for input. What constitutes an interactive device is implementation-defined.
These collectively are referred to as the observable behavior of the program.
...

12 Accessing an object designated by a volatile glvalue (3.10), modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment.

Also, in 3.7.2 "Automatic storage duration" [basic.stc.auto] it is said that

3 If a variable with automatic storage duration has initialization or a destructor with side effects, it shall not be destroyed before the end of its block, nor shall it be eliminated as an optimization even if it appears to be unused, except that a class object or its copy/move may be eliminated as specified in 12.8.

12.8/31 describes copy elision, which I believe is irrelevant here.

So the question is whether the initialization of your local variables has side effects that prevent it from being optimized away. Since it can initialize a static variable with the address of a dynamically allocated object, I think it produces sufficient side effects (e.g. it modifies an object). You can also add an operation on a volatile object there, thus introducing observable behavior that cannot be eliminated.
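
The volatile suggestion could look roughly like the sketch below. The flag g_englishStaticsTouched is a hypothetical name that does not appear in the question, and the bounds check is added only for illustration. The write to the volatile flag is observable behavior in the sense of 1.9/8. The constructor call itself is kept for the separate reason given above: it modifies a static object that later calls read.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical flag, not part of the original code. Accesses to a volatile
// object count as observable behavior of the program.
static volatile bool g_englishStaticsTouched = false;

struct English {
    std::string m_str;

    explicit English(int number)
    {
        // Function-local static, as in the question: built on first call.
        static std::vector<std::string>* numberToStr =
            new std::vector<std::string>{"zero", "one", "two"};

        assert(number >= 0 && number < 3);
        m_str = (*numberToStr)[number];
    }

    static bool initializeStatics()
    {
        English english(0);             // runs the local static's initializer
        g_englishStaticsTouched = true; // volatile write: must be performed
        return true;
    }
};

// Forces the initialization during static initialization of this file.
static const bool s_englishStaticsInitialized = English::initializeStatics();
```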

巴黎盛开的樱花 2025-01-02 07:52:19

Ok, in a nutshell:

  1. I cannot see why the static members of the class need to be public; they are an implementation detail.

  2. Do not make them private but instead make them members of the compilation unit (where code that implements your classes will be).

  3. Use boost::call_once to perform the static initialisation.

Initialisation on first use is relatively easy to order; it is the destruction that is far harder to perform in order. Note, however, that the function used in call_once must not throw an exception. Therefore, if it might fail, you should leave some kind of failed state and check for that after the call.

(I will assume that in your real example, your load is not something you hard-code in but more likely you load some kind of dynamic table, so you can't just create an in-memory array).
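
A rough sketch of points 2 and 3 follows. It swaps in std::call_once from C++11's <mutex> for boost::call_once, so it assumes a compiler and standard library with C++11 threading support; the names g_numberToStrOnce, g_numberToStr and buildNumberToStr are invented for the sketch. The state lives in an unnamed namespace of the implementation file, so nothing static appears in the class definition.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

namespace {

// File-local state, invisible to users of the class. Both objects are
// constant-initialized, so they are valid before any dynamic initializer runs.
std::once_flag g_numberToStrOnce;
std::vector<std::string>* g_numberToStr = nullptr;

// Builds the table; std::call_once guarantees it runs exactly once.
void buildNumberToStr()
{
    g_numberToStr = new std::vector<std::string>{"zero", "one", "two"};
}

const std::vector<std::string>& numberToStr()
{
    std::call_once(g_numberToStrOnce, buildNumberToStr);
    return *g_numberToStr;
}

} // namespace

struct English {
    std::string m_str;

    explicit English(int number)
    {
        const std::vector<std::string>& table = numberToStr();
        assert(number >= 0 &&
               static_cast<std::size_t>(number) < table.size());
        m_str = table[number];
    }
};
```

One difference from the description above: with std::call_once, an exception thrown by the callable propagates to the caller and the flag stays unset, so a later call tries again.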

爱格式化 2025-01-02 07:52:19

Why don't you just hide English::s_numberToStr behind a public static function and skip the constructor syntax entirely? Use DCLP (the double-checked locking pattern) to ensure thread safety.

I strongly recommend avoiding class static variables whose initialization involves non-trivial side effects. As a general design pattern, they tend to cause more problems than they solve. Whatever performance problems you're concerned about here need justification, because I doubt they are measurable under real-world circumstances.
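
As an illustration only, double-checked locking behind a public static function might be sketched as follows, assuming C++11 <atomic> and <mutex> are available. The names g_table and g_tableMutex and the signature of numberToStr(int) are assumptions made for the sketch, and the unnamed namespace presumes this code sits in a single .cpp file.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

namespace {
// Both objects have constexpr constructors, so they are constant-initialized
// before any dynamic initialization takes place.
std::atomic<const std::vector<std::string>*> g_table(nullptr);
std::mutex g_tableMutex;
} // namespace

struct English {
    // Public static accessor that hides the table.
    static const std::string& numberToStr(int number)
    {
        // First check, without the lock. The acquire load pairs with the
        // release store below, so a non-null pointer refers to a fully
        // built table.
        const std::vector<std::string>* table =
            g_table.load(std::memory_order_acquire);
        if (table == nullptr) {
            std::lock_guard<std::mutex> lock(g_tableMutex);
            // Second check, under the lock: another thread may have built
            // the table while this one was waiting.
            table = g_table.load(std::memory_order_relaxed);
            if (table == nullptr) {
                table = new std::vector<std::string>{"zero", "one", "two"};
                g_table.store(table, std::memory_order_release);
            }
        }
        assert(number >= 0 &&
               static_cast<std::size_t>(number) < table->size());
        return (*table)[number];
    }
};
```

The atomic pointer is what makes the unlocked first check well-defined; the same pattern written with a plain pointer has a data race.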

绝影如岚 2025-01-02 07:52:19

Maybe you need to do extra work to control the init order.
Like:

// Group the statics in one class so that a single object owns them all.
class staticObjects
{
    private:
    vector<string>* s_numberToStr;
    MapType* s_englishToSpanish;
};

static staticObjects* objects = new staticObjects();

and then define some interfaces to retrieve it.
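
One way to read this suggestion is sketched below, with the tables held by value rather than through pointers. The accessor names, the staticObjectsInstance() function and the spanishFor() helper are all hypothetical. The ordering comes from the rule that non-static data members are initialized in declaration order.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

typedef std::map<std::string, std::string> MapType;

class staticObjects {
public:
    // Members are initialized in declaration order, so the vector is
    // complete before the map that reads from it is built.
    staticObjects()
        : m_numberToStr{"zero", "one", "two"},
          m_englishToSpanish{{m_numberToStr[0], "cero"},
                             {m_numberToStr[1], "uno"},
                             {m_numberToStr[2], "dos"}}
    {
    }

    const std::vector<std::string>& numberToStr() const { return m_numberToStr; }
    const MapType& englishToSpanish() const { return m_englishToSpanish; }

private:
    std::vector<std::string> m_numberToStr; // declared first: built first
    MapType m_englishToSpanish;             // declared second: may use the vector
};

// Hypothetical retrieval interface: the single instance is created on first
// use and intentionally never destroyed.
inline const staticObjects& staticObjectsInstance()
{
    static const staticObjects* objects = new staticObjects();
    return *objects;
}

// Hypothetical helper: looks up the Spanish word for a number.
inline const std::string& spanishFor(int number)
{
    const staticObjects& objects = staticObjectsInstance();
    assert(number >= 0 &&
           static_cast<std::size_t>(number) < objects.numberToStr().size());
    return objects.englishToSpanish().at(objects.numberToStr()[number]);
}
```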
