Ordered static initialization of thread-safe classes
This post may seem overly long for the short question at the end of it, but I also need to describe a design pattern I just came up with. Maybe it's commonly used, but I've never seen it (or maybe it just doesn't work :).
First, here's some code that (to my understanding) has undefined behavior due to the "static initialization order fiasco". The problem is that the initialization of Spanish::s_englishToSpanish depends on English::s_numberToStr. Both are statically initialized and defined in different files, so the order of those initializations is unspecified:
File: English.h
#pragma once
#include <vector>
#include <string>
using namespace std;
struct English {
static vector<string>* s_numberToStr;
string m_str;
explicit English(int number)
{
m_str = (*s_numberToStr)[number];
}
};
File: English.cpp
#include "English.h"
vector<string>* English::s_numberToStr = new vector<string>( /*split*/
[]() -> vector<string>
{
vector<string> numberToStr;
numberToStr.push_back("zero");
numberToStr.push_back("one");
numberToStr.push_back("two");
return numberToStr;
}());
File: Spanish.h
#pragma once
#include <map>
#include <string>
#include "English.h"
using namespace std;
typedef map<string, string> MapType;
struct Spanish {
static MapType* s_englishToSpanish;
string m_str;
explicit Spanish(const English& english)
{
m_str = (*s_englishToSpanish)[english.m_str];
}
};
File: Spanish.cpp
#include "Spanish.h"
MapType* Spanish::s_englishToSpanish = new MapType( /*split*/
[]() -> MapType
{
MapType englishToSpanish;
englishToSpanish[ English(0).m_str ] = "cero";
englishToSpanish[ English(1).m_str ] = "uno";
englishToSpanish[ English(2).m_str ] = "dos";
return englishToSpanish;
}());
File: StaticFiasco.cpp
#include <stdio.h>
#include <tchar.h>
#include <conio.h>
#include "Spanish.h"
int _tmain(int argc, _TCHAR* argv[])
{
_cprintf( "%s", Spanish(English(1)).m_str.c_str() ); // may print "uno" or crash
_getch();
return 0;
}
To solve the static initialization order problem, we use the construct-on-first-use idiom, and make those static initializations function-local like so:
File: English.h
#pragma once
#include <vector>
#include <string>
using namespace std;
struct English {
string m_str;
explicit English(int number)
{
static vector<string>* numberToStr = new vector<string>( /*split*/
[]() -> vector<string>
{
vector<string> numberToStr_;
numberToStr_.push_back("zero");
numberToStr_.push_back("one");
numberToStr_.push_back("two");
return numberToStr_;
}());
m_str = (*numberToStr)[number];
}
};
File: Spanish.h
#pragma once
#include <map>
#include <string>
#include "English.h"
using namespace std;
struct Spanish {
string m_str;
explicit Spanish(const English& english)
{
typedef map<string, string> MapT;
static MapT* englishToSpanish = new MapT( /*split*/
[]() -> MapT
{
MapT englishToSpanish_;
englishToSpanish_[ English(0).m_str ] = "cero";
englishToSpanish_[ English(1).m_str ] = "uno";
englishToSpanish_[ English(2).m_str ] = "dos";
return englishToSpanish_;
}());
m_str = (*englishToSpanish)[english.m_str];
}
};
But now we have another problem. Due to the function-local static data, neither of those classes is thread-safe. To solve this, we add to each class a static member variable and an initialization function for it. Inside this function we force the initialization of all the function-local static data by calling, once, each function that has function-local static data. In effect we initialize everything at the start of the program, but still control the order of initialization. So now our classes should be thread-safe:
File: English.h
#pragma once
#include <vector>
#include <string>
using namespace std;
struct English {
static bool s_areStaticsInitialized;
string m_str;
explicit English(int number)
{
static vector<string>* numberToStr = new vector<string>( /*split*/
[]() -> vector<string>
{
vector<string> numberToStr_;
numberToStr_.push_back("zero");
numberToStr_.push_back("one");
numberToStr_.push_back("two");
return numberToStr_;
}());
m_str = (*numberToStr)[number];
}
static bool initializeStatics()
{
// Call every member function that has local static data in it:
English english(0); // Could the compiler ignore this line?
return true;
}
};
bool English::s_areStaticsInitialized = initializeStatics();
File: Spanish.h
#pragma once
#include <map>
#include <string>
#include "English.h"
using namespace std;
struct Spanish {
static bool s_areStaticsInitialized;
string m_str;
explicit Spanish(const English& english)
{
typedef map<string, string> MapT;
static MapT* englishToSpanish = new MapT( /*split*/
[]() -> MapT
{
MapT englishToSpanish_;
englishToSpanish_[ English(0).m_str ] = "cero";
englishToSpanish_[ English(1).m_str ] = "uno";
englishToSpanish_[ English(2).m_str ] = "dos";
return englishToSpanish_;
}());
m_str = (*englishToSpanish)[english.m_str];
}
static bool initializeStatics()
{
// Call every member function that has local static data in it:
Spanish spanish( English(0) ); // Could the compiler ignore this line?
return true;
}
};
bool Spanish::s_areStaticsInitialized = initializeStatics();
And here's the question: Is it possible that some compiler might optimize away those calls to functions (constructors in this case) which have local static data? So the question is what exactly amounts to "having side-effects", which to my understanding means the compiler isn't allowed to optimize it away. Is having function-local static data enough to make the compiler think the function call can't be ignored?
Comments (4)
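Section 1.9 "Program execution" [intro.execution] of the C++11 standard defines the observable behavior that a conforming implementation must preserve, which includes accesses to volatile objects.
Also, 3.7.2 "Automatic storage duration" [basic.stc.auto] says that a variable with automatic storage duration whose initialization or destructor has side effects shall not be eliminated as an optimization even if it appears to be unused, except that a class object or its copy/move may be eliminated as specified in 12.8.
12.8-31 describes copy elision, which I believe is irrelevant here.
So the question is whether the initialization of your local variables has side effects that prevent it from being optimized away. Since the constructor initializes a static variable with the address of a dynamically allocated object, I think it produces sufficient side effects (e.g. it modifies an object). You can also add an operation on a volatile object there, which introduces observable behavior that cannot be eliminated.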
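The volatile suggestion could look roughly like the following. This is only a minimal sketch: numberTable() and g_tableTouched are hypothetical names that do not appear in the question, and the table is moved into a free function purely to keep the example short.

```cpp
#include <string>
#include <vector>

// Hypothetical accessor that owns the function-local static table.
inline const std::vector<std::string>& numberTable()
{
    static const std::vector<std::string>* table =
        new std::vector<std::string>{"zero", "one", "two"};
    return *table;
}

// Hypothetical flag. A write to a volatile object is observable behavior,
// so the compiler has to perform it.
volatile bool g_tableTouched = false;

inline bool initializeStatics()
{
    // The value stored through the volatile object is computed from the table,
    // so the call that initializes the table feeds an observable write.
    g_tableTouched = !numberTable().empty();
    return true;
}
```

Calling initializeStatics() once at startup then both builds the table and leaves g_tableTouched set to true.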
Ok, in a nutshell:
I cannot see why the static members of the class need to be public; they are an implementation detail.
Do not make them private either; instead make them members of the compilation unit (where the code that implements your classes will live).
Use boost::call_once to perform the static initialisation.
With initialisation on first use, the ordering is relatively easy to enforce; it is the destruction that is far harder to perform in order. Note, however, that the function used in call_once must not throw an exception. Therefore, if it might fail, you should leave some kind of failed state and check for that after the call.
(I will assume that in your real example the data is not hard-coded but more likely loaded from some kind of dynamic table, so you can't just create an in-memory array.)
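A rough sketch of that advice follows. It uses std::call_once from the C++11 standard library as a stand-in for boost::call_once, and every name in it (g_initFlag, g_table, g_loadFailed, loadTable, toSpanish) is hypothetical. The hard-coded entries stand in for whatever the real loader would read.

```cpp
#include <map>
#include <memory>
#include <mutex>
#include <string>

namespace {

typedef std::map<std::string, std::string> MapType;

std::once_flag g_initFlag;      // guards the one-time load
MapType* g_table = nullptr;     // lives in this compilation unit only
bool g_loadFailed = false;      // failure state, checked after call_once

// Never lets an exception escape; failure is recorded in g_loadFailed instead.
void loadTable() noexcept
{
    try {
        std::unique_ptr<MapType> table(new MapType);
        (*table)["zero"] = "cero";
        (*table)["one"]  = "uno";
        (*table)["two"]  = "dos";
        g_table = table.release();  // intentionally never destroyed
    } catch (...) {
        g_loadFailed = true;
    }
}

} // namespace

// Returns a pointer to the translation, or nullptr if the word is unknown
// or the table could not be loaded.
const std::string* toSpanish(const std::string& english)
{
    std::call_once(g_initFlag, loadTable);
    if (g_loadFailed) {
        return nullptr;
    }
    MapType::const_iterator it = g_table->find(english);
    return it == g_table->end() ? nullptr : &it->second;
}
```

Because the table is never destroyed, the destruction-order problem mentioned above does not arise, at the cost of relying on the operating system to reclaim the memory at exit.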
Why don't you just hide English::s_numberToStr behind a public static function and skip the constructor syntax entirely? Use the double-checked locking pattern (DCLP) to ensure thread safety.
I strongly recommend avoiding class static variables whose initialization involves non-trivial side effects. As a general design pattern, they tend to cause more problems than they solve. Whatever performance problems you're concerned about here need justification, because I doubt they are measurable under real-world circumstances.
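One way this could look is sketched below, under the assumption that C++11 atomics are available. The accessor names numberToStr() and table() are hypothetical, and the acquire/release pairing is one common way to write double-checked locking correctly, not necessarily what the answerer had in mind.

```cpp
#include <atomic>
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

class English {
public:
    // Public static accessor; the table itself stays hidden.
    static const std::string& numberToStr(std::size_t number)
    {
        return table().at(number);  // throws std::out_of_range on a bad index
    }

private:
    static const std::vector<std::string>& table()
    {
        // Both objects have constexpr constructors, so they are
        // constant-initialized and need no guarded dynamic initialization.
        static std::atomic<const std::vector<std::string>*> s_table(nullptr);
        static std::mutex s_mutex;

        // First check, without taking the lock.
        const std::vector<std::string>* p = s_table.load(std::memory_order_acquire);
        if (p == nullptr) {
            std::lock_guard<std::mutex> lock(s_mutex);
            // Second check, now under the lock.
            p = s_table.load(std::memory_order_relaxed);
            if (p == nullptr) {
                p = new std::vector<std::string>{"zero", "one", "two"};
                s_table.store(p, std::memory_order_release);
            }
        }
        return *p;
    }
};
```

Callers would then write English::numberToStr(1) instead of constructing a temporary English object.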
Maybe you need to do extra work to control the init order.
like,
and then define some interfaces to retrieve it.
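As a purely hypothetical illustration of such extra work (this is not the answerer's code), one option is a single init function that builds the tables in a fixed order, plus small accessor interfaces. The names tables::initAll, tables::english and tables::spanish are invented for this sketch, and it assumes initAll() is called from main() before any other thread starts.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

namespace tables {
namespace detail {
// Plain pointers are zero-initialized before any dynamic initialization runs,
// so they are not affected by the initialization order across files.
std::vector<std::string>* g_numbers = nullptr;
std::map<std::string, std::string>* g_spanish = nullptr;
} // namespace detail

// Builds the tables in an explicit order: English first, then Spanish.
void initAll()
{
    if (detail::g_numbers != nullptr) {
        return;  // already initialized
    }
    detail::g_numbers = new std::vector<std::string>{"zero", "one", "two"};
    detail::g_spanish = new std::map<std::string, std::string>;

    const char* words[] = {"cero", "uno", "dos"};
    for (std::size_t i = 0; i < detail::g_numbers->size(); ++i) {
        (*detail::g_spanish)[(*detail::g_numbers)[i]] = words[i];
    }
}

// Interfaces to retrieve the data; both assume initAll() has already run.
const std::string& english(std::size_t number)
{
    return detail::g_numbers->at(number);
}

const std::string& spanish(const std::string& englishWord)
{
    return detail::g_spanish->at(englishWord);
}
} // namespace tables
```

With this layout the dependency between the two tables is spelled out in one place instead of being left to the linker.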