libc++ vs VC++: Can non-UTF conversions be done with wstring_convert?

Posted on 2024-12-07 05:59:36


C++11's std::wstring_convert works great* for the standard UTF-8 <-> UTF-16/UCS2/UCS4 conversions. However, when I attempted to instantiate a wstring_convert or wbuffer_convert with a facet that does not come from <codecvt>, it didn't work as expected:

// works as expected
std::wstring_convert<std::codecvt_utf8<wchar_t>> ucs4conv;

// Now, by analogy, I want to try this:
std::wstring_convert<std::codecvt<wchar_t, char, std::mbstate_t>> gbconv(
        new std::codecvt_byname<wchar_t, char, std::mbstate_t>("zh_CN.gb18030"));

Clang++ errors out, saying "calling a protected destructor of codecvt<> in ~wstring_convert".

Visual Studio allows it (although it lacks that locale, but that's another story), because its wstring_convert pawns the lifetime management of the facet pointer off to a locale object it holds as a member, and locales know how to delete pointers to all facets.

Is Visual Studio right and libc++ wrong?

* as implemented in clang++-2.9/libc++-svn and Visual Studio 2010 EE SP1, the following example works on both, but not in GCC, sadly: https://ideone.com/hywz6
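
For illustration, the locale-owned lifetime described above can be sketched with the standard std::locale constructor that takes a facet pointer. This is only a sketch of the general idea, not Visual Studio's actual implementation. The helper names locale_owning_facet, decode_with_locale and demo_locale_owned_facet are hypothetical, and the demo uses the "C" locale because zh_CN.gb18030 may not be installed on a given system.

```cpp
#include <cassert>
#include <cwchar>     // std::mbstate_t
#include <locale>     // std::locale, std::codecvt, std::codecvt_byname
#include <stdexcept>  // std::runtime_error
#include <string>

using wide_codecvt = std::codecvt<wchar_t, char, std::mbstate_t>;

// Hypothetical helper: the returned locale takes ownership of the new facet
// (refs == 0) and destroys it when the last locale referring to it goes away.
// No public facet destructor is needed for this.
std::locale locale_owning_facet(const char* name) {
    return std::locale(std::locale::classic(),
                       new std::codecvt_byname<wchar_t, char, std::mbstate_t>(name));
}

// Hypothetical helper: decode bytes with the codecvt facet held by loc.
std::wstring decode_with_locale(const std::string& bytes, const std::locale& loc) {
    const wide_codecvt& cvt = std::use_facet<wide_codecvt>(loc);
    std::mbstate_t state = std::mbstate_t();
    std::wstring out(bytes.size(), L'\0');  // at most one wide char per input byte
    const char* from_next = nullptr;
    wchar_t* to_next = &out[0];
    std::codecvt_base::result r =
        cvt.in(state, bytes.data(), bytes.data() + bytes.size(), from_next,
               &out[0], &out[0] + out.size(), to_next);
    if (r == std::codecvt_base::error || r == std::codecvt_base::partial)
        throw std::runtime_error("conversion failed");
    out.resize(static_cast<std::size_t>(to_next - &out[0]));
    return out;
}

// Small demo using the "C" locale, where ASCII bytes map to the same wide chars.
void demo_locale_owned_facet() {
    std::locale loc = locale_owning_facet("C");
    assert(decode_with_locale("abc", loc) == L"abc");
}
```

Passing "zh_CN.gb18030" instead of "C" should behave the same way wherever that locale is installed; where it is not, the codecvt_byname constructor typically throws std::runtime_error.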


Comments (1)

北方的巷 2024-12-14 05:59:36

I am admittedly biased in this answer. But I will attempt to back up my claims with references to N3290 (which is unfortunately no longer publicly available). And I will also offer a solution.

Analysis:

The synopsis of wstring_convert in [conversions.string]/p2 includes:

private:
  byte_string byte_err_string;  // exposition only
  wide_string wide_err_string;  // exposition only
  Codecvt *cvtptr;              // exposition only
  state_type cvtstate;          // exposition only
  size_t cvtcount;              // exposition only

"Exposition only" means that wstring_convert doesn't have to have these members, in this order, with this spelling. But "exposition only" members are used to describe the effects of the various member functions, and those specifications are binding.

And so the question appears to become:

What is the specification of ~wstring_convert()?

This is found in p17 of the same section ([conversions.string]):

~wstring_convert();

Effects: The destructor shall delete cvtptr.

That implies to me that ~Codecvt() must be accessible, and therefore libc++ is following the C++11 specification.
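
As a rough way to see the accessibility issue without instantiating wstring_convert at all, the sketch below uses std::is_destructible, which reports whether code outside the class hierarchy can destroy an object of the type. That is approximately the access "delete cvtptr" needs. The names deletable_codecvt, publicly_destructible and check_destructor_access are hypothetical and only for illustration.

```cpp
#include <cassert>
#include <cwchar>       // std::mbstate_t
#include <locale>       // std::codecvt
#include <type_traits>  // std::is_destructible

// The C++98/03 facet from the question; the standard declares ~codecvt() protected.
using plain_codecvt = std::codecvt<wchar_t, char, std::mbstate_t>;

// Hypothetical derived class with a public destructor. A derived class may
// call the protected base destructor, so this compiles.
struct deletable_codecvt : plain_codecvt {
    ~deletable_codecvt() {}
};

// True when unrelated code can destroy a T.
template <class T>
constexpr bool publicly_destructible() {
    return std::is_destructible<T>::value;
}

// Checks both cases: the plain facet cannot be destroyed from outside,
// the derived one can.
void check_destructor_access() {
    assert(!publicly_destructible<plain_codecvt>());
    assert(publicly_destructible<deletable_codecvt>());
}
```

An implementation that literally performs "delete cvtptr" therefore cannot accept the plain facet, while a derived type with a public destructor passes the same check.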

I would also agree that this is a royal pain in the butt.

Solution:

Having all of the C++98/03 facets have protected destructors has turned out to be very inconvenient. Here's an adaptor that can take any facet and give it a public destructor:

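#include <utility>  // std::forward

// Derives from any facet and exposes a public destructor.
template <class Facet>
class usable_facet
    : public Facet
{
public:
    // Forward all constructor arguments to the wrapped facet.
    template <class ...Args>
        usable_facet(Args&& ...args)
            : Facet(std::forward<Args>(args)...) {}
    // Public, unlike the protected destructor of the base facet.
    ~usable_facet() {}
};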

You can now use this general purpose adaptor in your code:

// codecvt_byname (not plain codecvt) is the facet whose constructor takes a locale name
typedef usable_facet<std::codecvt_byname<wchar_t, char, std::mbstate_t>> C;
std::wstring_convert<C> gbconv(new C("zh_CN.gb18030"));
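
Putting the pieces together, a minimal end-to-end sketch might look like the following. It assumes the adaptor wraps std::codecvt_byname, since that is the facet with a constructor taking a locale name. The helper names byname_facet, decode and demo_usable_facet are hypothetical. The demo uses the "C" locale because zh_CN.gb18030 may not be installed; with an unavailable locale name the facet constructor typically throws std::runtime_error. Note also that std::wstring_convert is deprecated since C++17, so newer compilers may warn.

```cpp
#include <cassert>
#include <cwchar>   // std::mbstate_t
#include <locale>   // std::wstring_convert, std::codecvt_byname
#include <string>
#include <utility>  // std::forward

// The adaptor from the answer: any facet, plus a public destructor.
template <class Facet>
class usable_facet : public Facet {
public:
    template <class... Args>
    usable_facet(Args&&... args) : Facet(std::forward<Args>(args)...) {}
    ~usable_facet() {}
};

// Hypothetical alias: a locale-named wchar_t <-> char facet that can be deleted.
using byname_facet =
    usable_facet<std::codecvt_byname<wchar_t, char, std::mbstate_t>>;

// Hypothetical helper: decode bytes using the narrow encoding of the named
// locale. wstring_convert takes ownership of the facet and deletes it.
std::wstring decode(const std::string& bytes, const char* locale_name) {
    std::wstring_convert<byname_facet> conv(new byname_facet(locale_name));
    return conv.from_bytes(bytes);
}

// Small demo using the "C" locale, where ASCII bytes map to the same wide chars.
void demo_usable_facet() {
    assert(decode("abc", "C") == L"abc");
}
```

On a system that has the locale installed, decode(bytes, "zh_CN.gb18030") would be the call corresponding to the gbconv object above.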

Hope this helps.
