C++ pimpl idiom wastes an instruction vs. C style?

Posted 2024-09-02 12:54:55

(Yes, I know that one machine instruction usually doesn't matter. I'm asking this question because I want to understand the pimpl idiom, and use it in the best possible way; and because sometimes I do care about one machine instruction.)

In the sample code below, there are two classes, Thing and
OtherThing. Users would include "thing.hh".
Thing uses the pimpl idiom to hide its implementation.
OtherThing uses a C style: non-member functions that return and take
pointers. This style produces slightly better machine code. I'm
wondering: is there a way to use C++ style (i.e., make the functions
into member functions) and still save the machine instruction? I like this style because it doesn't pollute the namespace outside the class.

Note: I'm only looking at calling member functions (in this case, calc). I'm not looking at object allocation.

Below are the files, commands, and the machine code, on my Mac.

thing.hh:

class ThingImpl;
class Thing
{
    ThingImpl *impl;
public:
    Thing();
    int calc();
};

class OtherThing;    
OtherThing *make_other();
int calc(OtherThing *);

thing.cc:

#include "thing.hh"

struct ThingImpl
{
    int x;
};

Thing::Thing()
{
    impl = new ThingImpl;
    impl->x = 5;
}

int Thing::calc()
{
    return impl->x + 1;
}

struct OtherThing
{
    int x;
};

OtherThing *make_other()
{
    OtherThing *t = new OtherThing;
    t->x = 5;
    return t;    // the original omitted this return, which is undefined behavior
}

int calc(OtherThing *t)
{
    return t->x + 1;
}

main.cc (just to test the code actually works...)

#include "thing.hh"
#include <cstdio>

int main()
{
    Thing *t = new Thing;
    printf("calc: %d\n", t->calc());

    OtherThing *t2 = make_other();
    printf("calc: %d\n", calc(t2));
}

Makefile:

all: main

thing.o : thing.cc thing.hh
    g++ -fomit-frame-pointer -O2 -c thing.cc

main.o : main.cc thing.hh
    g++ -fomit-frame-pointer -O2 -c main.cc

main: main.o thing.o
    g++ -O2 -o $@ $^

clean: 
    rm *.o
    rm main

Run make and then look at the machine code. On the Mac I use otool -tv thing.o | c++filt. On Linux I think it's objdump -d thing.o. Here is the relevant output:

Thing::calc():
0000000000000000 movq (%rdi),%rax
0000000000000003 movl (%rax),%eax
0000000000000005 incl %eax
0000000000000007 ret
calc(OtherThing*):
0000000000000010 movl (%rdi),%eax
0000000000000012 incl %eax
0000000000000014 ret

Notice the extra instruction because of the pointer indirection. The first function looks up two fields (impl, then x), while the second only needs to get x. What can be done?

初见你 2024-09-09 12:54:55

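One instruction is rarely worth spending much time worrying about. First, in a more complex use case the compiler may keep the pImpl pointer cached in a register, which amortizes the cost in real-world code. Second, pipelined architectures make it almost impossible to predict the actual cost in clock cycles. You will get a much more realistic idea of the cost if you run these operations in a loop and time the difference.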
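The timing loop suggested above could be sketched roughly as follows. This is only an illustration, not a rigorous benchmark: the types are stand-ins for the ones in the question, collapsed into one file, and `Timing`, `time_calls`, `time_pimpl` and `time_c_style` are hypothetical names. It also assumes GCC or Clang, since it relies on `__attribute__((noinline))` and an empty `asm volatile` to keep the calls from being inlined or hoisted out of the loop.

```cpp
#include <cassert>
#include <chrono>

// Stand-ins for the question's types, kept in one file so the sketch compiles alone.
struct ThingImpl { int x; };

class Thing {
    ThingImpl *impl;
public:
    Thing() : impl(new ThingImpl) { impl->x = 5; }
    ~Thing() { delete impl; }
    Thing(const Thing &) = delete;
    Thing &operator=(const Thing &) = delete;
    // Assumption: GCC/Clang extensions keep the call real on every iteration.
    __attribute__((noinline)) int calc() {
        asm volatile("" ::: "memory");
        return impl->x + 1;   // two loads: impl, then x
    }
};

struct OtherThing { int x; };

__attribute__((noinline)) int calc(OtherThing *t) {
    asm volatile("" ::: "memory");
    return t->x + 1;          // one load: x
}

struct Timing {
    long long sum;        // accumulated results, so the calls stay observable
    double ns_per_call;   // average wall-clock time per call
};

// Calls f() `iterations` times and reports the average time per call.
template <typename F>
Timing time_calls(F f, long iterations) {
    using clock = std::chrono::steady_clock;
    long long sum = 0;
    auto start = clock::now();
    for (long i = 0; i < iterations; ++i) sum += f();
    auto stop = clock::now();
    double ns = std::chrono::duration<double, std::nano>(stop - start).count();
    return Timing{sum, ns / iterations};
}

Timing time_pimpl(long iterations) {
    Thing t;
    Timing r = time_calls([&t] { return t.calc(); }, iterations);
    assert(r.sum == 6LL * iterations);   // sanity check: every call returned 6
    return r;
}

Timing time_c_style(long iterations) {
    OtherThing o{5};
    Timing r = time_calls([&o] { return calc(&o); }, iterations);
    assert(r.sum == 6LL * iterations);
    return r;
}
```

Comparing `time_pimpl(n).ns_per_call` with `time_c_style(n).ns_per_call` for a large `n`, over several runs, gives a rough idea of whether the extra load is measurable on a given machine.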

累赘 2024-09-09 12:54:55

Not too hard, just use the same technique inside your class. Any halfway decent optimizer will inline
the trivial wrapper.

class ThingImpl;
class Thing
{
    ThingImpl *impl;
    static int calc(ThingImpl*);
public:
    Thing();
    int calc() { return calc(impl); }
};
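For completeness, here is a sketch of how the header above might pair with its implementation file, collapsed into one translation unit so it compiles on its own. The destructor, the deleted copy operations and the `demo` function are additions for illustration and are not part of the answer.

```cpp
#include <cassert>

// --- thing.hh (what users include) ---
struct ThingImpl;
class Thing
{
    ThingImpl *impl;
    static int calc(ThingImpl *);        // out-of-line; receives the impl pointer directly
public:
    Thing();
    ~Thing();                            // added for the sketch
    Thing(const Thing &) = delete;       // added: copying would double-delete impl
    Thing &operator=(const Thing &) = delete;
    int calc() { return calc(impl); }    // trivial wrapper, expected to be inlined
};

// --- thing.cc ---
struct ThingImpl
{
    int x;
};

Thing::Thing() : impl(new ThingImpl) { impl->x = 5; }
Thing::~Thing() { delete impl; }

// Same body as the C-style calc(OtherThing *) in the question: a single load of x.
int Thing::calc(ThingImpl *p) { return p->x + 1; }

// Hypothetical usage check.
void demo()
{
    Thing t;
    assert(t.calc() == 6);
}
```

The load of `impl` does not disappear; it moves into the inlined wrapper at the call site, where the compiler may be able to keep the pointer in a register across several calls.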

旧城空念 2024-09-09 12:54:55

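There is the nasty way: replace the pointer to ThingImpl with a big-enough array of unsigned chars, then construct the ThingImpl in it with placement new, reach it through a reinterpret_cast, and destroy it explicitly.

Or you could just pass the Thing around by value, since it should be no larger than the pointer to the ThingImpl, though it may need a little more than that (reference counting the ThingImpl would defeat the optimization, so you need some way of flagging the "owning" Thing, which might require extra space on some architectures).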
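A minimal sketch of the first approach (in-place storage instead of a pointer) might look like this. The buffer size of 16 bytes and alignment of 8 are arbitrary assumptions that must be kept in sync with ThingImpl by hand, and `self` and `demo` are hypothetical names. The price is that the header now leaks a size and alignment budget for the hidden type.

```cpp
#include <cassert>
#include <new>   // placement new, std::launder (C++17)

// --- thing.hh ---
struct ThingImpl;
class Thing
{
    // Assumed budget: large and aligned enough for ThingImpl (checked in thing.cc).
    alignas(8) unsigned char storage[16];
    ThingImpl *self();
public:
    Thing();
    ~Thing();
    Thing(const Thing &) = delete;             // copying raw storage is not handled here
    Thing &operator=(const Thing &) = delete;
    int calc();
};

// --- thing.cc ---
struct ThingImpl
{
    int x;
};

ThingImpl *Thing::self()
{
#if __cplusplus >= 201703L
    return std::launder(reinterpret_cast<ThingImpl *>(storage));
#else
    return reinterpret_cast<ThingImpl *>(storage);
#endif
}

Thing::Thing()
{
    static_assert(sizeof(ThingImpl) <= sizeof(storage), "storage too small");
    static_assert(alignof(ThingImpl) <= alignof(Thing), "storage under-aligned");
    ThingImpl *p = new (storage) ThingImpl;    // construct in place, no heap allocation
    p->x = 5;
}

Thing::~Thing() { self()->~ThingImpl(); }      // explicit destructor call

// x now lives inside the Thing object, so there is no second pointer to follow.
int Thing::calc() { return self()->x + 1; }

// Hypothetical usage check.
void demo()
{
    Thing t;
    assert(t.calc() == 6);
}
```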

风吹雨成花 2024-09-09 12:54:55

I disagree about your usage: you are not comparing the 2 same things.

#include "thing.hh"
#include <cstdio>

int main()
{
    Thing *t = new Thing;                // 1
    printf("calc: %d\n", t->calc());

    OtherThing *t2 = make_other();       // 2
    printf("calc: %d\n", calc(t2));
}

At (1) you in fact have two calls to new: one explicit, and one implicit (done by the constructor of Thing).
At (2) you have a single new, implicit (inside make_other).

You should allocate Thing on the stack. That probably would not change the double-dereference instructions, but it could change their cost (by removing a cache miss).

However the main point is that Thing manages its memory on its own, so you can't forget to delete the actual memory, while you definitely can with the C-style method.

I would argue that automatic memory handling is worth an extra memory instruction, specifically because as it's been said, the dereferenced value will probably be cached if you access it more than once, thus amounting to almost nothing.

Correctness is more important than performance.
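To make the comparison concrete, here is a small sketch in which Thing lives on the stack and releases its impl itself. It assumes a destructor is added to Thing (the question's code does not show one), and `destroy_other` and `demo` are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>

struct ThingImpl { int x; };

class Thing
{
    ThingImpl *impl;
public:
    Thing() : impl(new ThingImpl) { impl->x = 5; }
    ~Thing() { delete impl; }                  // assumption: added so Thing owns its impl
    Thing(const Thing &) = delete;             // avoid double delete on copy
    Thing &operator=(const Thing &) = delete;
    int calc() { return impl->x + 1; }
};

struct OtherThing { int x; };

OtherThing *make_other()
{
    OtherThing *t = new OtherThing;
    t->x = 5;
    return t;
}

// Hypothetical C-style cleanup function; the caller has to remember to call it.
void destroy_other(OtherThing *t) { delete t; }

int calc(OtherThing *t) { return t->x + 1; }

int demo()
{
    Thing t;                       // on the stack: one new (inside the constructor)
    int a = t.calc();

    OtherThing *t2 = make_other(); // one new (inside make_other)
    int b = calc(t2);
    destroy_other(t2);             // forgetting this line leaks; Thing has no such risk

    return a + b;                  // t's impl is freed automatically when t goes out of scope
}
```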
