Returning std::string/std::list from a dll

Posted 2024-09-16 12:36:09

Short question.

I just got a dll I'm supposed to interface with.
The dll uses the CRT from msvcr90D.dll (notice the D), and returns std::string, std::list, and boost::shared_ptr objects. Operator new/delete is not overloaded anywhere.

I assume a CRT mixup (msvcr90.dll in a release build, or one of the components being rebuilt with a newer CRT, etc.) is bound to cause problems eventually, and that the dll should be rewritten to avoid returning anything that could possibly call new/delete (i.e. anything that could call delete in my code on a block of memory that was allocated in the dll, possibly with a different CRT).

Am I right or not?

Answers (4)

深海不蓝 2024-09-23 12:36:09

The main thing to keep in mind is that dlls contain code and not memory. Memory allocated belongs to the process(1). When you instantiate an object in your process, you invoke the constructor code. During that object's lifetime you will invoke other pieces of code(methods) to work on that object's memory. Then when the object is going away the destructor code is invoked.

STL templates are not explicitly exported from the dll. The template code is compiled into each dll separately. So when std::string s is created in a.dll and passed to b.dll, each dll has its own instance of the string::copy method. Calling copy in a.dll invokes a.dll's copy method; if we are working with s in b.dll and call copy there, the copy method in b.dll is invoked.

This is why in Simon's answer he says:

Bad things will happen unless you can
always guarantee that your entire set
of binaries is all built with the same
toolchain.

because if, for some reason, the copy code for string s differs between a.dll and b.dll, weird things will happen. It is even worse if the string class itself differs between a.dll and b.dll, and the destructor in one knows to clean up extra memory that the other ignores: you can get memory leaks that are difficult to track down. Maybe even worse, a.dll might have been built against a completely different STL implementation (e.g. STLport) while b.dll is built using Microsoft's STL implementation.

So what should you do? Where we work, we have strict control over the toolchain and build settings for each dll. So when we develop internal dlls, we freely pass STL templates around. We still have problems that crop up on rare occasions because someone didn't set up their project correctly. However, we find the convenience of the STL worth the occasional problem.

For exposing dlls to 3rd parties, that's another story entirely. Unless you want to strictly require specific build settings from clients, you'll want to avoid exporting STL templates. I don't recommend strictly enforcing your clients to have specific build settings... they may have another 3rd party tool that expects you to use completely opposite build settings.
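One common way to avoid exporting STL types to third parties is to let only plain C types cross the boundary and have the caller own the buffer. The following is a minimal sketch of that idea, not something taken from the dll in the question: the function names `GetGreeting` and `CallGetGreeting` and the two-call "ask for the size, then fetch" protocol are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical dll export (name and signature are assumptions).
// Only plain C types cross the boundary and the caller owns the buffer.
// Returns the number of bytes required, including the terminating '\0'.
extern "C" std::size_t GetGreeting(char* buffer, std::size_t buffer_size)
{
    // Inside the dll, STL types are fine: they never leave this function.
    const std::string greeting = "hello from the dll";
    const std::size_t required = greeting.size() + 1;

    // Copy only when the caller supplied a large enough buffer.
    if (buffer != NULL && buffer_size >= required)
        std::memcpy(buffer, greeting.c_str(), required);

    return required;
}

// Caller side: ask for the size, allocate locally, then fetch the data.
// All memory is allocated and freed by the caller's own CRT.
std::string CallGetGreeting()
{
    const std::size_t required = GetGreeting(NULL, 0);
    assert(required > 0);  // the size always includes the terminator

    std::string result(required, '\0');
    GetGreeting(&result[0], result.size());
    result.resize(required - 1);  // drop the terminating '\0'
    return result;
}
```

Because the dll never hands out memory it allocated, it does not matter which CRT or STL build settings the client uses.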

(1) Yes I know static and locals are instantiated/deleted on dll load/unload.

安静 2024-09-23 12:36:09

I have this exact problem in a project I'm working on: STL classes are transmitted to and from DLLs a lot. The problem isn't just the different memory heaps; it's that the STL classes have no standard binary interface (ABI). For example, in debug builds, some STL implementations add extra debugging information to the STL classes, such that sizeof(std::vector<T>) (release build) != sizeof(std::vector<T>) (debug build). Ouch! There's no hope you can rely on binary compatibility of these classes. Besides, if your DLL was compiled with a different compiler and some other STL implementation that used other algorithms, you might have a different binary format in release builds, too.
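As a rough illustration of the size mismatch, a dll can report the object size it was compiled with so the caller can compare it with its own. This is only a sketch: `DllSizeofVectorInt` and `VectorLayoutMayMatch` are hypothetical names, and equal sizes do not prove that the layouts match; the check can only detect some mismatches. With Microsoft's compiler, the size typically depends on iterator-debugging settings such as `_HAS_ITERATOR_DEBUGGING` (older versions) or `_ITERATOR_DEBUG_LEVEL` (newer versions).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical dll export: reports the size of std::vector<int> as seen
// by the module this function was compiled into.
extern "C" std::size_t DllSizeofVectorInt()
{
    return sizeof(std::vector<int>);
}

// Caller side: compare the dll's view with this module's view.
// A false result means the two modules definitely disagree on layout;
// a true result only means no mismatch was detected by this one check.
bool VectorLayoutMayMatch()
{
    return DllSizeofVectorInt() == sizeof(std::vector<int>);
}
```

In a single binary both sides are compiled with the same settings, so the check trivially passes; it becomes informative only when the two functions live in separately built modules.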

The way I've solved this problem is by using a template class called pod<T> (POD stands for Plain Old Data, like chars and ints, which usually transfer fine between DLLs). The job of this class is to package its template parameter into a consistent binary format, and then unpackage it at the other end. For example, instead of a function in a DLL returning a std::vector<int>, you return a pod<std::vector<int>>. There's a template specialization for pod<std::vector<T>>, which mallocs a memory buffer and copies the elements. It also provides operator std::vector<T>(), so that the return value can transparently be stored back into a std::vector, by constructing a new vector, copying its stored elements into it, and returning it. Because it always uses the same binary format, it can be safely compiled into separate binaries and remain binary compatible. An alternative name for pod could be make_binary_compatible.

Here's the pod class definition:

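#include <algorithm>  // std::copy
#include <cstddef>    // NULL
#include <iterator>   // std::back_inserter
#include <new>        // placement new, std::bad_alloc
#include <vector>

// All members are protected, because the class *must* be specialized
// for each type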
template<typename T>
class pod {
protected:
    pod();
    pod(const T& value);
    pod(const pod& copy);                   // no copy ctor in any pod
    pod& operator=(const pod& assign);
    T get() const;
    operator T() const;
    ~pod();
};

Here's the partial specialization for pod<vector<T>> - note, partial specialization is used so this class works for any type of T. Also note, it actually is storing a memory buffer of pod<T> rather than just T - if the vector contained another STL type like std::string, we'd want that to be binary compatible too!

// Transmit vector as POD buffer
template<typename T>
class pod<std::vector<T> > {
protected:
    pod(const pod<std::vector<T> >& copy);  // no copy ctor

    // For storing vector as plain old data buffer
    typename std::vector<T>::size_type  size;
    pod<T>*                             elements;

    void release()
    {
        if (elements) {

            // Destruct every element, in case it contains other pod<T>s
            pod<T>* ptr = elements;
            pod<T>* end = elements + size;

            for ( ; ptr != end; ++ptr)
                ptr->~pod<T>();

            // Deallocate memory
            pod_free(elements);
            elements = NULL;
        }
    }

    void set_from(const std::vector<T>& value)
    {
        // Allocate buffer with room for pods of T
        size = value.size();

        if (size > 0) {
            elements = reinterpret_cast<pod<T>*>(pod_malloc(sizeof(pod<T>) * size));

            if (elements == NULL)
                throw std::bad_alloc();   // standard bad_alloc takes no message
        }
        else
            elements = NULL;

        // Placement new pods in to the buffer
        pod<T>* ptr = elements;
        pod<T>* end = elements + size;
        typename std::vector<T>::const_iterator iter = value.begin();

        for ( ; ptr != end; )
            new (ptr++) pod<T>(*iter++);
    }

public:
    pod() : size(0), elements(NULL) {}

    // Construct from vector<T>
    pod(const std::vector<T>& value)
    {
        set_from(value);
    }

    pod<std::vector<T> >& operator=(const std::vector<T>& value)
    {
        release();
        set_from(value);
        return *this;
    }

    std::vector<T> get() const
    {
        std::vector<T> result;
        result.reserve(size);

        // Copy out the pods, using their operator T() to call get()
        std::copy(elements, elements + size, std::back_inserter(result));

        return result;
    }

    operator std::vector<T>() const
    {
        return get();
    }

    ~pod()
    {
        release();
    }
};

Note the memory allocation functions used are pod_malloc and pod_free - these are simply malloc and free, but using the same function between all DLLs. In my case, all DLLs use the malloc and free from the host EXE, so they are all using the same heap, which solves the heap memory issue. (Exactly how you figure this out is down to you.)
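One possible way to make every DLL use the host's malloc and free is for the host to hand its allocation functions to each DLL after loading it. The sketch below assumes exactly that; `pod_set_allocator` and the function-pointer typedefs are hypothetical names, and the answer does not say how its own setup was wired.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

extern "C" {
    // Signatures of the allocation functions supplied by the host EXE.
    typedef void* (*pod_malloc_fn)(std::size_t);
    typedef void  (*pod_free_fn)(void*);
}

// Per-dll copies of the host's function pointers (unset until installed).
static pod_malloc_fn g_pod_malloc = NULL;
static pod_free_fn   g_pod_free   = NULL;

// Hypothetical export: the host calls this once, right after loading the
// dll, passing its own malloc and free.
extern "C" void pod_set_allocator(pod_malloc_fn malloc_fn, pod_free_fn free_fn)
{
    g_pod_malloc = malloc_fn;
    g_pod_free   = free_fn;
}

// Every pod specialization allocates through these two functions, so all
// buffers come from (and return to) the host's heap.
void* pod_malloc(std::size_t bytes)
{
    assert(g_pod_malloc != NULL);  // host must install the allocator first
    return g_pod_malloc(bytes);
}

void pod_free(void* block)
{
    assert(g_pod_free != NULL);    // host must install the allocator first
    g_pod_free(block);
}
```

The host would call something like `pod_set_allocator(&std::malloc, &std::free)` for each DLL it loads, before any pod object is created.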

Also note you need specializations for pod<T*>, pod<const T*>, and pod for all the basic types (pod<int>, pod<short> etc), so that they can be stored in a "pod vector" and other pod containers. These should be straightforward enough to write if you understand the above example.
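A specialization for a basic type can be little more than a wrapper around the value, since basic types already have a stable layout. The following is a sketch of what such specializations might look like; the `DEFINE_BASIC_POD` macro is a hypothetical convenience and not part of the original answer.

```cpp
#include <cassert>

// Primary template as in the answer: every member is protected, so an
// unspecialized pod<T> cannot be used by accident.
template<typename T>
class pod {
protected:
    pod();
    pod(const T& value);
    pod(const pod& copy);
    pod& operator=(const pod& assign);
    T get() const;
    operator T() const;
    ~pod();
};

// Hypothetical helper macro: generates a full specialization that simply
// stores the value, for types whose layout is already plain old data.
#define DEFINE_BASIC_POD(TYPE)                                          \
    template<>                                                          \
    class pod<TYPE> {                                                   \
        TYPE value;                                                     \
    public:                                                             \
        pod() : value() {}                                              \
        pod(const TYPE& v) : value(v) {}                                \
        pod& operator=(const TYPE& v) { value = v; return *this; }      \
        TYPE get() const { return value; }                              \
        operator TYPE() const { return get(); }                         \
    };

DEFINE_BASIC_POD(int)
DEFINE_BASIC_POD(short)
DEFINE_BASIC_POD(double)
```

With these in place, a pod<std::vector<int>> can store its elements as pod<int> objects without any special casing.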

This method does mean copying the entire object. You can, however, pass references to pod types, since there is an operator= which is safe between binaries. There's no real pass-by-reference, though, since the only way to change a pod type is to copy it back out to its original type, change it, then repackage it as a pod. Also, the copies it creates mean it's not necessarily the fastest way, but it works.

However, you can also pod-specialize your own types, which means you can effectively return complex types like std::map<MyClass, std::vector<std::string>>, provided there's a specialization for pod<MyClass> and partial specializations of pod for std::map<K, V>, std::vector<T> and std::basic_string<T> (which you only need to write once).
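For the string case, a specialization could store the characters in a buffer obtained from the shared allocator. The sketch below is written for std::string only (not the general std::basic_string<T>) and uses plain malloc/free as stand-ins for the shared pod_malloc/pod_free; both simplifications are assumptions made to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>
#include <string>

template<typename T> class pod;  // primary template, specialized per type

// Stand-ins for the shared allocator (assumption: the real versions would
// forward to a heap shared by all binaries).
inline void* pod_malloc(std::size_t bytes) { return std::malloc(bytes); }
inline void  pod_free(void* block)         { std::free(block); }

// Stores a string as a length plus a raw character buffer.
template<>
class pod<std::string> {
    pod(const pod&);             // not copyable, as in the answer
    pod& operator=(const pod&);

    std::size_t size;
    char*       chars;

    void set_from(const std::string& value)
    {
        size  = value.size();
        chars = NULL;
        if (size > 0) {
            chars = static_cast<char*>(pod_malloc(size));
            if (chars == NULL) {
                size = 0;
                throw std::bad_alloc();
            }
            std::memcpy(chars, value.data(), size);
        }
    }

public:
    pod() : size(0), chars(NULL) {}
    pod(const std::string& value) { set_from(value); }
    ~pod() { pod_free(chars); }  // freeing NULL is a no-op

    pod& operator=(const std::string& value)
    {
        pod_free(chars);
        set_from(value);
        return *this;
    }

    // Rebuild a std::string in the calling module from the raw buffer.
    std::string get() const
    {
        return size > 0 ? std::string(chars, size) : std::string();
    }

    operator std::string() const { return get(); }
};
```

Because the buffer stores a length rather than relying on a terminator, strings with embedded '\0' characters survive the round trip.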

The end result usage looks like this. A common interface is defined:

class ICommonInterface {
public:
    virtual pod<std::vector<std::string>> GetListOfStrings() const = 0;
};

A DLL might implement it as such:

pod<std::vector<std::string>> MyDllImplementation::GetListOfStrings() const
{
    std::vector<std::string> ret;

    // ...

    // pod can construct itself from its template parameter
    // so this works without any mention of pod
    return ret;
}

And the caller, a separate binary, can call it as such:

ICommonInterface* pCommonInterface = ...

// pod has an operator T(), so this works again without any mention of pod
std::vector<std::string> list_of_strings = pCommonInterface->GetListOfStrings();

So once it's set up, you can use it almost as if the pod class wasn't there.

随风而去 2024-09-23 12:36:09

I'm not sure about "anything that could call new/delete" - this can be managed by careful use of shared pointer equivalents with appropriate allocators/deleter functions.
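A sketch of the shared-pointer idea: the dll exports a create function and a destroy function, and the caller wraps the raw pointer in a shared pointer whose deleter is the dll's destroy function. The example uses std::shared_ptr from C++11 rather than the boost::shared_ptr mentioned in the question, and `Widget`, `CreateWidget`, `DestroyWidget` and the live-object counter are all hypothetical.

```cpp
#include <cassert>
#include <memory>

struct Widget { int value; };

// Number of live widgets; present only so the sketch can be checked.
static int g_live_widgets = 0;

// Hypothetical dll exports: both run inside the dll, so allocation and
// deallocation always go through the same CRT.
extern "C" Widget* CreateWidget(int value)
{
    Widget* widget = new Widget();
    widget->value = value;
    ++g_live_widgets;
    return widget;
}

extern "C" void DestroyWidget(Widget* widget)
{
    delete widget;
    --g_live_widgets;
}

// Caller side: the shared pointer is created in the caller, and its
// deleter calls back into the dll instead of using the caller's delete.
std::shared_ptr<Widget> MakeSharedWidget(int value)
{
    return std::shared_ptr<Widget>(CreateWidget(value), &DestroyWidget);
}
```

Note that this only addresses who frees the object. The shared pointer itself is created on the caller's side here; passing a shared pointer object across the boundary still relies on both sides agreeing on its layout.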

However, in general I wouldn't pass templates across DLL boundaries: the implementation of the template class ends up on both sides of the interface, which means the two sides can be using different implementations. Bad things will happen unless you can always guarantee that your entire set of binaries is all built with the same toolchain.

When I need this sort of functionality I often use a virtual interface class across the boundary. You can then provide wrappers for std::string, list etc. that allow you to safely use them via the interface. You can then control allocation etc. using your implementation, or using a shared_ptr.
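A wrapper of this kind might look like the following sketch. The interface `IString`, its method names, and the `GetName` export are hypothetical; the point is that the caller only ever sees virtual functions and plain C types, and the object is destroyed by code inside the dll.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Interface shared by both sides: only virtual calls and plain C types.
class IString {
public:
    virtual const char* Data() const = 0;
    virtual std::size_t Size() const = 0;
    virtual void Release() = 0;   // destroys the object inside the dll
protected:
    virtual ~IString() {}         // callers must use Release(), not delete
};

// dll-side implementation: the std::string never leaves the dll.
class StringImpl : public IString {
    std::string text;
public:
    explicit StringImpl(const std::string& t) : text(t) {}
    virtual const char* Data() const { return text.data(); }
    virtual std::size_t Size() const { return text.size(); }
    virtual void Release() { delete this; }  // runs with the dll's CRT
};

// Hypothetical dll export returning the wrapped string.
extern "C" IString* GetName()
{
    return new StringImpl("example");
}

// Caller side: copy the characters into a local std::string, then let
// the dll free its own object.
std::string CopyAndRelease(IString* source)
{
    assert(source != NULL);
    std::string copy(source->Data(), source->Size());
    source->Release();
    return copy;
}
```

The caller would write `std::string name = CopyAndRelease(GetName());` and from then on work with a string that was allocated entirely by its own runtime.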

Having said all this, the one thing I do use in my DLL interfaces is shared_ptr, as it's too useful not to. I haven't yet had any problems, but everything is built with the same toolchain. I'm waiting for this to bite me, as no doubt it will. See this previous question: Using shared_ptr in dll-interfaces

涫野音 2024-09-23 12:36:09

For std::string, you can return a const char* obtained with c_str() (the string must outlive the caller's use of the pointer). For more complicated things, an option can be something like

class ContainerValueProcessor
{
public:
    virtual void operator()(const trivial_type& value) = 0;
};

Then (assuming you want to use std::list), you can use an interface

class List
{
public:
    virtual void processItems(ContainerValueProcessor&& proc) = 0;
};

Notice that List can now be implemented by any container.
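To make this concrete, here is a sketch of both sides of such an interface. It assumes `trivial_type` is `int`, and the classes `StdListAdapter` and `Collector` and the function `CopyItems` are hypothetical names added for illustration; the two interfaces are repeated so the example is self-contained.

```cpp
#include <cassert>
#include <list>
#include <vector>

typedef int trivial_type;  // assumption: any type that is safe to pass as-is

class ContainerValueProcessor {
public:
    virtual void operator()(const trivial_type& value) = 0;
};

class List {
public:
    virtual void processItems(ContainerValueProcessor&& proc) = 0;
};

// dll side: implements List on top of std::list, which stays private.
class StdListAdapter : public List {
    std::list<trivial_type> items;
public:
    explicit StdListAdapter(const std::list<trivial_type>& source)
        : items(source) {}

    virtual void processItems(ContainerValueProcessor&& proc)
    {
        std::list<trivial_type>::const_iterator it = items.begin();
        for ( ; it != items.end(); ++it)
            proc(*it);  // hand each value to the caller one at a time
    }
};

// Caller side: gathers the values into a container it allocated itself.
class Collector : public ContainerValueProcessor {
    std::vector<trivial_type>& out;
public:
    explicit Collector(std::vector<trivial_type>& target) : out(target) {}
    virtual void operator()(const trivial_type& value) { out.push_back(value); }
};

std::vector<trivial_type> CopyItems(List& list)
{
    std::vector<trivial_type> result;
    list.processItems(Collector(result));  // the temporary binds to &&
    return result;
}
```

No container object crosses the boundary: the dll iterates its own std::list, and the caller fills its own std::vector.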
