Can I assign a per-thread index using pthreads?

Posted 2024-10-28 19:49:00

I'm optimizing some instrumentation for my project (Linux, ICC, pthreads), and would like some feedback on this technique for assigning a unique index to each thread, so I can use it to index into an array of per-thread data.

The old technique uses a std::map based on pthread id, but I'd like to avoid locks and a map lookup if possible (it is creating a significant amount of overhead).

Here is my new technique:

static PerThreadInfo info[MAX_THREADS]; // shared, each index is per thread

// Allow each thread a unique sequential index, used for indexing into per
// thread data.
1:static size_t GetThreadIndex()
2:{
3:   static size_t threadCount = 0;
4:   __thread static size_t myThreadIndex = threadCount++;
5:   return myThreadIndex;
6:}

later in the code:

// add some info per thread, so it can be aggregated globally
info[ GetThreadIndex() ] = MyNewInfo();

So:

1) It looks like line 4 could be a race condition if two threads were created at exactly the same time. If so, how can I avoid this (preferably without locks)? I can't see how an atomic increment would help here.

2) Is there a better way to create a per-thread index? Maybe by pre-generating the TLS index at thread creation?

太傻旳人生 2024-11-04 19:49:00

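1) An atomic increment would in fact help here. The possible race is two threads reading the same value and assigning themselves the same ID, so making sure the increment (read the number, add 1, store the number) happens atomically fixes that race condition. On Intel a "lock; inc" would do the trick, or whatever your platform offers (for example InterlockedIncrement() on Windows).

2) Well, you could actually make the whole info thread-local ("__thread static PerThreadInfo info;"), provided your only aim is to access per-thread data easily and under a common name. If you really want it to be a globally accessible array, then saving the index in TLS as you do is a very straightforward and efficient way to do it. You could also pre-compute the indices and pass them as arguments at thread creation, as Kromey noted in his post.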
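
As a minimal sketch of point 1, the version below replaces the plain counter with an atomic fetch-and-add. It assumes a C++11 compiler and uses std::atomic and thread_local in place of the question's __thread; kMaxThreads and CountDistinctIndices are hypothetical names added for illustration. The key detail is that fetch_add returns the value from before the increment, so each thread gets its own number even if several threads arrive at once.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <set>
#include <thread>
#include <vector>

// Hypothetical upper bound, standing in for MAX_THREADS in the question.
constexpr std::size_t kMaxThreads = 64;

// Returns a unique, sequential index for the calling thread.
std::size_t GetThreadIndex()
{
    // Shared counter. fetch_add increments atomically and returns the
    // previous value, so no two threads can receive the same number.
    static std::atomic<std::size_t> threadCount{0};
    // Initialized once per thread, on that thread's first call.
    thread_local const std::size_t myThreadIndex =
        threadCount.fetch_add(1, std::memory_order_relaxed);
    assert(myThreadIndex < kMaxThreads);
    return myThreadIndex;
}

// Demo helper (hypothetical): starts n threads, records the index each one
// receives, and returns how many distinct indices were handed out.
std::size_t CountDistinctIndices(std::size_t n)
{
    std::vector<std::size_t> seen(n);
    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < n; ++i)
        threads.emplace_back([&seen, i] { seen[i] = GetThreadIndex(); });
    for (auto& t : threads)
        t.join();
    return std::set<std::size_t>(seen.begin(), seen.end()).size();
}
```

After the first call on a given thread, GetThreadIndex() is just a thread-local read, so info[GetThreadIndex()] needs no lock as long as the index stays below the array size. Note that indices are never reused in this sketch, so a program that keeps creating and destroying threads will eventually run past the bound.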

情释 2024-11-04 19:49:00

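Why so averse to using locks? Solving race conditions is exactly what they're designed for...

In any event, you can use the fourth argument of pthread_create() to pass an argument to your thread's start routine. That way the master thread can keep an incrementing counter as it launches threads and pass the current value into each thread as it is created, giving each thread a unique index.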
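
The idea above might look roughly like the following sketch. ThreadArg, Worker, and RunWorkers are hypothetical names, and the "work" each thread does is a placeholder; only pthread_create() and pthread_join() come from pthreads itself. Because the creating thread assigns the indices sequentially, no atomic operation or lock is involved.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-thread record; the index is assigned by the creating thread.
struct ThreadArg {
    std::size_t index;   // unique index handed to the thread
    std::size_t result;  // filled in by the thread, for demonstration only
};

// Start routine: receives its ThreadArg through pthread_create's 4th argument.
extern "C" void* Worker(void* p)
{
    assert(p != nullptr);
    ThreadArg* arg = static_cast<ThreadArg*>(p);
    arg->result = arg->index * 10;  // stand-in for real per-thread work
    return nullptr;
}

// Launches n threads with indices 0..n-1 and returns their arguments
// after all started threads have been joined.
std::vector<ThreadArg> RunWorkers(std::size_t n)
{
    std::vector<ThreadArg> args(n);  // sized up front so element addresses stay stable
    std::vector<pthread_t> ids(n);
    std::size_t started = 0;
    for (; started < n; ++started) {
        args[started].index = started;
        args[started].result = 0;
        if (pthread_create(&ids[started], nullptr, Worker, &args[started]) != 0)
            break;  // stop launching if a thread could not be created
    }
    for (std::size_t i = 0; i < started; ++i)
        pthread_join(ids[i], nullptr);
    return args;
}
```

The argument storage has to outlive the threads that read it, which is why the vector is allocated once before any thread starts rather than grown inside the loop.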

何处潇湘 2024-11-04 19:49:00

I know you tagged this [pthreads], but you also mentioned the "old technique" of using std::map. This leads me to believe that you're programming in C++. In C++11 you have std::thread, and you can hand out unique indexes (IDs) to your threads at thread creation time through an ordinary function parameter.

Below is an example HelloWorld that creates N threads, assigning each an index of 0 through N-1. Each thread does nothing but say "hi" and give its index:

#include <iostream>
#include <thread>
#include <mutex>
#include <vector>

inline void sub_print() {}

template <class A0, class ...Args>
void
sub_print(const A0& a0, const Args& ...args)
{
    std::cout << a0;
    sub_print(args...);
}

std::mutex&
cout_mut()
{
    static std::mutex m;
    return m;
}

template <class ...Args>
void
print(const Args& ...args)
{
    std::lock_guard<std::mutex> _(cout_mut());
    sub_print(args...);
}

void f(int id)
{
    print("This is thread ", id, "\n");
}

int main()
{
    const int N = 10;
    std::vector<std::thread> threads;
    for (int i = 0; i < N; ++i)
        threads.push_back(std::thread(f, i));
    for (auto i = threads.begin(), e = threads.end(); i != e; ++i)
        i->join();
}

My output:

This is thread 0
This is thread 1
This is thread 4
This is thread 3
This is thread 5
This is thread 7
This is thread 6
This is thread 2
This is thread 9
This is thread 8
