Processing large amounts of data in order with TBB

Posted 2024-11-09 11:15:48 · 406 words · 7 views · 0 comments

I'm working on a C++ app that processes large amounts of quote data (e.g. MSFT, AMZN, etc.) with TBB, and I was wondering how I should structure it. I've been looking at parallel_for, pipeline, and concurrent_queue.

The process would basically parse the data, process it, and output it to a file. Parsing and processing can be done in parallel, but the output should be in order for each symbol.

E.g., input:
    - Msg #1 - AMZN #1
    - Msg #2 - AMZN #2
    - Msg #3 - IBM #1
    - Msg #4 - AMZN #3
    - Msg #5 - CSCO #1
    - Msg #6 - IBM #2

I would like to use a lock-free solution, or at least minimal locking, but it seems I have to keep the messages in a concurrent_queue to preserve the order.

Any ideas would be helpful.

Thanks,
David


Comments (3)

じ违心 2024-11-16 11:15:48

If you use the pipeline pattern (the tbb::pipeline class or the tbb::parallel_pipeline() function), you can use ordered (serial_in_order) filters to ensure that the output appears in exactly the same order as the input was received. Preserving the overall input order also preserves the order within each symbol, and you will not need any locks in your own code for ordering.

没企图 2024-11-16 11:15:48

Does your quote data have either a timestamp or a sequence number?
If not, add a sequence number in the producer thread and sort the data by sequence number after parsing it. The re-sorting can then be done either in batches or just before writing the files.
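
The suggestion above could look roughly like the following. This is a minimal sketch using only the C++ standard library rather than TBB. The `Quote` struct and the `stamp`, `process`, and `process_batch` functions are hypothetical names introduced for illustration, and `process` is only a placeholder for the real parsing and processing work. Each worker collects results in its own private vector, so no locks are needed; the results are merged and sorted by sequence number once all workers have finished.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical record type; the field names are assumptions for illustration.
struct Quote {
    std::uint64_t seq;    // sequence number assigned by the producer
    std::string symbol;   // e.g. "AMZN"
    std::string payload;  // raw text on input, processed result on output
};

// Producer side: stamp each message with its arrival position.
void stamp(std::vector<Quote>& batch) {
    for (std::size_t i = 0; i < batch.size(); ++i) batch[i].seq = i;
}

// Placeholder for the real parse/process work (an assumption).
void process(Quote& q) { q.payload = q.symbol + " #" + q.payload; }

// Process a stamped batch on `workers` threads, then restore input order.
std::vector<Quote> process_batch(const std::vector<Quote>& batch, unsigned workers) {
    assert(workers > 0);
    // One private result list per worker, so no locking is required.
    std::vector<std::vector<Quote>> done(workers);
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&batch, &done, w, workers] {
            // Worker w handles items w, w + workers, w + 2 * workers, ...
            for (std::size_t i = w; i < batch.size(); i += workers) {
                Quote q = batch[i];
                process(q);
                done[w].push_back(std::move(q));
            }
        });
    }
    for (auto& t : pool) t.join();

    // Concatenating the per-worker lists scrambles the input order...
    std::vector<Quote> merged;
    merged.reserve(batch.size());
    for (auto& part : done) {
        merged.insert(merged.end(),
                      std::make_move_iterator(part.begin()),
                      std::make_move_iterator(part.end()));
    }
    // ...so sort by sequence number just before writing the output.
    std::sort(merged.begin(), merged.end(),
              [](const Quote& a, const Quote& b) { return a.seq < b.seq; });
    return merged;
}
```

Calling `stamp` on a batch and then `process_batch` with, say, three workers returns the quotes in their original arrival order, which also keeps each symbol's messages in order. The sort costs O(n log n) per batch, so the batch size is a trade-off between latency and sorting overhead.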

月下伊人醉 2024-11-16 11:15:48

You can create an output structure (a hash or a list) where the key is the position of the element in the output (1st, 2nd, ...) and the value is the data to be written. Then, when all the elements are ready, you can output the structure in the desired order.

This way you don't care about which thread finishes first.
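
One way to read this suggestion is sketched below, assuming the whole input batch is available up front. It uses only the C++ standard library, and the `Message` alias and the `process`, `process_into_slots`, and `group_by_symbol` functions are hypothetical names chosen for illustration. The output structure is a vector with one pre-allocated slot per input position, so each thread writes only to its own slots and no locks are needed.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical message type: (symbol, raw text).
using Message = std::pair<std::string, std::string>;

// Placeholder for the real parse/process work (an assumption).
std::string process(const Message& m) { return m.first + " #" + m.second; }

// Fill one pre-allocated slot per input position; slot i holds the result
// for input i regardless of which thread finishes first.
std::vector<std::string> process_into_slots(const std::vector<Message>& input,
                                            unsigned workers) {
    assert(workers > 0);
    std::vector<std::string> slots(input.size());
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&input, &slots, w, workers] {
            // Each index is written by exactly one thread, so no lock is needed.
            for (std::size_t i = w; i < input.size(); i += workers)
                slots[i] = process(input[i]);
        });
    }
    for (auto& t : pool) t.join();  // all elements are ready after this point
    return slots;
}

// Walk the slots in position order and collect the results per symbol,
// which keeps each symbol's output in input order.
std::map<std::string, std::vector<std::string>> group_by_symbol(
    const std::vector<Message>& input, const std::vector<std::string>& slots) {
    std::map<std::string, std::vector<std::string>> by_symbol;
    for (std::size_t i = 0; i < input.size(); ++i)
        by_symbol[input[i].first].push_back(slots[i]);
    return by_symbol;
}
```

A vector indexed by position is used here instead of a hash because the positions are dense and known in advance. For an unbounded stream, the same idea would need to be applied batch by batch.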
