C dataflow library
How can I do dataflow (pipes and filters, stream processing, flow based) in C? And not with UNIX pipes.
I recently came across stream.py.
Streams are iterables with a pipelining mechanism to enable data-flow programming and easy parallelization.
The idea is to take the output of a function that turns an iterable into another iterable and plug that as the input of another such function. While you can already do this using function composition, this package provides an elegant notation for it by overloading the >> operator.
I would like to duplicate a simple version of this kind of functionality in C. I particularly like the overloading of the >> operator to avoid the mess of nested function composition. Wikipedia points to this hint from a Usenet post in 1990.
Why C? Because I would like to be able to do this on microcontrollers and in C extensions for other high level languages (Max, Pd*, Python).
* (ironic given that Max and Pd were written, in C, specifically for this purpose – I'm looking for something barebones)
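For concreteness, here is a minimal sketch of the kind of interface I have in mind, in plain C. The `stream` struct and the stage names are just illustrative, not from any existing library, and since C has no operator overloading the `>>` chaining has to be spelled out by hand:

```c
#include <stdbool.h>
#include <stdio.h>

/* Illustrative iterator protocol: each stage pulls one value at a time
 * from the stage upstream of it. */
typedef struct stream {
    struct stream *upstream;                 /* NULL for a source stage        */
    bool (*next)(struct stream *, int *out); /* yields one value, false at end */
    int pos, limit;                          /* scratch fields for the source  */
} stream;

/* Source stage: counts 0 .. limit-1. */
static bool counter_next(stream *s, int *out) {
    if (s->pos >= s->limit) return false;
    *out = s->pos++;
    return true;
}

/* Filter stage: squares whatever the upstream stage produces. */
static bool square_next(stream *s, int *out) {
    int v;
    if (!s->upstream->next(s->upstream, &v)) return false;
    *out = v * v;
    return true;
}

int main(void) {
    stream src = { .next = counter_next, .limit = 5 };
    stream sq  = { .upstream = &src, .next = square_next }; /* "src >> square", wired by hand */

    int v;
    while (sq.next(&sq, &v))
        printf("%d\n", v);   /* prints 0 1 4 9 16 */
    return 0;
}
```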
3 Answers
I know that it's not a good answer, but you should make your own simple dataflow framework.
I've written a prototype DF server (together with a friend of mine) that still has several unimplemented features: it can only pass Integer and Trigger data in messages, and it does not support parallelism. I simply skipped that work: each component's producer port has a list of function pointers to consumer ports, set up at initialization, and it calls them (if the list is not empty). So, when an event fires, the components perform a tree-like walk of the dataflow graph. As they only work with Integers and Triggers, it's extremely quick.
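A minimal sketch of that wiring, with hypothetical `producer_port` and callback names (not the actual code of that prototype), could look like this:

```c
#include <stddef.h>

#define MAX_CONSUMERS 8

/* A consumer port is reached through a plain function pointer plus the
 * component it belongs to. */
typedef void (*consumer_fn)(void *component, int value);

typedef struct {
    consumer_fn consumers[MAX_CONSUMERS]; /* filled in at initialization              */
    void       *targets[MAX_CONSUMERS];   /* the component each consumer belongs to   */
    size_t      count;
} producer_port;

/* Wiring helper used during initialization. */
static void producer_connect(producer_port *p, consumer_fn fn, void *target) {
    if (p->count < MAX_CONSUMERS) {
        p->consumers[p->count] = fn;
        p->targets[p->count]   = target;
        p->count++;
    }
}

/* Called when a component fires: push the value to every wired consumer,
 * which continues the tree-like walk of the graph from this producer. */
static void producer_fire(producer_port *p, int value) {
    for (size_t i = 0; i < p->count; ++i)
        p->consumers[i](p->targets[i], value);
}
```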
I've also written a strange component that has one consumer and one producer port and simply passes the data through, but in another thread. Its consumer routine finishes quickly, as it just stores the data and sets a flag for the producer-side thread. Dirty, but it suits my needs: it detaches long processing from the tree walk.
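A rough sketch of that hand-off, using POSIX threads and a one-slot mailbox (the names and the condition-variable approach are my own assumptions, not the original code):

```c
#include <pthread.h>
#include <stdbool.h>

/* One-slot mailbox: the consumer routine drops a value here and returns
 * immediately; a worker thread picks it up and drives the producer side. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  ready;
    int             value;
    bool            has_value;
} mailbox;

static mailbox mb = { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, false };

/* Runs on the graph-walk thread: cheap, just store the data and signal. */
static void detach_consume(int value) {
    pthread_mutex_lock(&mb.lock);
    mb.value = value;
    mb.has_value = true;
    pthread_cond_signal(&mb.ready);
    pthread_mutex_unlock(&mb.lock);
}

/* Runs on the worker thread: wait for a value, then do the long work. */
static void *detach_worker(void *arg) {
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&mb.lock);
        while (!mb.has_value)
            pthread_cond_wait(&mb.ready, &mb.lock);
        int v = mb.value;
        mb.has_value = false;
        pthread_mutex_unlock(&mb.lock);
        (void)v; /* long-running downstream processing happens here, off the walk */
    }
    return NULL;
}
```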
So, as you may have recognized, it's a low-traffic asynchronous system for quick tasks, where the graph size does not matter.
Unfortunately, your problem differs from mine in as many points as one dataflow system can differ from another: you need a synchronous, parallel, stream-handling solution.
I think the biggest issue in a DF server is the dispatcher. Concurrency, collisions, threads, priorities... as I said, I just skipped the problem rather than solving it. You should skip it, too. And you should also skip other problems.
Dispatcher
In a synchronous DF architecture, all components must run once per cycle, except in special cases. They have a simple precondition: is the input data available? So you should just scan through the components and pass each one to a free caller thread if its data is available. After going through all of them, you will have N remaining components which haven't been processed. Process the list again. After the second pass you will have M remaining. If N == M, the cycle is over.
I think something along these lines will work as long as the number of components stays below roughly 100. A sketch of that dispatch loop follows.
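Here is a single-threaded sketch of that sweep-until-no-progress loop; the `ready`/`run` callbacks and the `component` layout are my own assumptions:

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct component {
    bool (*ready)(struct component *);  /* is the input data available? */
    void (*run)(struct component *);    /* process one message          */
    bool processed;                     /* already ran this cycle?      */
} component;

/* One cycle: keep sweeping the component list until a sweep makes no
 * progress (the remaining count stops shrinking, i.e. N == M). */
static void run_cycle(component **comps, size_t n) {
    for (size_t i = 0; i < n; ++i)
        comps[i]->processed = false;

    size_t remaining = n, prev;
    do {
        prev = remaining;
        remaining = 0;
        for (size_t i = 0; i < n; ++i) {
            component *c = comps[i];
            if (c->processed)
                continue;
            if (c->ready(c)) {
                c->run(c);      /* in a parallel version, hand off to a free worker thread */
                c->processed = true;
            } else {
                remaining++;
            }
        }
    } while (remaining > 0 && remaining < prev);
}
```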
Binding
Yep, the best way of binding is visual programming. Until the editor is finished, config-like code should be used instead, something like the sketch below:
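(The original listing is missing here; the following is a hypothetical reconstruction of such a config-style binding table, with made-up component names.)

```c
/* Config-style wiring: each entry binds one producer port to one consumer
 * port by name. "adc", "filter" and "dac" are illustrative components. */
typedef struct {
    const char *producer;   /* "component.port" of the data source */
    const char *consumer;   /* "component.port" of the data sink   */
} binding_cfg;

static const binding_cfg bindings[] = {
    { "adc.out",    "filter.in" },
    { "filter.out", "dac.in"    },
};
```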
It's easy to write and well readable; what more could you wish for?
Message
You should pass pure raw data packets between components' ports. You only need a list of bindings, each containing a pair of pointers to the producer and consumer ports, plus a processed flag that the "dispatcher" uses.
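In C that binding list can be as simple as the following (again with hypothetical type names):

```c
#include <stdbool.h>

typedef struct port port;   /* opaque producer/consumer port type */

/* One edge of the dataflow graph: a raw packet travels from *producer
 * to *consumer, and the dispatcher marks the edge once it has been served. */
typedef struct {
    port *producer;
    port *consumer;
    bool  processed;   /* set by the dispatcher within the current cycle */
} binding;
```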
Calling issue
The problem is that the producer should not call the consumer port but the component; all component (class) variables and firing logic live in the component. So the producer should either call the component's common entry point directly, passing the consumer port's ID to it, or it should call the port, which in turn calls a method of the component it belongs to.
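A sketch of the entry-point variant (hypothetical names again):

```c
/* Every component exposes one common entry point; the producer tells it
 * which of its consumer ports the packet is aimed at. */
typedef struct component component;

typedef void (*entry_fn)(component *self, int consumer_port_id, const void *packet);

struct component {
    entry_fn receive;   /* common entry point                */
    void    *state;     /* component-private variables, etc. */
};

/* Producer side: deliver a packet to a specific consumer port of a component. */
static void deliver(component *target, int consumer_port_id, const void *packet) {
    target->receive(target, consumer_port_id, packet);
}
```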
So, if you can live with some restrictions, I say go ahead and write your lite framework. It's a good task, and writing small components and seeing how smartly they can be wired together into a great app is the ultimate fun.
If you have further questions, feel free to ask; I often scan the "dataflow" keyword here.
Possibly you can figure out a simpler dataflow-ish model for your program.
This is cool: http://code.google.com/p/libconcurrency/
I'm not aware of any library for such a purpose. A friend of mine implemented something similar at university as a lab assignment. The main problems of such systems are low performance (really bad if the functions in long pipelines are smallish) and the potential need to implement scheduling (detecting deadlocks and boosting priorities to avoid overloading pipe buffers).
From my experience with similar data processing, error handling is quite burdensome. Since functions in the pipeline know little of the context (intentionally, for reusability), they can't produce sensible error messages. One can implement in-line error handling, passing errors down the pipe as data, but that requires special handling all over the place, especially at the output, since with streams it is not possible to correlate an error with the input it corresponds to.
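One way to pass errors down the pipe as data is a tagged item type, roughly like this (a hypothetical sketch, not from any particular library):

```c
/* Each item flowing through the pipe is either a value or an error that
 * downstream stages simply forward untouched. */
typedef enum { ITEM_VALUE, ITEM_ERROR } item_kind;

typedef struct {
    item_kind   kind;
    int         value;     /* valid when kind == ITEM_VALUE */
    const char *error;     /* valid when kind == ITEM_ERROR */
} item;

/* Example stage: doubles values, forwards errors as-is (the "special
 * handling" every stage ends up needing). */
static item double_stage(item in) {
    if (in.kind == ITEM_ERROR)
        return in;
    return (item){ ITEM_VALUE, in.value * 2, 0 };
}
```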
Considering the known performance problems of the approach, it is hard for me to imagine how it would fit microcontrollers. Performance-wise, nothing beats a plain function: one can create a function for every path through the data pipeline.
You could probably look for a Petri net implementation (simulator or code generator), as Petri nets are one of the theoretical bases for streams.