Functional data processing in Clojure

Posted 2025-02-02 03:12:36 · 608 words · 3 views · 0 comments

I'm looking for an architectural solution to a problem I'm facing.
I am struggling with a huge pipeline that processes data (which is a very complex structure). In general, it looks like this:

(defn process [data]
  (-> data
    do-something-1
    do-something-2
    do-something-3
    do-something-4
    ; ...
    do-something-20))

where each of the do-something-* functions might be similarly complex.
My problem is that there is a lot of coupling between the functions in such a processing chain. For example, do-something-3 adds something to data that is later required by do-something-9, which adds something else required by do-something-18, and so on. So essentially data is enriched by all those functions as it cascades down the threading macro. It's very hard to keep track of what is happening and when. Holding the whole processing chain in my head is just too much cognitive load (or at least I have too little RAM in my head).
How do I handle such cases? I get that there is no silver bullet, but maybe there is something I'm missing (I started learning Clojure a few months ago).

Comments (1)

眼趣 2025-02-09 03:12:36

I have never liked long pipelines like the one you describe, precisely because of the difficulties you are encountering.

As a first step, you might just break up the pipeline and give a name (even if temporary) to each stage:

(defn process [data]
  (let [x01 (do-something-1 data)
        x02 (do-something-2 x01)
        x03 (do-something-3 x02)
        x04 (do-something-4 x03)
        ; ...
        x20 (do-something-20 x19)]
    x20))

Then, you could add debug and/or validation statements between the stages as necessary. I would also suggest using Plumatic Schema to document the input/output of each do-something-* function (or maybe Malli; I don't like spec). Hopefully the function names are also descriptive; improving them would be another plus.
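
As a rough sketch of what validation between named stages could look like, here is a version that uses only core Clojure rather than Plumatic Schema or Malli. Everything in it is an assumption made for illustration: the helper require-keys, the stages add-total and add-average (stand-ins for two coupled do-something-* functions), the name process-checked, and the keys :items, :total and :average.

```clojure
(defn require-keys
  "Returns m unchanged, or throws ex-info naming the stage and the missing keys.
  Hypothetical helper, not part of any library."
  [stage-name required m]
  (let [missing (vec (remove #(contains? m %) required))]
    (when (seq missing)
      (throw (ex-info (str stage-name ": missing keys " missing)
                      {:stage stage-name :missing missing})))
    m))

;; Hypothetical stage: adds :total, which a later stage depends on.
(defn add-total [data]
  (assoc data :total (reduce + (:items data))))

;; Hypothetical stage: depends on :total added by add-total.
(defn add-average [data]
  (assoc data :average (/ (:total data) (count (:items data)))))

(defn process-checked [data]
  (let [x01 (require-keys "input" [:items] data)
        x02 (add-total x01)
        _   (require-keys "add-total" [:total] x02)   ; check between stages
        x03 (add-average x02)]
    (require-keys "add-average" [:total :average] x03)))
```

Calling (process-checked {:items [1 2 3]}) returns the enriched map, while (process-checked {}) fails at the "input" check with the missing key listed in the exception data. A schema library would replace require-keys with a richer description of each stage's input and output.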

You could also group the functions in a hierarchy instead of just a linear chain:

A
 - A1
 - A2
 - A3
 - A4
B
 - B1 
 - B2
 - B3
C 
 ...etc...

So the top-level pipeline is only 3-4 calls, each of which may have 2-5 sub-calls.
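
A minimal sketch of such a grouping, with two groups of two leaf stages each. All function names and keys here (normalize, enrich, trim-name, :email and so on) are hypothetical and only stand in for A, A1, A2, B, B1, B2 above.

```clojure
(require '[clojure.string :as str])

;; Group A: normalization (leaf stages A1, A2).
(defn trim-name [data] (update data :name str/trim))
(defn lower-case-email [data] (update data :email str/lower-case))

(defn normalize [data]
  (-> data trim-name lower-case-email))

;; Group B: enrichment (leaf stages B1, B2); relies on the output of group A.
(defn add-domain [data]
  (assoc data :domain (second (str/split (:email data) #"@"))))
(defn add-display-name [data]
  (assoc data :display-name (str (:name data) " <" (:email data) ">")))

(defn enrich [data]
  (-> data add-domain add-display-name))

;; Top level: only the groups are visible here.
(defn process-grouped [data]
  (-> data normalize enrich))
```

With this layout the coupling between leaf stages stays inside a group where possible, and the top-level function only shows which group feeds which.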

Of course, I hope you have nice unit tests for each function at all levels to document the expected behavior and typical inputs/outputs.
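
For example, a unit test for a single stage might look like the following sketch, which uses clojure.test from the standard distribution. The stage add-total and its keys are hypothetical, chosen only to show a test that documents the input and output of one step.

```clojure
(require '[clojure.test :refer [deftest is testing run-tests]])

;; Hypothetical stage under test: adds :total computed from :items.
(defn add-total [data]
  (assoc data :total (reduce + 0 (:items data))))

(deftest add-total-test
  (testing "adds :total and leaves the other keys alone"
    (is (= {:items [1 2 3] :total 6}
           (add-total {:items [1 2 3]}))))
  (testing "an empty :items vector sums to zero"
    (is (= 0 (:total (add-total {:items []}))))))
```

Evaluating (run-tests) in the same namespace runs the test and reports a summary of passes, failures, and errors.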
