Chaining data calculations: Make and alternatives?

Posted on 2024-07-17 06:54:38 | Characters: 1436 | Views: 6

Comments (4)

吃颗糖壮壮胆 2024-07-24 06:54:38

Assuming it is possible to discover that the database records are newer, then it should be possible to write a program that sets the date of a sentinel file to the date of the newest data record (or "now", if that is simpler) in the relevant source tables. Doing that for each database or query will give you a collection of sentinel files that can be used along with your existing CSV source files to feed the dependency tree and drive the whole calculation with standard make.

One easy answer to getting the sentinels updated on every build would be to use a build script that runs the data proxy generator followed by make in place of just the make command itself.
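Such a wrapper could look roughly like the following Python sketch. The command lines in STEPS are assumptions for illustration only: touch_sentinel stands in for whatever "data proxy generator" you end up writing, and the file and table names are simply borrowed from the example in this answer.

```python
import subprocess
import sys


def run_build(steps):
    """Run each command in order; stop at the first failure and return its exit code."""
    for cmd in steps:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            return result.returncode
    return 0


# Hypothetical commands: refresh the sentinel files first,
# then let make decide which outputs are stale.
STEPS = [
    ["touch_sentinel", "-o", "table_A.txt", "rawdata.sqlite", "A"],
    ["touch_sentinel", "-o", "table_B.txt", "otherdata.sqlite", "B"],
    ["make"],
]
```

Ending the script with sys.exit(run_build(STEPS)) would let it be run in place of plain make.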

It should be possible to arrange for make to automatically update the sentinels as part of the normal dependency checks. Something like the following (untested) should do the trick:

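# Note: each recipe line must begin with a tab character.
.PHONY: all clean

all: results.txt

clean:
        -rm table_*.txt
        -rm step*.csv
        -rm results.txt

# Final report, built from the last processing step.
results.txt: step2.csv
        write_report -o results.txt step2.csv

# Each step depends on its input file and on the sentinel of the table it reads.
step1.csv: source.csv table_A.txt
        do_step1 -o step1.csv source.csv

step2.csv: step1.csv table_B.txt
        do_step2 -o step2.csv step1.csv

# FORCE makes the sentinel recipes run on every build so that touch_sentinel
# can re-date each file; dependents should then rebuild only when a
# sentinel's date has actually changed.
table_A.txt: FORCE
        touch_sentinel -o table_A.txt rawdata.sqlite A

table_B.txt: FORCE
        touch_sentinel -o table_B.txt otherdata.sqlite B

FORCE: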

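where touch_sentinel is a program (yours to write) that creates the output file if necessary and sets its modification time to the time of the latest update to the named table in the named database. Determining how to learn that date is left as an exercise for the reader...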
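For what it is worth, here is a minimal Python sketch of what touch_sentinel might look like for SQLite. It rests on an assumption that is not part of the answer above: each table has a column (called updated_at here, a made-up name) holding the row's last-modified time as Unix epoch seconds. If your tables record changes some other way, the query would have to change accordingly.

```python
import os
import re
import sqlite3
import sys
import time


def latest_update(db_path, table, column="updated_at"):
    """Return the newest value of `column` in `table` as epoch seconds, or None."""
    # Identifiers cannot be bound as SQL parameters, so validate them instead.
    for name in (table, column):
        if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
            raise ValueError(f"unsafe identifier: {name!r}")
    con = sqlite3.connect(db_path)
    try:
        row = con.execute(f'SELECT MAX("{column}") FROM "{table}"').fetchone()
    finally:
        con.close()
    return None if row[0] is None else float(row[0])


def touch_sentinel(sentinel, db_path, table, column="updated_at"):
    """Create `sentinel` if needed and set its mtime to the table's newest update."""
    stamp = latest_update(db_path, table, column)
    if stamp is None:
        stamp = time.time()  # empty table: fall back to "now"
    with open(sentinel, "a"):
        pass  # create the file without truncating an existing one
    os.utime(sentinel, (stamp, stamp))
    return stamp


# Mirrors the Makefile usage: touch_sentinel -o table_A.txt rawdata.sqlite A
if __name__ == "__main__" and sys.argv[1:]:
    if len(sys.argv) != 5 or sys.argv[1] != "-o":
        sys.exit("usage: touch_sentinel -o SENTINEL DATABASE TABLE")
    touch_sentinel(sys.argv[2], sys.argv[3], sys.argv[4])
```

Because the sentinel's modification time tracks the data rather than the moment the script ran, rerunning it against an unchanged table leaves the timestamp where it was.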

巴黎夜雨 2024-07-24 06:54:38

Some alternatives that spring to mind:

  • Ant has pretty nice support for customizing dependencies using Java.
  • SCons allows you to write custom dependency code using Python.

把人绕傻吧 2024-07-24 06:54:38

Two other alternatives are:

  • Jam, which the Boost project uses, and
  • QMake, which is used by Qt.

毁梦 2024-07-24 06:54:38

Rake is a Ruby implementation of Dependency-Oriented Programming that is heavily inspired by Make and Ant, but much cleaner and nicer to use.

Recently, there has been a newcomer on the scene, which is called Tap. It also allows Dependency-Oriented Programming but extends it with concepts such as Workflows. It was designed by a PhD biochemistry student who works in a biomolecular research lab, specifically to do exactly the things you mention: keeping scientific data derived from experiments up to date.
