Best practice for detecting function changes in Scala programs?

Posted on 2024-12-05 22:54:15

I'm working on a Scala-based scripting language (internal DSL) that allows users to define multiple data transformation functions in a Scala script file. Since applying these functions can take several hours, I would like to cache the results in a database.
Users are allowed to change the definitions of the transformation functions and also to add new functions. However, when the user restarts the application with a slightly modified script, I would like to execute only those functions that have been changed or added. The question is how to detect those changes. For simplicity, let us assume that the user can only modify the script file, so that any reference to something not defined in this script can be assumed to be unchanged.
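
To make the goal concrete, here is a minimal sketch of the re-run decision, assuming each function already has some fingerprint string (how it is computed is the open question). The class `FingerprintDiff` and the method `functionsToRerun` are hypothetical names chosen for illustration.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: decides which functions must be re-run after a script change.
final class FingerprintDiff {

    // Returns the names whose fingerprint is new or differs from the cached one.
    static Set<String> functionsToRerun(Map<String, String> cached,
                                        Map<String, String> current) {
        Set<String> rerun = new TreeSet<>();
        for (Map.Entry<String, String> e : current.entrySet()) {
            String old = cached.get(e.getKey());
            if (old == null || !old.equals(e.getValue())) {
                rerun.add(e.getKey()); // added (no cached entry) or changed
            }
        }
        return rerun;
    }
}
```

Functions that were removed from the script are simply ignored here; their cached results could be evicted separately.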

In this case, what is the best practice for detecting changes to such user-defined functions?

So far I have thought about:

  • parsing the script file and calculating fingerprints based on the source code of the function definitions
  • getting the bytecode of each function at runtime and building fingerprints based on this data
  • applying the functions to some test data and calculating fingerprints on the results

However, all three approaches have their pitfalls.

  • Writing a parser for Scala to extract the function definitions could be quite a lot of work, especially if you want to detect changes that indirectly affect the behaviour of your functions (e.g. if your function calls another (changed) function defined in the script).
  • Bytecode analysis could be another option, but I have never worked with those libraries, so I have no idea whether they can solve my problem or how they deal with Java's dynamic binding.
  • The approach with example data is definitely the simplest one, but it has the drawback that different user-defined functions could accidentally be mapped to the same fingerprint if they return the same results for my test data.
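
The indirect-change problem from the first point can be sketched independently of how the per-function hash is obtained. The sketch below assumes two inputs that would have to come from the parser or the bytecode: `ownHash` (function name to a hash of its own body) and `calls` (function name to the script functions it calls). Both maps and the class `TransitiveFingerprint` are hypothetical. A function's fingerprint then covers its own hash plus the hashes of everything it can reach.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of any library.
final class TransitiveFingerprint {

    // Returns root plus every function reachable from it through calls, sorted by name.
    static Set<String> reachable(String root, Map<String, List<String>> calls) {
        Set<String> seen = new TreeSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.push(root);
        while (!todo.isEmpty()) {
            String f = todo.pop();
            if (seen.add(f)) { // the visited check also makes recursion and cycles terminate
                for (String callee : calls.getOrDefault(f, List.of())) {
                    todo.push(callee);
                }
            }
        }
        return seen;
    }

    // SHA-256 over the root name and the own-hashes of all reachable script functions.
    static String fingerprint(String root, Map<String, String> ownHash,
                              Map<String, List<String>> calls) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            md.update((root + "\n").getBytes(StandardCharsets.UTF_8));
            for (String f : reachable(root, calls)) {
                String h = ownHash.get(f);
                if (h != null) { // names defined outside the script are assumed unchanged
                    md.update((f + "=" + h + "\n").getBytes(StandardCharsets.UTF_8));
                }
            }
            return String.format("%064x", new BigInteger(1, md.digest()));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

With this scheme, changing a helper function changes the fingerprint of every function that calls it directly or indirectly, while unrelated functions keep theirs.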

Does anyone have experience with one of these "solutions", or can anyone suggest a better one?

Comments (1)

妄断弥空 2024-12-12 22:54:15

The second option doesn't look difficult. For example, with the Javassist library, obtaining the bytecode of a method is as simple as

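import javassist.ClassPool;
import javassist.CtClass;
import javassist.CtMethod;
import javassist.bytecode.CodeAttribute;

CtClass c = ClassPool.getDefault().get(className);
for (CtMethod m : c.getDeclaredMethods()) {
    CodeAttribute ca = m.getMethodInfo().getCodeAttribute();
    if (ca != null) { // null for methods without a body, i.e. native or abstract ones
        byte[] byteCode = ca.getCode();
        ...
    }
}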

So, as long as you assume that the results of your methods depend only on the code of those methods, it's pretty straightforward.
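
Turning such a byte array into a fingerprint needs only the standard library. The following is a minimal sketch; the class `BytecodeFingerprint` and the method `sha256Hex` are hypothetical names, and SHA-256 is just one reasonable choice of hash.

```java
import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Javassist.
final class BytecodeFingerprint {

    // Returns the SHA-256 digest of the given bytes as 64 lowercase hex characters.
    static String sha256Hex(byte[] byteCode) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(byteCode);
            return String.format("%064x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

One caveat: the raw code bytes refer to constants by index into the class's constant pool, so a change to a string or numeric literal may leave the code bytes untouched. Hashing the resolved constants as well, or the whole class file, would be a more conservative choice.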

UPDATE:
On the other hand, since your methods are written in Scala, they probably contain some closures, so that parts of their code reside in anonymous classes, and you may need to trace usage of these classes somehow.
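
A rough way to find candidate classes is by name, since the JVM convention is that nested and anonymous classes carry the enclosing class's name followed by `$`. The sketch below assumes you already have a list of the class names produced by compiling the script; `NestedClasses` and `withNested` are hypothetical names, and the exact names the Scala compiler generates for closures vary by compiler version.

```java
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: selects a class together with its nested and anonymous classes.
final class NestedClasses {

    // Returns owner plus all names that start with owner + "$", sorted by name.
    static List<String> withNested(String owner, Collection<String> allClassNames) {
        return allClassNames.stream()
                .filter(n -> n.equals(owner) || n.startsWith(owner + "$"))
                .sorted()
                .collect(Collectors.toList());
    }
}
```

This groups classes per enclosing class only. Attributing a closure class to the specific method that uses it would still require inspecting the bytecode for references to that class.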
