通用数据结构描述语言

发布于 2024-08-27 09:05:48 字数 370 浏览 10 评论 0原文

我想知道是否存在任何声明性语言用于任意描述数据结构的格式和语义,可以将其编译为任何一组目标语言中该结构的特定实现。也就是说,类似通用的数据定义语言,但旨在描述任意数据结构,例如向量,列表、树等,以及这些结构上的操作的语义。我问这个问题是因为我有一个可行的实现这个概念的想法,我只是想知道它是否值得,以及因此,以前是否已经做过。

另一个稍微更抽象的问题:数据结构的规范规范(它做什么)和它的实现(它如何做)之间是否有任何真正的区别?更具体地说,相同需求的单独实现是否应该被视为不同的结构

I am wondering whether there exists any declarative language for arbitrarily describing the format and semantics of a data structure, that can be compiled to a specific implementation of that structure in any of a set of target languages. That is, something like a generic data definition language but geared toward describing arbitrary data structures such as vectors, lists, trees, etc., and the semantics of operations on those structures. I ask because I had an idea for a feasible implementation of this concept, and I'm just wondering whether it's worth it, and, consequently, whether it's been done before.

Another, slightly more abstract question: is there any real difference between the normative specification of a data structure (what it does) and its implementation (how it does it)? More specifically, should separate implementations of the same requirements be considered different structures?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

初见终念 2024-09-03 09:05:48

您可能对消息传递规范/数据序列化语言感兴趣,例如 Google 的 Protocol Buffers 以及 ASN.1。这与您正在寻找的倾斜略有不同,但本质相同。

两者都是声明通信通用消息的方法。协议缓冲区消息规范“编译”为不同的语言,但中心协议是一致的。 ASN.1 具有多个不同的编译实用程序以及不同的协议表示,与不同的文字实现保持逻辑一致。例如,看看 XER、PER 与 BER。

我喜欢一种规范语言,它只专注于针对逻辑内存结构的简单打包二进制布局。简单的 C 结构可能是表达这一点的最简单的常用方式。我曾希望 ASN.1 能够通过某种方式实现这一目标,但经过一番研究后,ASN.1 PER 很接近,但又不完全是这样。

编辑: Apache ThriftCapn' Proto 也可能很有趣。

You may be interested in messaging specification / data serialization languages such as Google's Protocol Buffers as well as ASN.1. It's a slightly different slant than you're looking for, but in the same vein.

Both are ways of declaring generic messages for communications. Protocol buffers message specs "compile" to different languages, but the central protocol is consistent. ASN.1 has multiple different compilation utilities as well as different protocol representations staying logically consistent with varying literal implementations. Look at XER, PER vs. BER for example.

I'd love a specification language that would just focus on simple packed binary layout against a logical memory structure. It may be that plain C structs are the simplest common way of expressing this. I had hoped ASN.1 had some way of getting to that, but after looking at it for a bit, ASN.1 PER is close, but not quite it.

Edit: Apache Thrift and Capn' Proto may also be interesting.

笑红尘 2024-09-03 09:05:48

如果您愿意,XML 与 XSLT 的组合可以描述数据结构,并以您选择的任何语言提供匹配的定义。我从未试图正式证明这一点,但我的第一个猜测是 S 表达式是 XML 的超集(模语法差异)。

至少在理论上,对数据结构的作用及其实现方式的描述之间确实存在(或至少可能存在)差异。举一个明显的例子,您可以描述从键到值的通用映射,它可以使用基于哈希表、跳跃列表、二叉搜索树等的实现。这主要是在足够高的抽象级别上描述它的问题允许实施中的差异。如果您包含太多要求(复杂性、顺序等),您可以很快排除许多实现。

If you felt like it, a combination of XML with XSLT could describe a data structure, and provide a matching definition in essentially any language if your choice. I've never tried to prove it formally, but my first guess would be that S-expressions are a superset of XML (modulo syntactical differences).

At least in theory, yes there are (or at least can be) differences between a description of what a data structure does, and how it does it. For an obvious example, you could describe a generic mapping from keys to values, which could use an implementation based on hash tables, skip lists, binary search trees, etc. It's mostly a question of describing it at a high enough level of abstraction to allow differences in the implementation. If you include too many requirements (complexity, ordering, etc.) you can pretty quickly rule out many implementations.

西瓜 2024-09-03 09:05:48

Cozy 是“一种根据非常高级的规范综合数据结构实现的工具”,并且似乎本质上是当我问这个问题时,我实际上正在寻找(或考虑写作)的语言。

它可以根据用其专有语言编写的数据结构规范自动生成实现(在撰写本文时使用 Java 或 C++)。规范描述了数据结构的抽象状态更新操作查询操作,以及必须维护的不变量和假设求解器可以使用它来优化实现。例如,以下是图数据结构的部分规范:

Graph:
    handletype Node = { id : Int }
    handletype Edge = { src : Int, dst : Int }

    state nodes : Bag<Node>
    state edges : Bag<Edge>

    // Invariant: disallow self-edges.
    invariant (sum [ 1 | e <- edges, e.val.src == e.val.dst ]) == 0;

    op addNode(n : Node)
        nodes.add(n);

    op addEdge(e : Edge)
        assume e.val.src != e.val.dst;
        edges.add(e);

    query out_degree(nodeId : Int)
        sum [ 1 | e <- edges, e.val.src == nodeId ]

    // …

其实现在 快速集合的快速合成中描述广义数据结构综合作者:卡尔文·隆卡里克、埃米娜·托拉克和迈克尔·D·恩斯特。

Cozy is “a tool that synthesizes data structure implementations from very high-level specifications” and seems to be essentially the language I was actually looking for (or considering writing) when I asked this question.

It can automatically generate an implementation (in Java or C++, at the time of this writing) from a data structure specification written in its proprietary language. A specification describes the abstract state, update operations, and query operations of a data structure, as well as invariants that must be maintained and assumptions that can be used by the solver to optimise the implementation. For example, here is a partial specification for a graph data structure:

Graph:
    handletype Node = { id : Int }
    handletype Edge = { src : Int, dst : Int }

    state nodes : Bag<Node>
    state edges : Bag<Edge>

    // Invariant: disallow self-edges.
    invariant (sum [ 1 | e <- edges, e.val.src == e.val.dst ]) == 0;

    op addNode(n : Node)
        nodes.add(n);

    op addEdge(e : Edge)
        assume e.val.src != e.val.dst;
        edges.add(e);

    query out_degree(nodeId : Int)
        sum [ 1 | e <- edges, e.val.src == nodeId ]

    // …

Its implementation is described in Fast Synthesis of Fast Collections and Generalized Data Structure Synthesis by Calvin Loncaric, Emina Torlak, and Michael D. Ernst.

好菇凉咱不稀罕他 2024-09-03 09:05:48

动态逻辑中有一些解决此类问题的方法,它们试图捕获程序的语义。然而,动态逻辑的含义是先决条件和后置条件,并且与列表的实际实现无关。

这些数据结构本质上与实现相关,因为链表和数组之间的唯一区别在于它在内存中的布局方式。

为此,有一种通用数据定义语言——任何高级编程语言——C、C++、java——来指定这一点。它们中的任何一个都与另一个一样通用,因为在这种情况下,它们中的任何一个都可以编译为另一个。

There are approaches to that sort of thing in dynamic logics, which attempt to capture the semantics of programs. However, the meaning in terms of the dynamic logic is in terms of preconditions and postconditions and is agnostic with regard to the actual implementation of the list.

These data structures are inherently tied to implementation, as the only difference between a linked list and array is specifically how it is laid out in memory.

For this, there is a generic data definition language --- any high level programming language -- C, C++, java -- that specifies this. Any of them is as generic as the other, since within this context any of them could be compiled to the other.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文