Lisp 用于列表处理。有树处理语言吗?

发布于 2024-09-19 18:25:33 字数 685 浏览 9 评论 0原文

Lisp 的名称源自LISt Processing。链表是 Lisp 语言的主要数据结构,Lisp 源代码本身就是由列表组成的。因此,Lisp 程序可以将源代码作为数据结构进行操作(这称为同像性)。

然而,根据定义,列表是一个顺序结构。这鼓励我们使用顺序语言习语(一次处理一件事并累积结果的算法)来解决问题。例如,在 Lisp 中,cons 单元用于实现单链表,car 操作返回列表的第一个元素,而 cdr em> 返回列表的其余部分。我的愿景是一种用于并行执行的函数式语言,它将问题分解为大致相等的子问题,递归地解决它们,并组合子解决方案。

几乎每种编程语言的代码都已经被解析成树;有没有像 Lisp 这样的同形语言,但以树为主要数据结构?顺便说一句,我将其称为 Treep,用于 TREE P 处理。

更新:Guy Steele 于 2009 年发表的关于并行算法和并行计算的有趣演示 (PDF)数据结构,组织并行执行的功能代码:或者,foldlfoldr 被认为有轻微危害

The name for Lisp derives from LISt Processing. Linked lists are the major data structure of Lisp languages, and Lisp source code is itself made up of lists. As a result, Lisp programs can manipulate source code as a data structure (this is known as homoiconicity).

However, a list is by definition a sequential construct. This encourages us to solve problems using sequential language idioms (algorithms that process one thing at a time and accumulate the results). For example, in a Lisp where cons cells are used to implement singly-linked lists, the car operation returns the first element of the list, while cdr returns the rest of the list. My vision is of a functional language for parallel execution, that splits problems into roughly equal sub-problems, recursively solves them, and combines the sub-solutions.

Pretty much every programming language's code is already parsed into trees; is there a homoiconic language like Lisp, but with trees as the major data structure? btw, I'd call it Treep, for TREE Processing.

Update: An interesting presentation (PDF) from 2009 by Guy Steele on parallel algorithms & data structures, Organizing Functional Code for Parallel Execution: or, foldl and foldr Considered Slightly Harmful.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(8

狼性发作 2024-09-26 18:25:33

Lisp 列表是树,而 Lisp 代码也是一棵树,就像任何其他代码一样。

(+ (* 1 3) (* 4 6))

是一棵树:

     +
    / \
   /   \
   *   *
  / \ / \
  1 3 4 6

而且它不仅仅是二叉树。

(+ 1 2 3)

   +
  /|\
 / | \
1  2  3

所以,也许 Lisp 就是你的答案,也是你的问题。

Lisp Lists ARE trees, and Lisp code is a tree, just like any other code.

(+ (* 1 3) (* 4 6))

is a tree:

     +
    / \
   /   \
   *   *
  / \ / \
  1 3 4 6

And it's not just binary trees.

(+ 1 2 3)

   +
  /|\
 / | \
1  2  3

So, perhaps Lisp is your answer as well as your question.

请持续率性 2024-09-26 18:25:33

我不认为这种变化会非常深远。 Lisp 在让列表成为其他列表的成员方面当然没有任何问题,因此它可以轻松地表示树以及树上的算法。

相反,每个列表都可以被视为特定形状的树(以各种方式)。

I don't see that the change would be very profound. Lisp certainly doesn't have any problem with letting lists be members of other lists, so it can easily represent trees, and algorithms on trees.

Conversely, every list can be regarded as a tree of a particular shape (in various ways).

め可乐爱微笑 2024-09-26 18:25:33

我想说 Lisp 语言的主要数据结构是 cons cell。您可以使用 cons 单元轻松构建的东西之一是链接列表,但这绝不是唯一的数据结构。

cons 单元格是一对数据项,但没有任何内容表明值必须位于左侧单元格中,而指针必须位于右侧单元格中(如在链接列表中)。如果允许两个单元格本身包含值或指针,则可以轻松构建二元(或需要更多工作的 n 元)树结构。基于这些结构,人们可以构建字典或 B 树或您可能想到的任何其他数据结构。

I would say that the major data structure of Lisp languages is the cons cell. One of the things you can easily build with cons cells is a linked list, but that's by no means the only data structure.

A cons cell is a pair of data items, but there's nothing that says a value has to be in the left cell and a pointer in the right cell (as in a linked list). If you allow both cells to contain either values or pointers themselves, it's easy to build binary (or with a bit more work, n-ary) tree structures. Building upon these structures, one can build dictionaries or B-trees or any other data structure you might think of.

孤独难免 2024-09-26 18:25:33

OP 似乎对进程树具有一定并行性支持的语言感兴趣。

我们的DMS 软件重组工具包是一个通用程序分析和转换引擎。它将编程语言源文本解析为树(OP 的第一个兴趣),能够分析和处理这些树,并从这些树重新生成源文本。
DMS 由描述正在处理的语言的显式语法驱动,并且它有许多可用的生产质量语言定义。

DMS 的树处理机制在两个不同级别上提供对并行性的支持,其中之一由 DMS 的底层并行编程语言直接支持,PARLANSE,它受到 LISP 的启发,但没有那么动态。

PARLANSE 提供了称为“谷物”的并行活动“团队”(就像海滩上的沙子一样,其想法是有很多谷物)。团队可以通过(经典)分叉新颗粒并将其添加到(动态)团队来动态构建;这很有用,但速度不是特别快。团队可以静态地构建,包括“将这个固定大小的grain集分叉为纯并行”:

(|| a b c)

以及“创建一组其执行顺序由指定的部分顺序控制的grain”(!| ...)。您直接在 fork 调用中编写偏序:

(!| step1 a
    step2 b
    step3 (>> step1 step2) c
    step4 (>> step2) d )

它编码了操作 c 发生在操作 a< 之后(稍后 >> 在时间上))操作 a< 的事实/strong> 和 b 完成。静态形成的团队由 PARLANSE 编译器预编译成非常高效的结构,因为颗粒可以非常快速地启动和终止,从而允许非常小的颗粒尺寸(几百条指令)。标准并行库的开销比这高得多。

处理树的基本方法非常传统:有一个 PARLANSE 工具库,用于检查树节点、在树上上下移动、创建新节点并将它们拼接到位。递归过程通常用于访问/更新树。

这里的关键观察是,访问某些子集的树访问可以按顺序编码,或者很容易编码为静态并行团队。因此,手动编码并行树访问非常容易,但代价是为每种树节点类型编写大量特殊情况进行并行分叉。 OP似乎对“分而治之”感兴趣。 (当然,您可以枚举节点的子节点,并使用动态团队为每个子节点分叉谷物,但这并不那么快)。这是 DMS 中用于处理树的第一级并行性。

第二级并行性通过 DMS 提供的 DSL 来实现,该 DSL 实现了属性语法 (AG)。

AG 是用一组计算来修饰 BNF 的函数式语言。可以在 DMS 中编写一个带有 AG 的简单计算器:

   sum = sum + product;
        <<Calculator>>:  sum[0].result = sum[1].result + product.result;

这会导致通过组合第一个子项 (sum[1]) 和第二个子项的结果属性来计算根 (sum[0]) 的属性“结果”孩子(产品.结果)。
所谓的综合属性将信息从叶子向上传播到树上。继承的属性从父母那里传播信息。一般的属性语法和 DMS 的 AG 允许混合这些,因此信息可以以任意顺序在树上上下流动。

大多数AG是纯功能性的,例如没有副作用; DMS 允许产生副作用,这会使符号变得复杂,但在实践中非常有用。一个很好的例子是通过属性评估构建符号表;当前作用域沿树向下传递,本地声明块创建新的当前作用域并将其向下传递,各个声明将符号表数据存储到从父级接收的符号表条目中。

DMS 的 AG 编译器分析属性计算,确定信息如何在整个树中流动,然后生成并行 PARLANSE 程序来实现每个树节点类型的信息流。它可以对每个 AG 规则进行数据流分析,以确定信息流以及首先发生、稍后发生、最后发生的情况。对于上面的简单求和规则,应该清楚的是,必须先计算子级的属性,然后才能计算根的属性。事实证明,PARLANSE 的偏序构造是编码此信息的完美方式,甚至可以很好地处理我们添加到 AG 的副作用。

结果是 DMS 将 AG 规范编译为高度并行的 PARLANSE 程序。我们的 C++ 前端名称/类型解析器被实现为大约 150K 行的 DMS AG(是的,需要这么多行来描述如何使用 C++ 类型信息),它编译
大约 700K SLOC 的并行 PARLANSE 代码。它可以在没有任何思考或调试的情况下工作(并且在 x86 多核机器中并行运行),并且似乎可以很好地扩展。

OP appears to be interested in langauges that process trees and have some parallelism support.

Our DMS Software Reengineering Toolkit is a general-purpose program analysis and transformation engine. It parses programmming language source text into trees (OP's first interest), enables analysis and processing of those trees, and regeneration of source text from those trees.
DMS is driven by explicit grammars that describe the language being processed, and it has a lot of production-quality langauge definitions available for it.

DMS's tree processing machinery provide support for parallelism at two different levels, one supported directly by DMS's underlying parallel programming langauge, PARLANSE, which is inspired by LISP but isn't nearly so dynamic.

PARLANSE provides for "teams" of parallel activities called "grains" (as in sand on a beach, the idea is that there are lots of grains). Teams may be structured dynamically, by (classically) forking new grains and adding them to a (dynamic) team; this is useful but not spectacularly fast. Teams may be structured statically, including "fork this fixed size set of grains as pure parallel":

(|| a b c)

and "create a set of grains whose execution order is controlled by a specified partial order" (!| ... ). You write the partial order directly in the fork call:

(!| step1 a
    step2 b
    step3 (>> step1 step2) c
    step4 (>> step2) d )

which encodes the fact that action c occurs after (later >> in time )) actions a and b complete. The statically formed teams are precompiled by the PARLANSE compiler into pretty efficient structures in that the grains can be launched and killed very quickly, allowing pretty small grain size (a few hundred instructions). Standard parallelism libraries have lots higher overhead than this.

The basic method for processing trees is pretty conventional: there's a PARLANSE library of facilities for inspecting tree nodes, walking up and down the tree, creating new nodes and splicing them in place. Recursive procedures are often used to visit/update a tree.

The key observation here is that a tree visit which visits some set of children can be coded sequentially, or easily coded as a static parallel team. So it is pretty easy to manually code parallel tree visits, at the cost of writing lots of special cases to parallel-fork for each tree node type. There's the "divide and conquer" that seems of interest to the OP. (You can of course enumerate a nodes' childern, and using a dynamic team to fork grains for each child, but this isn't as fast). This is the first level of parallelism used to process trees in DMS.

The second level of parallelism comes through a DMS-supplied DSL that implements an attribute grammar (AG)s.

AGs are functional languages that decorate the BNF with a set of computations. One can write a simple calculator with an AG in DMS:

   sum = sum + product;
        <<Calculator>>:  sum[0].result = sum[1].result + product.result;

This causes an attribute, "result" to be computed for the root (sum[0]) by combining the result attribute from the first child (sum[1]) and the second child (product.result).
So-called synthesized attributes propagates information up the tree from the leaves; inherited attributes propagate information down from parents. Attribute grammars in general and DMS's AG allow mixing these, so information can flow up and down the tree in arbitrary order.

Most AGs are purely functional, e.g., no side effects; DMS's allows side effects which complicates the notation but is pretty useful in practice. A nice example is construction of symbol tables by attribute evaluation; current-scope is passed down the tree, local-declaration blocks create new current scopes and pass them down, and individual declarations store symbol table data into the symbol table entry received from a parent.

DMS's AG compiler analyzes the attribute computations, determines how information flows across the entire tree, and then generates a parallel PARLANSE program to implement the information flow per tree node type. It can do a data flow analysis on each AG rule to determine the information flow and what has to happen first vs later vs. last. For our simple sum rule above, it should be clear that the attributes of the children have to computed before the attribute for root can be computed. It turns out that PARLANSE's partial order construction are a perfect way to encode this information, and even nicely handles the side-effects we added to AGs.

The result is that DMS compiles an AG specification into a highly parallel PARLANSE program. Our C++ front end name/type resolver is implemented as about 150K lines of DMS AG (yes, it takes that much to describe how C++ type information is used), which compiles
to some 700K SLOC of parallel PARLANSE code. And it works (and runs in parallel in x86 multicore machines) without any thought or debugging, and it seems to scale nicely.

_蜘蛛 2024-09-26 18:25:33

同形树处理语言的典型示例是 XSLT。但您可能不想在其中写入(或阅读)任何实质性内容。

The canonical example of a homoiconic tree processing language is XSLT. But you probably don't want to write (nor read) anything substantial in it.

神仙妹妹 2024-09-26 18:25:33

我认为闭包有树而不是列表,并按照您为并发目的编写的方式使用它们。

I think closure has trees instead of lists and uses them as you wrote for concurrency purposes.

情绪少女 2024-09-26 18:25:33

我也在寻找处理同像语言的树(或更好的图),仍然没有任何要点。让我们列出一些语言所需的元素,也许我们会发现一些变体:

  • 同像性:基于树(或更好的图)解释
  • 属性语法支持语言核心
  • 结构模式匹配与统一
  • 全套典型图算法类
  • SmallTalk 交互环境具有丰富的可视化支持
  • 语言核心中的词法分析器/解析器引擎,用于任何文本数据加载
  • 嵌入式(基于 LLVM)编译器框架,用于低级代码生成和密集计算任务的优化

I'm also looking for tree (or better graph) procesing homoiconic language, still have no any points. Let's make some list of required elements of language, and maybe we find some variant:

  • homoiconicity: based on tree (or better graph) interpretation
  • attribute grammar support in language core
  • structural pattern matching with unification
  • full set of typical graph algorithms
  • SmallTalk-like interactive environment with rich visualization support
  • lexer/parser engine in language core for any textual data load
  • embedded (LLVM based) compiler framework for lowlevel code generation and optomisation for intensive computation tasks
当爱已成负担 2024-09-26 18:25:33

“Lisp”只是一个名字,而且是一个古老的名字。它并没有描述 Lisp 中所做的一切。就像 Fortran 并不完全是一个公式翻译器一样,Cobol 中的一切也并非都是“面向业务的”。

“Lisp”中提到的处理实际上是树处理,因为它是嵌套列表的处理。这是嵌套列表处理。只是名称中没有包含“嵌套”。

事实上,ANSI Common Lisp 标准 规范地定义了“tree”和“术语表中的“树结构”

Lisp 列表是树结构中的约定:右倾递归(通过 cons 单元的 cdr 槽)表示一个列表。有一个与之相配的符号:即树符号结构 ((((a . b) . c) . d) . nil) 可以缩写为 (abcd) 根据 cons 单元写成 (anything . nil) 也可以打印为 (anything) 的记法规则,以及 (foo . (bar)) 可以写成 (foo bar)

强调“列表处理”一词可能不是因为排除了树,而是因为树中的大部分生长都是沿着被理解为水平的方向:也就是说,嵌套列表的处理。

大多数在典型 Lisp 程序中处理的基于列表的数据结构,即使是嵌套的,在 cdr 维度中的深度也比在 car 维度中的深得多。

"Lisp" is only a name and an ancient one. It doesn't describe everything that is done in Lisp. Just like Fortran wasn't exclusively a formula translator, and not everything in Cobol is "business oriented".

The processing referred to in "Lisp" is actually tree processing, because it is the processing of nested lists. It is nested list processing. Just the "nested" wasn't included in the name.

The ANSI Common Lisp standard in fact normatively defines "tree" and "tree structure" in the glossary.

Lisp lists are convention in the tree structure: that the right-leaning recursion (through the cdr slot of a cons cell) represents a list. there is a notation to go with it: namely the tree notation structure ((((a . b) . c) . d) . nil) can be shortened to (a b c d) according to the notational rule that the cons cell written as (anything . nil) can also be printed as (anything), and the rule that (foo . (bar)) can be written (foo bar).

The term "list processing" was probably emphasized not because trees are ruled out but because the majority of the growth in the trees is in the direction understood as horizontal: that is to say, the processing of nested lists.

Most of the list based data structures processed in typical Lisp programs, even when nested, are much deeper in the cdr dimension than in the car dimension.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文