以功能纯方式生成不可变的具体语法树的适当数据结构或算法是什么?

发布于 2024-09-13 00:58:26 字数 1512 浏览 7 评论 0原文

给定一个LL(1)语法,以功能纯方式生成不可变的具体语法树的适当数据结构或算法是什么?请随意用您喜欢的任何语言编写示例代码。

我的想法

symbol : either a token or a node

result : success or failure

token : a lexical token from source text
    value -> string : the value of the token
    type -> integer : the named type code of the token
    next -> token : reads the next token and keeps position of the previous token
    back -> token : moves back to the previous position and re-reads the token

node : a node in the syntax tree 
    type -> integer : the named type code of the node
    symbols -> linkedList : the symbols found at this node
    append -> symbol -> node : appends the new symbol  to a new copy of the node

这是我一直在思考的一个想法。这里的主要问题是处理语法错误。 我的意思是我可以在第一个错误处停止,但这似乎不对。

let program token =
    sourceElements (node nodeType.program) token        

let sourceElements node token =
    let (n, r) = sourceElement (node.append (node nodeType.sourceElements)) token
    match r with
    | success -> (n, r) 
    | failure -> // ???     

let sourceElement node token =
    match token.value with
    | "function" -> 
        functionDeclaration (node.append (node nodeType.sourceElement)) token
    | _ -> 
        statement (node.append (node nodeType.sourceElement)) token 

请注意

我将为最佳答案提供丰厚的赏金,所以不要感到匆忙。与显示代码或包含详细解释的答案相比,仅发布链接的答案的权重较小。

最后说明

我对这类东西真的很陌生,所以不要害怕称我为傻瓜。

Given a LL(1) grammar what is an appropriate data structure or algorithm for producing an immutable concrete syntax tree in a functionally pure manner? Please feel free to write example code in whatever language you prefer.

My Idea

symbol : either a token or a node

result : success or failure

token : a lexical token from source text
    value -> string : the value of the token
    type -> integer : the named type code of the token
    next -> token : reads the next token and keeps position of the previous token
    back -> token : moves back to the previous position and re-reads the token

node : a node in the syntax tree 
    type -> integer : the named type code of the node
    symbols -> linkedList : the symbols found at this node
    append -> symbol -> node : appends the new symbol  to a new copy of the node

Here is an idea I have been thinking about. The main issue here is handling syntax errors.
I mean I could stop at the first error but that doesn't seem right.

let program token =
    sourceElements (node nodeType.program) token        

let sourceElements node token =
    let (n, r) = sourceElement (node.append (node nodeType.sourceElements)) token
    match r with
    | success -> (n, r) 
    | failure -> // ???     

let sourceElement node token =
    match token.value with
    | "function" -> 
        functionDeclaration (node.append (node nodeType.sourceElement)) token
    | _ -> 
        statement (node.append (node nodeType.sourceElement)) token 

Please Note

I will be offering up a nice bounty to the best answer so don't feel rushed. Answers that simply post a link will have less weight over answers that show code or contain detailed explanations.

Final Note

I am really new to this kind of stuff so don't be afraid to call me a dimwit.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

痴意少年 2024-09-20 00:58:26

您想要将某些内容解析为抽象语法树。

在纯函数式编程语言 Haskell 中,您可以使用解析器组合器来表达您的语法。这里有一个解析小型表达式语言的示例:

编辑使用单子样式来匹配 Graham Hutton 的书

    -- import a library of *parser combinators*
import Parsimony
import Parsimony.Char
import Parsimony.Error
(+++) = (<|>)

    -- abstract syntax tree
data Expr = I Int
          | Add Expr Expr
          | Mul Expr Expr
          deriving (Eq,Show)

    -- parse an expression
parseExpr :: String -> Either ParseError Expr
parseExpr = Parsimony.parse pExpr
    where

        -- grammar
    pExpr :: Parser String Expr
    pExpr = do
        a <- pNumber +++ parentheses pExpr  -- first argument
        do
            f <- pOp                        -- operation symbol
            b <- pExpr                      -- second argument
            return (f a b)
         +++ return a                       -- or just the first argument

    parentheses parser = do                 -- parse inside parentheses
        string "("
        x <- parser
        string ")"
        return x

    pNumber = do                            -- parse an integer
        digits <- many1 digit
        return . I . read $ digits

    pOp =                                   -- parse an operation symbol
         do string "+"
            return Add
        +++ 
         do string "*"
            return Mul

这里有一个示例运行:

*Main> parseExpr "(5*3)+1"
Right (Add (Mul (I 5) (I 3)) (I 1))

要了解有关解析器组合器的更多信息,请参阅 Graham Hutton 的书“Programming in Haskell”“真实世界 Haskell”第 16 章

许多解析器组合器库可以按照您的意愿与不同的标记类型一起使用。令牌流通常表示为令牌列表[Token]

You want to parse something into an abstract syntax tree.

In the purely functional programming language Haskell, you can use parser combinators to express your grammar. Here an example that parses a tiny expression language:

EDIT Use monadic style to match Graham Hutton's book

    -- import a library of *parser combinators*
import Parsimony
import Parsimony.Char
import Parsimony.Error
(+++) = (<|>)

    -- abstract syntax tree
data Expr = I Int
          | Add Expr Expr
          | Mul Expr Expr
          deriving (Eq,Show)

    -- parse an expression
parseExpr :: String -> Either ParseError Expr
parseExpr = Parsimony.parse pExpr
    where

        -- grammar
    pExpr :: Parser String Expr
    pExpr = do
        a <- pNumber +++ parentheses pExpr  -- first argument
        do
            f <- pOp                        -- operation symbol
            b <- pExpr                      -- second argument
            return (f a b)
         +++ return a                       -- or just the first argument

    parentheses parser = do                 -- parse inside parentheses
        string "("
        x <- parser
        string ")"
        return x

    pNumber = do                            -- parse an integer
        digits <- many1 digit
        return . I . read $ digits

    pOp =                                   -- parse an operation symbol
         do string "+"
            return Add
        +++ 
         do string "*"
            return Mul

Here an example run:

*Main> parseExpr "(5*3)+1"
Right (Add (Mul (I 5) (I 3)) (I 1))

To learn more about parser combinators, see for example chapter 8 of Graham Hutton's book "Programming in Haskell" or chapter 16 of "Real World Haskell".

Many parser combinator library can be used with different token types, as you intend to do. Token streams are usually represented as lists of tokens [Token].

十年不长 2024-09-20 00:58:26

一定要检查一元解析器组合器方法;我已经在博客中介绍了它 C#F# 中。

Definitely check out the monadic parser combinator approach; I've blogged about it in C# and in F#.

天冷不及心凉 2024-09-20 00:58:26

Eric Lippert 的 博客系列不可变的二叉树可能会有所帮助。显然,您需要一棵非二叉树,但它会给您总体思路。

Eric Lippert's blog series on immutable binary trees may be helpful. Obviously, you need a tree which is not binary, but it will give you the general idea.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文