Choosing between discriminated unions and record types for a small AST in F#

Posted 2024-12-01 10:15:04

Let's say I am implementing a very simple toy language parser. I am deciding whether to use DUs or record types (maybe a mix of both?). The structure of the language would be:

A Namespace consists of a name and a list of Classes
A Class consists of a name and a list of Methods
A Method consists of a name, a return type and a list of Arguments
An Argument consists of a type and a name

Example of a program in this simple language:

namespace ns {
  class cls1 {
    void m1() {}
  }

  class cls2 {
    void m2(int i, string j) {}
  }
}

How would you model this and why?

恏ㄋ傷疤忘ㄋ疼 2024-12-08 10:15:04

You almost surely want to use DUs to implement alternations, where any part of the code structure could be one of multiple possibilities. A mix would probably be ideal, although you could use tuples in place of records - which may make it simpler to use, but perhaps more difficult to read and maintain, because you don't have named items in the tuples.

I would model it as something like this:

type CompilationUnit = | Namespaces of Namespace list   // a union case needs a name

// Several records below share field names (Name, Body), so record
// expressions may need a type annotation to pick the intended type.
and Namespace = { Name : string
                  Body : NamespaceBody }

and NamespaceBody = | Classes of Class list

and Class = { Name : string
              Body : ClassBody }

and ClassBody = | Members of Member list

and Member = | Method of Method

and Method = { Name : string
               Parameters : Parameter list option
               ReturnType : TypeName option
               Body : MethodBody }

and Parameter = { Name : string
                  Type : TypeName }

and MethodBody = ...

and TypeName = ...

The need for DUs might not be apparent from your example language, but it becomes clear as soon as any point in the code can be one of several alternatives. For example, if you add fields to your class, you only need to add a new Field case to Member.
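To make the point about adding a Field case concrete, here is a minimal sketch. The record shapes are simplified, hypothetical stand-ins (MethodDecl, FieldDecl and their field names do not come from the model above); the field names are kept distinct so that the record expressions infer without annotations.

```fsharp
// Hypothetical, simplified stand-ins for the types discussed above.
type TypeName = TypeName of string

type MethodDecl =
    { MethodName : string
      ReturnType : TypeName option }

type FieldDecl =
    { FieldName : string
      FieldType : TypeName }

type Member =
    | Method of MethodDecl
    | Field of FieldDecl   // the newly added case

// Any match over Member now has to cover Field as well; leaving a case
// out produces an incomplete-match warning at compile time.
let describe (m : Member) =
    match m with
    | Method meth -> sprintf "method %s" meth.MethodName
    | Field fld -> sprintf "field %s" fld.FieldName

let members =
    [ Method { MethodName = "m1"; ReturnType = None }
      Field { FieldName = "count"; FieldType = TypeName "int" } ]

members |> List.map describe |> List.iter (printfn "%s")
```

The rest of the AST is untouched by the change; only functions that match on Member need a new branch.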

If you're using a grammar to parse your language (LL/LALR or similar), you'll probably need a matching DU for each alternation rule you have in the grammar.
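As an illustration of one DU per alternation rule, assume a hypothetical grammar rule for type names (this rule and the Named case are assumptions, not part of the question). Each alternative of the rule becomes one union case, and the parsing function has one branch per alternative.

```fsharp
// Hypothetical grammar rule (an assumption for illustration):
//   typeName ::= "void" | "int" | "string" | identifier
type TypeName =
    | Void
    | Int
    | String
    | Named of string

// One branch per alternative of the rule; anything that is not a
// keyword is treated as a user-defined type name.
let parseTypeName (token : string) : TypeName =
    match token with
    | "void" -> Void
    | "int" -> Int
    | "string" -> String
    | other -> Named other

let parsed = [ "void"; "int"; "cls1" ] |> List.map parseTypeName
```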

猫卆 2024-12-08 10:15:04

A Namespace consists of a name and a list of Classes
A Class consists of a name and a list of Methods
A Method consists of a name, a return type and a list of Arguments
An Argument consists of a type and a name

You also need a type definition for your type system and that is actually the only place where a union type is valuable:

type Type = Void | Int | String

So a type in your language is either an int or a string or void but cannot be nothing (e.g. null) and cannot be more than one of those options.
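A small sketch of what that guarantee buys you: a function over Type can match every case and nothing else. The keyword function is a hypothetical helper added here for illustration.

```fsharp
type Type = Void | Int | String

// Renders a Type back to its source keyword. The match is exhaustive:
// there is no null case to handle, and deleting a branch makes the
// compiler report an incomplete pattern match (warning FS0025).
let keyword ty =
    match ty with
    | Void -> "void"
    | Int -> "int"
    | String -> "string"

let keywords = [ Void; Int; String ] |> List.map keyword
```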

The type of a namespace could be entirely anonymous, like this:

string * (string * (Type * string * (Type * string) list) list) list

You could define your example namespace like this:

"ns", ["cls1", [Void, "m1", []]
       "cls2", [Void, "m2", [Int, "i"; String, "j"]]]

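For a feel of what consuming the anonymous shape looks like, here is a hedged sketch. The alias NamespaceT and the function methodNames are hypothetical names introduced only for this example; the tuples are written with explicit parentheses to keep the nesting visible.

```fsharp
type Type = Void | Int | String

// The fully anonymous shape from above, aliased only for readability.
type NamespaceT = string * (string * (Type * string * (Type * string) list) list) list

let ns : NamespaceT =
    ("ns",
     [ ("cls1", [ (Void, "m1", []) ])
       ("cls2", [ (Void, "m2", [ (Int, "i"); (String, "j") ]) ]) ])

// Consumers destructure by position; nothing names the components,
// so the reader has to remember which string is which.
let methodNames ((nsName, classes) : NamespaceT) =
    [ for (clsName, methods) in classes do
        for (_, methName, _) in methods do
            yield sprintf "%s.%s.%s" nsName clsName methName ]

let names = methodNames ns
```

Here names evaluates to the two qualified method names, ns.cls1.m1 and ns.cls2.m2.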
In practice, you probably want the ability to put namespaces in other namespaces and to put classes in classes so you might evolve the code into something like this:

type Type =
  | Void
  | Int
  | String
  | Class of Map<string, Type> * Map<string, Type * (Type * string) list>

type Namespace =
  | Namespace of string * Namespace list * Map<string, Type>

Namespace("ns", [],
          Map
            [ "cls1", Class(Map [], Map ["m1", (Void, [])])
              "cls2", Class(Map [], Map ["m2", (Void, [Int, "i"; String, "j"])])])

Anonymous types are fine as long as they won't be a source of confusion. As a rule of thumb, if you have two or three fields and they are of different types (like a "method" here) then a tuple is fine. If there are more fields or multiple fields with the same type then it is time to switch to a record type.
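The rule of thumb can be illustrated with a deliberately awkward case. In this hypothetical sketch argument types are kept as plain strings, so both components of the tuple have the same type; ArgTuple, ArgRecord and render are names invented for the example.

```fsharp
// Hypothetical: both components are strings, so the order is invisible
// to the type checker.
type ArgTuple = string * string

let good : ArgTuple = ("int", "i")
let swapped : ArgTuple = ("i", "int")   // compiles, but the meaning is wrong

// With a record, the labels document which string is which.
type ArgRecord = { ArgType : string; ArgName : string }

let arg = { ArgType = "int"; ArgName = "i" }

let render a = sprintf "%s %s" a.ArgType a.ArgName

let rendered = render arg
```

With Type * string, as in the model here, the two components already differ in type, which is why a tuple is still acceptable for arguments.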

So in this case you might want to introduce a record type for methods:

type Method =
  { ReturnType: Type
    Arguments: (Type * string) list }

and Type =
  | Void
  | Int
  | String
  | Class of Map<string, Type> * Map<string, Method>

type Namespace =
  | Namespace of string * Namespace list * Map<string, Type>

Namespace("ns", [],
          Map
            [ "cls1", Class(Map [], Map ["m1", { ReturnType = Void; Arguments = [] }])
              "cls2", Class(Map [], Map ["m2", { ReturnType = Void; Arguments = [Int, "i"; String, "j"] }])])

and maybe a helper function to construct those records:

let Method retTy name args =
  name, { ReturnType = retTy; Arguments = args }

Namespace("ns", [],
          Map
            [ "cls1", Class(Map [], Map [Method Void "m1" []])
              "cls2", Class(Map [], Map [Method Void "m2" [Int, "i"; String, "j"]])])

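As a usage sketch for the record-based model, the following walks a namespace and lists each method with its return type and arity. The function methodsOf and the intermediate values cls1, cls2 and example are hypothetical; the type definitions repeat the ones above so that the block stands alone.

```fsharp
type Method =
    { ReturnType : Type
      Arguments : (Type * string) list }

and Type =
    | Void
    | Int
    | String
    | Class of Map<string, Type> * Map<string, Method>

type Namespace =
    | Namespace of string * Namespace list * Map<string, Type>

let cls1 = Class (Map.empty, Map.ofList [ "m1", { ReturnType = Void; Arguments = [] } ])
let cls2 = Class (Map.empty, Map.ofList [ "m2", { ReturnType = Void; Arguments = [ Int, "i"; String, "j" ] } ])
let example = Namespace ("ns", [], Map.ofList [ "cls1", cls1; "cls2", cls2 ])

// Lists every method of the top-level classes as
// (qualified name, return type, number of arguments).
// Nested namespaces and nested classes are ignored in this sketch.
let methodsOf (Namespace (nsName, _, types)) =
    [ for KeyValue (clsName, ty) in types do
        match ty with
        | Class (_, methods) ->
            for KeyValue (methName, m) in methods do
                yield (sprintf "%s.%s.%s" nsName clsName methName, m.ReturnType, List.length m.Arguments)
        | Void | Int | String -> () ]

let listing = methodsOf example
```

Because the classes and methods are stored in maps, the listing comes out ordered by name rather than by declaration order; a list of pairs would be needed if source order matters.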