Boost.Spirit: forward declaration problem

Posted on 2024-08-09 13:48:03

Could someone please give me some advice or ideas on how to handle situations where the parser needs to look at later declarations in order to perform the correct semantic actions at the current point? This is a well-known issue when writing an interpreter or compiler for a programming language that doesn't support "forward declarations". Here is an example:

foo(123); // <-- Our parser is here. We assume this is a function
          //     invocation, but we know nothing about foo's declaration/prototype,
          //     so we can't be sure that "foo" takes one integer argument.


void foo(int i){
//...
}

It is pretty clear we have to have at least two passes. First we parse all function declarations and collect the information we need, such as the number of arguments each function takes and their types; then we can handle function invocations and resolve difficulties like the one above. If we go this way, we have to do all these passes using some AST traversal mechanism or visitors. In that case we must say goodbye to all the beauty of Phoenix constructs integrated directly into our parsers.

How would you deal with this?

Comments (3)

游魂 2024-08-16 13:48:03

[2nd answer, on semantics]
This particular example happens to be simple. What you can do is record function calls made to not-yet-declared functions, along with the actual argument types. When you encounter a function declaration later, you check whether there are preceding function calls that are (better) matched to this new function. You will obviously detect errors only at the end of the parse, because the very last line could introduce a missing function. But after that line, any function call that hasn't been matched at all is an error.

Now, the problem is that this works for simple semantics. If you look at more complex languages, e.g. with C++-like function templates, it is no longer possible to do such lookups in a simple table. You would need specialized tables that structurally match your language constructs. An AST just isn't the best structure for those, let alone the partial AST during parsing.
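As a rough illustration of the bookkeeping for the simple case, here is a minimal C++ sketch. It assumes no overloading and exact type matching, so it does not attempt the "(better) matched" selection. `DeferredResolver`, `PendingCall`, `onCall`, `onDeclaration` and `finish` are hypothetical names invented for this example, not part of Spirit or Phoenix, and types are represented as plain strings.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: a call seen before its function was declared.
struct PendingCall {
    std::string name;
    std::vector<std::string> argTypes;
};

// Hypothetical single-pass checker (assumes no overloading, exact type match).
class DeferredResolver {
public:
    // Semantic action for a call: check it now if the function is already
    // declared, otherwise remember it for later.
    void onCall(const std::string& name, const std::vector<std::string>& argTypes) {
        auto it = declared_.find(name);
        if (it == declared_.end()) {
            pending_.push_back({name, argTypes});
        } else if (it->second != argTypes) {
            errors_.push_back("argument mismatch in call to " + name);
        }
    }

    // Semantic action for a declaration: record it, then settle every
    // earlier call that was waiting for this name.
    void onDeclaration(const std::string& name, const std::vector<std::string>& paramTypes) {
        declared_[name] = paramTypes;
        std::vector<PendingCall> stillPending;
        for (const auto& call : pending_) {
            if (call.name != name) {
                stillPending.push_back(call);
            } else if (call.argTypes != paramTypes) {
                errors_.push_back("argument mismatch in call to " + name);
            }
        }
        pending_ = std::move(stillPending);
    }

    // Called once at end of input: anything still pending was never declared.
    std::vector<std::string> finish() {
        for (const auto& call : pending_) {
            errors_.push_back("call to undeclared function " + call.name);
        }
        pending_.clear();
        return errors_;
    }

private:
    std::map<std::string, std::vector<std::string>> declared_;
    std::vector<PendingCall> pending_;
    std::vector<std::string> errors_;
};
```

For the example in the question, the parser would call `onCall("foo", {"int"})` first and `onDeclaration("foo", {"int"})` afterwards, and `finish()` would then report no errors.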

青萝楚歌 2024-08-16 13:48:03

If you want to do two passes, instead of semantic checking at the end of the first pass, you can have the functions called by your actions know which pass they are in. So if you had some actions:

[functionCall(name, args)]
[functionDef(name, args, body)]

They would be defined something like this (not proper Spirit syntax, but you get the point):

functionCall(string name, vector<string> args)
{
  if (!first_pass) {
    // check args for validity
    // whatever else you need to do
  }
} 

functionDef(string name, vector<string> args, ... body)
{
  if (first_pass)
    // Add function declaration to symbol table
  else
    // define function
}
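A more complete version of the same idea could look roughly like the sketch below. It is plain C++ rather than Spirit code: the `Parse` callback is an assumed stand-in for "run the parser over the whole input", and `Actions`, `runTwoPasses`, and the string-based type representation are hypothetical names chosen for illustration. Function bodies are left out.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical action object shared by both passes.
struct Actions {
    bool first_pass = true;
    std::map<std::string, std::vector<std::string>> symbols;  // name -> parameter types
    std::vector<std::string> errors;

    void functionCall(const std::string& name, const std::vector<std::string>& args) {
        if (first_pass) return;  // calls are ignored while declarations are collected
        auto it = symbols.find(name);
        if (it == symbols.end()) {
            errors.push_back("call to undeclared function " + name);
        } else if (it->second != args) {
            errors.push_back("argument mismatch in call to " + name);
        }
    }

    void functionDef(const std::string& name, const std::vector<std::string>& args) {
        if (first_pass) {
            symbols[name] = args;  // add the declaration to the symbol table
        }
        // On the second pass the function body would be processed here.
    }
};

// Assumed stand-in for running the parser once; it fires the actions in source order.
using Parse = std::function<void(Actions&)>;

// Runs the same parse twice, flipping the pass flag in between.
std::vector<std::string> runTwoPasses(const Parse& parse) {
    Actions actions;
    actions.first_pass = true;
    parse(actions);
    actions.first_pass = false;
    parse(actions);
    return actions.errors;
}
```

The cost of this design is that the input is parsed twice; in exchange, the semantic actions stay attached to the grammar and no separate AST traversal is needed.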

南风起 2024-08-16 13:48:03

I think you're making unfounded assumptions. For instance, "it is pretty clear we have to have at least two passes". No, it isn't. If the syntax is such that foo(123) can only be parsed as function-name "(" expression ")", then one pass is enough.

Therefore I would advise designing your syntax for unambiguous parsing. Avoid constructs that cannot be parsed in isolation, e.g. avoid dependencies on declarations elsewhere.
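To make the point concrete, here is a small hand-written C++ sketch that recognizes a call statement without consulting any symbol table. It assumes a deliberately tiny grammar in which the expression is just an unsigned integer literal and the statement ends with a semicolon; `Call` and `parseCall` are hypothetical names, and nothing here uses Spirit.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result of a purely syntactic parse: no declaration is consulted.
struct Call {
    std::string name;
    long argument;
};

// Recognizes: identifier "(" integer ")" ";"
// Returns true and fills `out` on success, false otherwise.
bool parseCall(const std::string& src, Call& out) {
    std::size_t i = 0;
    auto isSpace = [](char c) { return std::isspace(static_cast<unsigned char>(c)) != 0; };
    auto isDigit = [](char c) { return std::isdigit(static_cast<unsigned char>(c)) != 0; };
    auto isIdent = [](char c) { return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_'; };
    auto skipSpaces = [&] { while (i < src.size() && isSpace(src[i])) ++i; };
    auto expect = [&](char c) {
        skipSpaces();
        if (i < src.size() && src[i] == c) { ++i; return true; }
        return false;
    };

    // function-name: letters, digits, underscores, not starting with a digit
    skipSpaces();
    std::size_t start = i;
    while (i < src.size() && isIdent(src[i])) ++i;
    if (i == start || isDigit(src[start])) return false;
    std::string name = src.substr(start, i - start);

    if (!expect('(')) return false;

    // expression: restricted to an unsigned integer literal in this sketch
    skipSpaces();
    start = i;
    while (i < src.size() && isDigit(src[i])) ++i;
    if (i == start) return false;
    long value = std::stol(src.substr(start, i - start));

    if (!expect(')') || !expect(';')) return false;

    out.name = name;
    out.argument = value;
    return true;
}
```

Whether `foo` really accepts one integer is still a semantic question, but it can be answered after parsing, once all declarations are known.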
