What is an example of a lexical error, and is it possible for a language to have no lexical errors?

Posted 2024-09-14 01:13:41, 175 words, 7 views, 0 comments

For our compiler theory class, we are tasked with creating a simple interpreter for a programming language of our own design. I am using JFlex and CUP as my generators, but I'm a bit stuck on what a lexical error is. Also, is it recommended that I use the state feature of JFlex? It feels wrong, as it seems like the parser is better suited to handling that aspect. And do you recommend any other tools for creating the language? I'm sorry if I'm impatient, but it's due on Tuesday.

Comments (4)

玉环 2024-09-21 01:13:41

A lexical error is any input that can be rejected by the lexer. This generally results from token recognition falling off the end of the rules you've defined. For example (in no particular syntax):

[0-9]+   ===> NUMBER token
[a-zA-Z]+ ===> LETTERS token
anything else ===> error!

If you think about a lexer as a finite state machine that accepts valid input strings, then errors are going to be any input strings that do not result in that finite state machine reaching an accepting state.
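As a rough illustration of the rules above, here is a minimal hand-written lexer in Java. The class name `TinyLexer`, the token spelling, and the decision to skip whitespace are all assumptions made for this sketch; a JFlex-generated scanner would look different.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written lexer for the NUMBER / LETTERS rules; not JFlex output.
class TinyLexer {
    static boolean isDigit(char c) { return c >= '0' && c <= '9'; }
    static boolean isLetter(char c) { return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'); }

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            int start = i;
            if (Character.isWhitespace(c)) {
                i++; // assumption: whitespace separates tokens and is skipped
            } else if (isDigit(c)) {
                while (i < input.length() && isDigit(input.charAt(i))) i++;
                tokens.add("NUMBER(" + input.substring(start, i) + ")");
            } else if (isLetter(c)) {
                while (i < input.length() && isLetter(input.charAt(i))) i++;
                tokens.add("LETTERS(" + input.substring(start, i) + ")");
            } else {
                // "anything else": no rule matches, so this is a lexical error
                throw new IllegalArgumentException(
                    "Lexical error at index " + i + ": '" + c + "'");
            }
        }
        return tokens;
    }
}
```

Calling `TinyLexer.tokenize("abc 123")` yields two tokens, while `TinyLexer.tokenize("12$")` throws because no rule covers `$`.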

The rest of your question was rather unclear to me. If you are already using some tools, you are probably best off learning how to achieve what you want with those tools (I have no experience with either of the tools you mentioned).

EDIT: Having re-read your question, there's a second part I can answer. It is possible for a language to have no lexical errors: it would be a language in which any input string at all is valid input.
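One way to picture such a language is a lexer with a catch-all rule, so that every character ends up in some token. The sketch below is only an illustration; the class name `TotalLexer` and the `OTHER` token are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical lexer with a catch-all rule: it never rejects any input.
class TotalLexer {
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            int start = i;
            char c = input.charAt(i);
            if (c >= '0' && c <= '9') {
                // digits are grouped into a NUMBER token
                while (i < input.length()
                        && input.charAt(i) >= '0' && input.charAt(i) <= '9') {
                    i++;
                }
                tokens.add("NUMBER(" + input.substring(start, i) + ")");
            } else {
                // catch-all: any other single character becomes its own token
                i++;
                tokens.add("OTHER(" + c + ")");
            }
        }
        return tokens;
    }
}
```

With this design, errors can still be reported later (for instance by a parser that refuses `OTHER` tokens), but the lexer itself accepts every string.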

饭团 2024-09-21 01:13:41

A lexical error could be a character that the language considers invalid or unacceptable, such as '@' inside a Java identifier (the character is reserved, so it can never be part of an identifier).

Lexical errors are the errors thrown by your lexer when it is unable to continue, which means there is no way to recognise a lexeme as a valid token for your lexer. Syntax errors, on the other hand, are thrown by your parser when a sequence of already recognised, valid tokens doesn't match any of the right-hand sides of your grammar rules.
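To make the distinction concrete, here is a small sketch with a hand-written lexer and parser for sums such as `1 + 2`. The grammar (`NUM ('+' NUM)*`), the class names, and the exception types are all assumptions chosen for the example, not anything produced by JFlex or CUP.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical exception types, used only to tell the two failure modes apart.
class LexicalError extends RuntimeException {
    LexicalError(String msg) { super(msg); }
}

class SyntaxError extends RuntimeException {
    SyntaxError(String msg) { super(msg); }
}

class SumChecker {
    // Lexer: characters -> tokens ("NUM" or "+"); fails on unknown characters.
    static List<String> lex(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == ' ') {
                i++;
            } else if (c == '+') {
                tokens.add("+");
                i++;
            } else if (c >= '0' && c <= '9') {
                while (i < input.length()
                        && input.charAt(i) >= '0' && input.charAt(i) <= '9') {
                    i++;
                }
                tokens.add("NUM");
            } else {
                throw new LexicalError("unexpected character '" + c + "' at index " + i);
            }
        }
        return tokens;
    }

    // Parser: checks the token sequence against NUM ('+' NUM)*.
    static void parse(List<String> tokens) {
        int pos = 0;
        if (pos >= tokens.size() || !tokens.get(pos).equals("NUM")) {
            throw new SyntaxError("expected a number at token " + pos);
        }
        pos++;
        while (pos < tokens.size()) {
            if (!tokens.get(pos).equals("+")) {
                throw new SyntaxError("expected '+' at token " + pos);
            }
            pos++;
            if (pos >= tokens.size() || !tokens.get(pos).equals("NUM")) {
                throw new SyntaxError("expected a number at token " + pos);
            }
            pos++;
        }
    }

    static String check(String input) {
        try {
            parse(lex(input));
            return "ok";
        } catch (LexicalError e) {
            return "lexical error";
        } catch (SyntaxError e) {
            return "syntax error";
        }
    }
}
```

Here `1 + $` fails in the lexer because `$` matches no token rule, whereas `1 + + 2` consists entirely of valid tokens and fails only in the parser.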

"it feels wrong as it seems like the parser is better suited to handling that aspect"

No. It seems that way because context-free languages include regular languages (meaning that a parser can do the work of a lexer). But consider that a parser is a pushdown (stack) automaton, so you would be spending extra resources (the stack) to recognise something that doesn't require a stack to be recognised (a regular expression). That would be a suboptimal solution.

NOTE: by regular expression, I mean... regular expression in the Chomsky Hierarchy sense, not a java.util.regex.* class.
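To illustrate the point about not needing a stack, here is a sketch of a finite automaton written by hand. The pattern it recognises (letters followed by optional digits) and the class name `IdentifierDfa` are assumptions picked for the example; the only memory it uses is a single state variable.

```java
// Hypothetical DFA for letters followed by optional digits, e.g. "abc12".
class IdentifierDfa {
    static boolean accepts(String s) {
        int state = 0; // 0 = start, 1 = reading letters, 2 = reading digits, -1 = dead
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean letter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
            boolean digit = c >= '0' && c <= '9';
            switch (state) {
                case 0:
                    state = letter ? 1 : -1;
                    break;
                case 1:
                    state = letter ? 1 : (digit ? 2 : -1);
                    break;
                case 2:
                    state = digit ? 2 : -1;
                    break;
                default:
                    return false; // dead state: no rule can match any more
            }
        }
        // states 1 and 2 are the accepting states
        return state == 1 || state == 2;
    }
}
```

No stack or recursion is involved, which is why this kind of recognition is usually left to the lexer rather than the parser.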

帅哥哥的热头脑 2024-09-21 01:13:41

A lexical error is when the input doesn't belong to any of these lists:
keywords: "if", "else", "main"...
symbols: '=', '+', ';'...
double symbols: ">=", "<=", "!=", "++"
variables: [a-zA-Z]+[0-9]*
numbers: [0-9]+

Examples:
9var: error, digits before letters; it is neither a variable nor a keyword.
$: error
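The lists above could be checked with something like the following sketch. It assumes the variable pattern is meant as `[a-zA-Z]+[0-9]*` and the number pattern as `[0-9]+`, uses only the keywords and symbols actually listed (the "..." entries are left out), and the class name `LexemeClassifier` is made up for the example. It needs Java 9 or later for `Set.of`.

```java
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical classifier for a single lexeme, following the lists above.
class LexemeClassifier {
    static final Set<String> KEYWORDS = Set.of("if", "else", "main");
    static final Set<String> SYMBOLS = Set.of("=", "+", ";");
    static final Set<String> DOUBLE_SYMBOLS = Set.of(">=", "<=", "!=", "++");
    // Assumed readings of the two patterns
    static final Pattern VARIABLE = Pattern.compile("[a-zA-Z]+[0-9]*");
    static final Pattern NUMBER = Pattern.compile("[0-9]+");

    static String classify(String lexeme) {
        // keywords are checked first, otherwise "if" would count as a variable
        if (KEYWORDS.contains(lexeme)) return "keyword";
        if (DOUBLE_SYMBOLS.contains(lexeme)) return "double symbol";
        if (SYMBOLS.contains(lexeme)) return "symbol";
        if (VARIABLE.matcher(lexeme).matches()) return "variable";
        if (NUMBER.matcher(lexeme).matches()) return "number";
        return "lexical error";
    }
}
```

Both `classify("9var")` and `classify("$")` fall through every list and are reported as lexical errors, matching the examples.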

What I don't know is whether several symbols in a row, like "+-", are accepted.

奈何桥上唱咆哮 2024-09-21 01:13:41

A compiler can catch an error when it has the grammar built into it!
Whether it has the capacity (scope) to catch lexical errors depends on the compiler itself.
Which types of lexical errors are detected, and how they are handled (according to the grammar), is decided during the development of the compiler.
Usually, all well-known and widely used compilers have this capability.
