Providing multiple lexers for a single parser with PLY?

Posted on 2024-08-10 10:41:29

I'm trying to implement a Python parser using PLY for the Kconfig language, which is used to generate the configuration options for the Linux kernel.

There's a keyword called source which performs an inclusion, so when the lexer encounters this keyword, I change the lexer state and create a new lexer that will lex the sourced file:

    import ply.lex as lex

    def t_begin_source(t):
        r'source '
        # switch to the 'source' lexer state to capture the file path
        t.lexer.begin('source')

    def t_source_path(t):
        r'[^\n]+\n+'
        t.lexer.begin('INITIAL')
        global path
        # build a second lexer and feed it the sourced file
        source_lexer = lex.lex(errorlog=lex.NullLogger())
        source_file_name = path + t.value.strip(' "\n')
        sourced_file = open(source_file_name).read()

        source_lexer.input(sourced_file)

        # the sourced file's tokens are consumed here,
        # but they never reach the parser
        while True:
            tok = source_lexer.token()
            if not tok:
                break

Somewhere else I have this line

lexer = lex.lex(errorlog=lex.NullLogger()) 

This is the "main" or "root" lexer which is going to be called by the parser.

My problem is that I don't know how to tell the parser to use a different lexer or to tell the "source_lexer" to return something...

Maybe the clone function should be used...

Thanks

Comments (3)

も星光 2024-08-17 10:41:30

By an interesting coincidence, a link from the same Google search that led me to this question explains how to write your own lexer for a PLY parser. The post explains it simply and well: it's a matter of four instance variables and a single token() method.
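
In that spirit, here is a minimal sketch of such a hand-rolled lexer (my illustration, not the linked post's code): the parser only needs an object with a token() method, and each token it returns only needs type, value, lineno and lexpos attributes:

    class FakeToken(object):
        # the four attributes PLY's parser expects on every token
        def __init__(self, type, value, lineno, lexpos):
            self.type = type
            self.value = value
            self.lineno = lineno
            self.lexpos = lexpos

    class ListLexer(object):
        # replays a pre-built token list; token() must return None at the end
        def __init__(self, toks):
            self.toks = iter(toks)

        def token(self):
            return next(self.toks, None)

    # usage: result = yacc.yacc().parse(lexer=ListLexer(token_list))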

蓝咒 2024-08-17 10:41:30

OK,

so what I've done is build a list of all the tokens before the actual parsing.

The parser no longer calls the lexer directly, because you can override the get-token function it uses via the tokenfunc parameter when calling the parse function:

result = yacc.parse(kconfig, debug=1, tokenfunc=my_function)

My function, which is now the one called to fetch each next token, iterates over the previously built token list.
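
Something along these lines (a sketch of the idea; the index bookkeeping is mine, not the original poster's exact code):

    token_list = []   # filled beforehand by the lexing pass
    token_index = 0

    def my_function():
        # called by the parser for every token; None signals end of input
        global token_index
        if token_index >= len(token_list):
            return None
        tok = token_list[token_index]
        token_index += 1
        return tok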

As for the lexing, when I encounter a source keyword, I clone my lexer and change its input to the sourced file.

    def sourcing_file(source_file_name):
        print("SOURCE FILE NAME", source_file_name)
        sourced_file = open(source_file_name).read()
        # clone() shares the lexing rules but keeps independent state
        source_lexer = lexer.clone()
        source_lexer.input(sourced_file)
        print('END OF SOURCING FILE')

        # append the sourced file's tokens to the global token list
        while True:
            tok = source_lexer.token()
            if not tok:
                break
            token_list.append(tok)
风筝有风,海豚有海 2024-08-17 10:41:29

I don't know about the details of PLY, but in other systems like this that I've built, it made the most sense to have a single lexer which managed the stack of include files. So the lexer would return a unified stream of tokens, opening and closing include files as they were encountered.
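
Translated to PLY, that idea might look roughly like this (a sketch of the stack approach, untested; push_source would be called from the rule that recognizes the source keyword):

    class StackedLexer(object):
        # a single lexer front end that manages a stack of include files
        def __init__(self, root_lexer, text):
            self.stack = []
            self.current = root_lexer
            self.current.input(text)

        def push_source(self, filename):
            # suspend the current file and start lexing the included one
            self.stack.append(self.current)
            self.current = self.current.clone()
            self.current.input(open(filename).read())

        def token(self):
            # pop back to the including file when an included one runs out
            while True:
                tok = self.current.token()
                if tok is not None:
                    return tok
                if not self.stack:
                    return None
                self.current = self.stack.pop()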
