Using PLY to provide multiple lexers for a single parser?
I'm trying to implement a Python parser using PLY for the Kconfig language, which is used to generate the configuration options for the Linux kernel.
There's a keyword called source which performs an inclusion, so what I do is that when the lexer encounters this keyword, I change the lexer state to create a new lexer that will lex the sourced file:
import ply.lex as lex

def t_begin_source(t):
    r'source '
    # Enter the 'source' state when the source keyword is seen
    t.lexer.begin('source')

def t_source_path(t):
    r'[^\n]+\n+'
    t.lexer.begin('INITIAL')
    global path
    # Build a second lexer just for the sourced file
    source_lexer = lex.lex(errorlog=lex.NullLogger())
    source_file_name = path + t.value.strip(' "\n')
    sourced_file = open(source_file_name).read()
    source_lexer.input(sourced_file)
    # The sourced file's tokens are consumed here but never
    # reach the parser; that is the problem described below
    while True:
        tok = source_lexer.token()
        if not tok:
            break
Somewhere else I have this line:
lexer = lex.lex(errorlog=lex.NullLogger())
This is the "main" or "root" lexer which is going to be called by the parser.
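(For reference, this is roughly how such a root lexer gets wired to the parser in standard PLY usage; data here stands for the top-level Kconfig text and the grammar rules are assumed to be defined elsewhere:)

import ply.lex as lex
import ply.yacc as yacc

lexer = lex.lex(errorlog=lex.NullLogger())   # the root lexer
parser = yacc.yacc()

# The parser pulls its tokens from whichever lexer object is passed here
result = parser.parse(data, lexer=lexer)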
My problem is that I don't know how to tell the parser to use a different lexer or to tell the "source_lexer" to return something...
Maybe the clone function should be used...
Thanks
3 Answers
By an interesting coincidence, a link from the same Google search that led me to this question explains how to write your own lexer for a PLY parser. The post explains it simply and well: it comes down to four instance variables and a single token method.
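(A minimal sketch of that interface; the class names are mine, but the four attributes, type, value, lineno and lexpos, are what PLY's parser expects on every token:)

class Token(object):
    def __init__(self, type, value, lineno, lexpos):
        # The four instance variables every PLY token must carry
        self.type = type
        self.value = value
        self.lineno = lineno
        self.lexpos = lexpos

class TokenStream(object):
    def __init__(self, tokens):
        self.tokens = iter(tokens)

    def token(self):
        # The single method the parser calls; None signals end of input
        return next(self.tokens, None)

Any object with such a token() method can be handed to the parser as its lexer (with no input string), since token() is all the parser ever calls on it.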
OK, so what I've done is build a list of all the tokens before the actual parsing.
The parser no longer calls the lexer, because you can override the get-token function the parser uses via the tokenfunc parameter when calling the parse function.
My function, which is now the one called to fetch the next token, iterates over the previously built token list.
As for the lexing, when I encounter a source keyword, I clone my lexer and change the input to include the sourced file.
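(A sketch of that arrangement; tokenfunc and clone() are real PLY features, but the SOURCE token name and the file handling are my guesses at the details, and data stands for the top-level file's contents:)

import ply.lex as lex
import ply.yacc as yacc

def gather_tokens(lexer, data, out):
    # Lex data to exhaustion, recursing into sourced files
    lexer.input(data)
    while True:
        tok = lexer.token()
        if tok is None:
            break
        if tok.type == 'SOURCE':              # hypothetical token name
            # Clone so the outer lexer's position is left untouched
            sub = lexer.clone()
            gather_tokens(sub, open(tok.value).read(), out)
        else:
            out.append(tok)

tokens_list = []
gather_tokens(lex.lex(errorlog=lex.NullLogger()), data, tokens_list)
it = iter(tokens_list)

parser = yacc.yacc()
# tokenfunc replaces the parser's internal get-token call
result = parser.parse(tokenfunc=lambda: next(it, None))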
I don't know about the details of PLY, but in other systems like this that I've built, it made the most sense to have a single lexer which managed the stack of include files. So the lexer would return a unified stream of tokens, opening and closing include files as they were encountered.
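(A sketch of that design with made-up names; only the token()-method convention comes from PLY, and the SOURCE token type is hypothetical:)

class IncludeStackLexer(object):
    # One lexer facade managing a stack of include files; the parser
    # sees a single uninterrupted token stream
    def __init__(self, make_lexer, path):
        self.make_lexer = make_lexer      # factory for fresh PLY lexers
        self.stack = []
        self.push(path)

    def push(self, path):
        lexer = self.make_lexer()
        lexer.input(open(path).read())
        self.stack.append(lexer)

    def token(self):
        while self.stack:
            tok = self.stack[-1].token()
            if tok is None:
                self.stack.pop()          # finished an include file
            elif tok.type == 'SOURCE':    # hypothetical include token
                self.push(tok.value)      # descend into the sourced file
            else:
                return tok
        return None                       # whole stream exhausted

Because the parser only ever sees the one token() method, include handling stays completely invisible to the grammar.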