String manipulation in C

Posted 2024-09-02 10:39:13

I am helping my nephew with his C lab homework. It is a string manipulation assignment that applies Wang's algorithm.

Here is the BNF representation for the input.

<s> ::= <l> # <r>

<l> ::= <list>| ε
<r> ::= <list>| ε
<list> ::= <f>|<f> , <list>
<f> ::= <letter>| - <f>| (<f><op><f>)
<op> ::= & | | | >
<letter> ::= A |... | Z

What is the best practice for handling and parsing this kind of input in C? How can I parse this structure without using a struct? Thanks in advance.

趁微风不噪 2024-09-09 10:39:13

The most straightforward approach is to make every rule (or "production") a function. This is called a "recursive descent" parser.

Also write two routines: one that peeks at the next character without consuming it, and one that gets (consumes) it.
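A minimal sketch of those two routines in C, assuming the whole input line is already in memory as a NUL-terminated string. The names `cursor`, `set_input`, and `skip_blanks` are made up for illustration, and ignoring blanks between tokens is an assumption (the BNF in the question does not say either way). No struct is needed: a single file-scope pointer tracks the read position.

```c
#include <assert.h>
#include <stdio.h>   /* EOF */

/* Current read position in the input (hypothetical name). */
static const char *cursor = "";

/* Point the cursor at a new NUL-terminated input string. */
static void set_input(const char *text)
{
    assert(text != NULL);
    cursor = text;
}

/* Assumption: spaces and tabs between tokens carry no meaning. */
static void skip_blanks(void)
{
    while (*cursor == ' ' || *cursor == '\t')
        cursor++;
}

/* Return the next significant character without consuming it,
   or EOF when the input is exhausted. */
static int peekchar(void)
{
    skip_blanks();
    return *cursor ? (unsigned char)*cursor : EOF;
}

/* Return the next significant character and advance past it.
   At end of input it keeps returning EOF and does not advance. */
static int nextchar(void)
{
    int c = peekchar();
    if (c != EOF)
        cursor++;
    return c;
}
```

After `set_input("A #")`, `peekchar()` yields `'A'` as often as it is called, while each `nextchar()` moves on: `'A'`, then `'#'`, then `EOF`.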

For each rule, this approach gives you code that looks something like this (in pseudocode):

// <sequent> ::= <lhs> # <rhs>
sequent()
    lhs()
    if peekchar() != '#' then error
    else poundsign = nextchar()
    rhs()


// <lhs> ::= <formulalist>| ε
// the left side is empty exactly when the next character is '#'
lhs()
    if peekchar() == '#'
        return
    else
       formulalist()

// <rhs> ::= <formulalist>| ε
rhs()
    if peekchar() == EOF
        return
    else
       formulalist()

// <formulalist> ::= <formula>|<formula> , <formulalist>
formulalist()
   formula()
   if peekchar() is ','
       comma = nextchar()
       return formulalist()

// <formula> ::= <letter>| - <formula>| (<formula><infixop><formula>)
formula()
    next = peekchar()
    if next in A..Z
        letter()
    else if next is -
        minus_sign = nextchar()
        return formula()
    else if next is (
        open_paren = nextchar()
        formula()
        infixop()
        formula()
        if nextchar() is not ) then error
    else
        error

// <infixop> ::= & | | | >
infixop()
    c = nextchar()
    if c not in &,|,> then error

// <letter> ::= A | B | ... | Z
letter()
    c = nextchar()
    if c not A..Z then error

and so forth, for each rule.

The general idea:

Each rule is a function.
At certain points the function peeks ahead to decide what to do. For example, formula() checks whether the first character is a minus sign.
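The pseudocode can be turned into C fairly directly. The following is a sketch under stated assumptions, not a definitive solution to the assignment: it only recognizes (validates) a sequent and returns 1 or 0, blanks between tokens are ignored, letters are assumed to be ASCII, and the names `cursor` and `parse_sequent` are hypothetical. It uses no struct; the only state is one file-scope pointer.

```c
#include <assert.h>
#include <stdio.h>   /* EOF */

static const char *cursor = "";   /* current read position (hypothetical name) */

/* Look at the next non-blank character without consuming it. */
static int peekchar(void)
{
    while (*cursor == ' ')        /* assumption: blanks between tokens are ignored */
        cursor++;
    return *cursor ? (unsigned char)*cursor : EOF;
}

/* Consume and return the next non-blank character (EOF at end of input). */
static int nextchar(void)
{
    int c = peekchar();
    if (c != EOF)
        cursor++;
    return c;
}

static int formula(void);         /* forward declaration: formula() is recursive */

/* <letter> ::= A | ... | Z   (range test assumes ASCII) */
static int letter(void)
{
    int c = nextchar();
    return c >= 'A' && c <= 'Z';
}

/* <op> ::= & | | | > */
static int infixop(void)
{
    int c = nextchar();
    return c == '&' || c == '|' || c == '>';
}

/* <f> ::= <letter> | - <f> | ( <f> <op> <f> ) */
static int formula(void)
{
    int c = peekchar();
    if (c == '-') {
        nextchar();               /* consume '-' */
        return formula();
    }
    if (c == '(') {
        nextchar();               /* consume '(' */
        return formula() && infixop() && formula() && nextchar() == ')';
    }
    return letter();              /* anything else must be a single letter */
}

/* <list> ::= <f> | <f> , <list>   (written as a loop instead of recursion) */
static int formulalist(void)
{
    if (!formula())
        return 0;
    while (peekchar() == ',') {
        nextchar();               /* consume ',' */
        if (!formula())
            return 0;
    }
    return 1;
}

/* <s> ::= <l> # <r>, where either side may be empty */
static int sequent(void)
{
    if (peekchar() != '#' && !formulalist())
        return 0;                 /* non-empty left side must be a valid list */
    if (nextchar() != '#')
        return 0;
    if (peekchar() != EOF && !formulalist())
        return 0;                 /* non-empty right side must be a valid list */
    return peekchar() == EOF;     /* reject trailing garbage */
}

/* Return 1 if `text` is a well-formed sequent, 0 otherwise. */
static int parse_sequent(const char *text)
{
    assert(text != NULL);
    cursor = text;
    return sequent();
}
```

For example, `parse_sequent("A,(B&C)#-D")` and `parse_sequent("#")` return 1, while `parse_sequent("(A&B#C")` returns 0 because the closing parenthesis is missing. If the line comes from `fgets`, strip the trailing newline first, since this sketch treats it as an invalid character. To actually run Wang's algorithm you would also need to remember where each formula starts and ends; one struct-free option is to have each function return or record character offsets into the original string.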
