String manipulation in C

Posted on 2024-09-02 10:39:13

I am helping my nephew with his C lab homework. It is a string manipulation assignment that applies Wang's algorithm.

Here is the BNF representation for the input.

<s> ::= <l> # <r>

<l> ::= <list>| ε
<r> ::= <list>| ε
<list> ::= <f>|<f> , <list>
<f> ::= <letter>| - <f>| (<f><op><f>)
<op> ::= & | | | >
<letter> ::= A |... | Z

What is the best practice for handling and parsing this kind of input in C? How can I parse this structure without using a struct? Thanks in advance.

Comments (2)

趁微风不噪 2024-09-09 10:39:13

The most straightforward approach is to make every rule (or "production") a function. This is called a "recursive descent" parser.

Write two routines: one that peeks at the next character without consuming it, and one that gets (consumes) it.
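As a rough illustration, the two routines could look like this in C when the whole input is already in a string. The names `cursor` and `set_input` are assumptions made for this sketch, and it assumes the input contains no whitespace:

```c
#include <assert.h>
#include <stdio.h>   /* EOF, NULL */

/* Hypothetical cursor into the input string (the name is an assumption). */
static const char *cursor;

/* Point the reader at a NUL-terminated input string. */
static void set_input(const char *s)
{
    assert(s != NULL);
    cursor = s;
}

/* Return the next character without consuming it, or EOF at the end. */
static int peekchar(void)
{
    return *cursor != '\0' ? (unsigned char)*cursor : EOF;
}

/* Consume and return the next character, or EOF at the end. */
static int nextchar(void)
{
    int c = peekchar();
    if (c != EOF)
        cursor++;
    return c;
}
```

After `set_input("A#")`, `peekchar()` returns 'A' as often as it is called, while each `nextchar()` call moves one character forward and eventually returns EOF. No struct is needed; a single file-scope pointer holds the parser position.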

With those two routines in place, each rule turns into code that looks something like this (in pseudocode):

// <sequent> ::= <lhs> # <rhs>
sequent()
    lhs()
    if peekchar() != '#' then error
    else poundsign = nextchar()
    rhs()


// <lhs> ::= <formulalist>| ε
lhs()
    // an empty <lhs> is followed directly by '#'
    if peekchar() == '#'
        return
    else
        formulalist()

// <rhs> ::= <formulalist>| ε
rhs()
    if peekchar() == EOF
        return
    else
       formulalist()

// <formulalist> ::= <formula>|<formula> , <formulalist>
formulalist()
   formula()
   if peekchar() is ','
       comma = nextchar()
       return formulalist()

// <formula> ::= <letter>| - <formula>| (<formula><infixop><formula>)
formula()
    next = peekchar()
    if next in A..Z
        letter()
    else if next is -
        minus_sign = nextchar()
        return formula()
    else if next is (
        open_paren = nextchar()
        formula()
        infixop()
        formula()
        if nextchar() is not ) then error
    else
        error

// <infixop> ::= & | | | >
infixop()
    c = nextchar()
    if c not in &,|,> then error

// <letter> ::= A | B | ... | Z
letter()
    c = nextchar()
    if c not A..Z then error

and so forth, for each rule.

The general idea:

  • Each rule is a function.
  • At certain points the function peeks ahead to decide what to do. For example, formula() checks whether the first character is a minus sign.
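Put together, the idea might translate into C roughly as follows. This is only a sketch of a recognizer (it reports whether the input matches the grammar and does not build a tree), it uses two file-scope variables instead of a struct, and it assumes the input has no whitespace. The names `cursor`, `failed`, `expect` and `parse_sequent` are made up for this example:

```c
#include <assert.h>
#include <stdio.h>   /* EOF, NULL */

static const char *cursor;  /* next unread character (hypothetical name) */
static int failed;          /* set to 1 on the first syntax error */

static int peekchar(void) { return *cursor ? (unsigned char)*cursor : EOF; }
static int nextchar(void) { int c = peekchar(); if (c != EOF) cursor++; return c; }

/* Consume one character and record an error unless it is the expected one. */
static void expect(int want)
{
    if (nextchar() != want)
        failed = 1;
}

/* <infixop> ::= & | | | > */
static void infixop(void)
{
    int c = nextchar();
    if (c != '&' && c != '|' && c != '>')
        failed = 1;
}

/* <formula> ::= <letter> | - <formula> | ( <formula> <infixop> <formula> ) */
static void formula(void)
{
    int c;
    if (failed)
        return;                 /* stop descending after an error */
    c = nextchar();
    if (c >= 'A' && c <= 'Z')   /* <letter>; assumes contiguous A..Z codes */
        return;
    if (c == '-') {
        formula();
        return;
    }
    if (c == '(') {
        formula();
        infixop();
        formula();
        expect(')');
        return;
    }
    failed = 1;                 /* anything else cannot start a formula */
}

/* <formulalist> ::= <formula> | <formula> , <formulalist> */
static void formulalist(void)
{
    formula();
    while (!failed && peekchar() == ',') {
        nextchar();             /* consume the comma */
        formula();
    }
}

/* <sequent> ::= <lhs> # <rhs>; returns 1 if the whole input matches. */
static int parse_sequent(const char *input)
{
    assert(input != NULL);
    cursor = input;
    failed = 0;
    if (peekchar() != '#')      /* <lhs> is empty when '#' comes first */
        formulalist();
    expect('#');
    if (peekchar() != EOF)      /* <rhs> is empty at end of input */
        formulalist();
    if (peekchar() != EOF)      /* leftover characters are an error */
        failed = 1;
    return !failed;
}
```

Calling `parse_sequent("A,(B&C)#-D")` accepts the input, while `parse_sequent("(A|B")` rejects it because the closing parenthesis and the '#' are missing. The recursive list rule is written as a loop here, which accepts the same inputs as the recursive version. To apply Wang's algorithm the functions would also need to record what they parsed, which this sketch leaves out.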
吃→可爱长大的 2024-09-09 10:39:13

As you already have your BNF, the simplest way to parse this kind of input would be to use a parser generator. But since this is homework, I'm not sure whether using a generator is allowed.

Nevertheless, you can also use a hand-written parser. Just do a search for recursive descent parsers...
