How do I name and organize the methods used by a finite state machine?

Posted 2024-09-16 04:50:52

In the following code you'll see a simple lexer that conforms to the following regular expression:

 \d*(\.\d*)?([eE]([+-]\d+|\d+))?

If I were to use this design for anything more complex, the anonymous delegates would be a nightmare to maintain. My biggest challenge is what to name the methods that act as choice points in the state machine. In the variable exponentPart, the last anonymous delegate passed to MatchOne decides whether we have a signed integer, an unsigned integer, or a false match. Please post any ideas on how I could organize such a project, assuming a complex language with lots of shared symbols.

static void Main(string[] args)
{
    var exponentPart =
        Lex.Start()
        .MatchOne(s => s.Continue(s.Current == 'e' || s.Current == 'E'))
        .MatchOne(
            s => // What would I name this?
            {
                if (char.IsDigit(s.Current))
                {
                    return Lex.Start().MatchZeroOrMore(s1 => s1.Continue(char.IsDigit(s1.Current)))(s.Continue(true));
                }
                else if (s.Current == '+' || s.Current == '-')
                {
                    return Lex.Start().MatchOneOrMore(s1 => s1.Continue(char.IsDigit(s1.Current)))(s.Continue(true));
                }
                else
                {
                    return s.RememberedState();
                }
            }
        );

    var fractionalPart =
        Lex.Start()
        .MatchOne(s => s.Continue(s.Current == '.'))
        .MatchOneOrMore(s1 => s1.Continue(char.IsDigit(s1.Current)))
        .Remember()
        .MatchOne(exponentPart);

    var decimalLiteral =
        Lex.Start()
        .MatchOneOrMore(s => s.Continue(char.IsDigit(s.Current)))
        .Remember()
        .MatchOne(
            s => // What would I name this?
            {
                if (s.Current == '.')
                {
                    return fractionalPart(s);
                }
                else if (s.Current == 'e' || s.Current == 'E')
                {
                    return exponentPart(s);
                }
                else
                {
                    return s.RememberedState();
                }
            }
        );

    var input = "999.999e+999";
    var result = decimalLiteral(new LexState(input, 0, 0, 0, true));

    Console.WriteLine(result.Value.Substring(result.StartIndex, result.EndIndex - result.StartIndex + 1));
    Console.ReadLine();
}

Comments (1)

听你说爱我 2024-09-23 04:50:52

When trying to write some sort of parser, you should first divide your expression into rules and terminals. Then you can name the methods by the rules they check. For example, something along the lines of:

<literal> := <fractional> | <fractional_with_exponent>
<fractional> := \d*(\.\d*)?
<fractional_with_exponent> := <fractional><exponent>
<exponent> := [eE]([+-]\d+|\d+)

This would give you methods named Literal(), Fractional(), FractionalWithExponent(), and Exponent(), each able to recognize or reject its own rule. Literal() would call Fractional() and FractionalWithExponent() and accept whichever one does not reject, and so on.
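
To make the naming scheme concrete, here is a minimal sketch of what rule-named methods could look like. It does not use the asker's Lex API. It assumes a much simpler convention in which each method takes the input and a start index and returns the index just past the match, or -1 to reject. The class name NumberRules, the helper Digits, and the constant Reject are hypothetical names introduced only for this illustration.

```csharp
using System;

// Hypothetical sketch: every public method is named after the grammar rule it checks.
// Convention (an assumption, not the asker's Lex API): return the index just past
// the match, or Reject (-1) when the rule does not match at the given position.
public static class NumberRules
{
    public const int Reject = -1;

    // Terminal helper for \d* : never rejects, may consume nothing.
    private static int Digits(string input, int pos)
    {
        while (pos < input.Length && char.IsDigit(input[pos])) pos++;
        return pos;
    }

    // <fractional> := \d*(\.\d*)?
    public static int Fractional(string input, int pos)
    {
        pos = Digits(input, pos);
        if (pos < input.Length && input[pos] == '.')
            pos = Digits(input, pos + 1);
        return pos;
    }

    // <exponent> := [eE]([+-]\d+|\d+)
    public static int Exponent(string input, int pos)
    {
        if (pos >= input.Length || (input[pos] != 'e' && input[pos] != 'E'))
            return Reject;
        pos++;
        if (pos < input.Length && (input[pos] == '+' || input[pos] == '-'))
            pos++;
        int end = Digits(input, pos);
        return end > pos ? end : Reject; // at least one digit is required
    }

    // <fractional_with_exponent> := <fractional><exponent>
    public static int FractionalWithExponent(string input, int pos)
    {
        int afterFraction = Fractional(input, pos); // Fractional never rejects
        return Exponent(input, afterFraction);
    }

    // <literal> := <fractional> | <fractional_with_exponent>
    // Tries the longer alternative first, then falls back to the shorter one.
    public static int Literal(string input, int pos)
    {
        int end = FractionalWithExponent(input, pos);
        return end != Reject ? end : Fractional(input, pos);
    }
}
```

With this sketch, NumberRules.Literal("999.999e+999", 0) returns 12, the length of the input. One design choice worth noting: because every <fractional> is a prefix of a <fractional_with_exponent>, this version of Literal() tries the longer alternative first so that a trailing exponent is not left unconsumed.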
