How to name and organize methods used by a finite state machine?
In the following code you'll see a simple lexer that conforms to the following regular expression:
\d*(\.\d*)?([eE]([+-]\d+|\d+))?
If I were to use this design for something more complex, all of the anonymous delegates would be a nightmare to maintain. The biggest challenge I am facing is what to name the methods that would act as choice points in the state machine. In the variable exponentPart, the last anonymous delegate passed to MatchOne will decide whether we have a signed integer, an integer, or a false match. Please post any ideas on how I can organize such a project, assuming a complex language with lots of shared symbols.
static void Main(string[] args)
{
    var exponentPart =
        Lex.Start()
            .MatchOne(s => s.Continue(s.Current == 'e' || s.Current == 'E'))
            .MatchOne(
                s => // What would I name this?
                {
                    if (char.IsDigit(s.Current))
                    {
                        return Lex.Start().MatchZeroOrMore(s1 => s1.Continue(char.IsDigit(s1.Current)))(s.Continue(true));
                    }
                    else if (s.Current == '+' || s.Current == '-')
                    {
                        return Lex.Start().MatchOneOrMore(s1 => s1.Continue(char.IsDigit(s1.Current)))(s.Continue(true));
                    }
                    else
                    {
                        return s.RememberedState();
                    }
                }
            );
    var fractionalPart =
        Lex.Start()
            .MatchOne(s => s.Continue(s.Current == '.'))
            .MatchOneOrMore(s1 => s1.Continue(char.IsDigit(s1.Current)))
            .Remember()
            .MatchOne(exponentPart);
    var decimalLiteral =
        Lex.Start()
            .MatchOneOrMore(s => s.Continue(char.IsDigit(s.Current)))
            .Remember()
            .MatchOne(
                s => // What would I name this?
                {
                    if (s.Current == '.')
                    {
                        return fractionalPart(s);
                    }
                    else if (s.Current == 'e' || s.Current == 'E')
                    {
                        return exponentPart(s);
                    }
                    else
                    {
                        return s.RememberedState();
                    }
                }
            );
    var input = "999.999e+999";
    var result = decimalLiteral(new LexState(input, 0, 0, 0, true));
    Console.WriteLine(result.Value.Substring(result.StartIndex, result.EndIndex - result.StartIndex + 1));
    Console.ReadLine();
}
Comments (1)
When trying to write some sort of parser, you should first divide your expression into rules and terminals. Then you can name the methods by the rules they check. For example, a grammar split along those lines would give you methods named Literal(), Fractional(), FractionalWithExponent() and Exponent(), each able to recognize or reject its own rule. Literal() would call Fractional() and FractionalWithExponent() and decide which one does not reject, etc.
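The rule-per-method idea can be sketched roughly as follows. This is only an illustration, not the answerer's original grammar: the class name NumberRules, the Reject constant, the Digits helper, and the convention of returning the index just past the match are all assumptions made here, and the rules are read off the regular expression in the question.

```csharp
// Hypothetical sketch: each grammar rule becomes one named method.
// A rule takes the input and a start index and returns the index just
// past its match, or Reject when the rule does not apply.
public static class NumberRules
{
    public const int Reject = -1;

    // Terminal: zero or more digits (\d*). Never rejects.
    public static int Digits(string s, int pos)
    {
        while (pos < s.Length && char.IsDigit(s[pos])) pos++;
        return pos;
    }

    // Fractional := \d* ( '.' \d* )?
    public static int Fractional(string s, int pos)
    {
        pos = Digits(s, pos);
        if (pos < s.Length && s[pos] == '.') pos = Digits(s, pos + 1);
        return pos;
    }

    // Exponent := [eE] [+-]? \d+
    public static int Exponent(string s, int pos)
    {
        if (pos >= s.Length || (s[pos] != 'e' && s[pos] != 'E')) return Reject;
        pos++;
        if (pos < s.Length && (s[pos] == '+' || s[pos] == '-')) pos++;
        int end = Digits(s, pos);
        return end > pos ? end : Reject; // at least one digit is required
    }

    // FractionalWithExponent := Fractional Exponent
    public static int FractionalWithExponent(string s, int pos)
    {
        int afterFraction = Fractional(s, pos);
        return Exponent(s, afterFraction);
    }

    // Literal := FractionalWithExponent | Fractional
    // Tries the longer alternative first and falls back to the shorter one.
    public static int Literal(string s, int pos)
    {
        int end = FractionalWithExponent(s, pos);
        return end != Reject ? end : Fractional(s, pos);
    }
}
```

Calling NumberRules.Literal("999.999e+999", 0) under these assumptions consumes the whole input, while an input such as "1e" falls back to Fractional and stops after the "1". The choice points that were anonymous delegates in the question now have names taken from the grammar, so shared symbols in a larger language can be reused by calling the same rule method from several places.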