Parsing a file with a finite state machine
I am implementing my own FSM to parse a file. I am new to the FSM pattern, so I am trying to learn about it.
My FSM class takes a stream of the file being parsed, along with the current state and a collection of all accepting states.
Now I am confused about a couple of things.
How does the FSM move through states and keep track of what has been parsed so far?
What information should a state object store? Right now each state has a pattern that it matches against the current line to decide whether the FSM can move to that state.
Example:
File to parse:
Person: bob smith
Age: 33
Location: new York
End person
Person: Jane smith
Age: 66
Location: Chicago
End person
So I have a state for person start, age, location and end person. Each state object has a pattern (a regex) to check whether a given line is accepted by that state.
But I am stuck on how to construct a Person object while parsing this file with the FSM.
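Have a list of persons (empty initially). Have a currentPerson variable. When a person record starts, set the currentPerson variable to a new Person. Store the values you parse in currentPerson, and add it to the list when the record ends. When you reach the end of the file, the list of persons contains all your persons.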
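A minimal sketch of that idea in Python, shown only as an illustration. The names `State`, `TRANSITIONS` and `parse_people` are made up for this example, and it assumes the fields always appear in the order Person, Age, Location, End person. Each state lists the regex patterns it accepts next, and the action taken on each transition fills in the current person:

```python
import re
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


@dataclass
class Person:
    name: str
    age: Optional[int] = None
    location: Optional[str] = None


class State(Enum):
    OUTSIDE = auto()   # between records
    PERSON = auto()    # just read "Person: ..."
    AGE = auto()       # just read "Age: ..."
    LOCATION = auto()  # just read "Location: ..."


PERSON_RE = re.compile(r"Person:\s*(.+)")
AGE_RE = re.compile(r"Age:\s*(\d+)")
LOCATION_RE = re.compile(r"Location:\s*(.+)")
END_RE = re.compile(r"End person")

# For each state: the (pattern, next_state) pairs that are legal from it.
TRANSITIONS = {
    State.OUTSIDE: [(PERSON_RE, State.PERSON)],
    State.PERSON: [(AGE_RE, State.AGE)],
    State.AGE: [(LOCATION_RE, State.LOCATION)],
    State.LOCATION: [(END_RE, State.OUTSIDE)],
}


def parse_people(lines):
    people = []                # finished records
    current = None             # the record being built
    state = State.OUTSIDE
    for raw in lines:
        line = raw.strip()
        if not line:
            continue           # ignore blank lines
        for pattern, next_state in TRANSITIONS[state]:
            match = pattern.fullmatch(line)
            if match:
                break
        else:
            raise ValueError(f"unexpected line in state {state.name}: {line!r}")
        # The action depends on the state we are entering.
        if next_state is State.PERSON:
            current = Person(name=match.group(1))
        elif next_state is State.AGE:
            current.age = int(match.group(1))
        elif next_state is State.LOCATION:
            current.location = match.group(1)
        else:                  # back to OUTSIDE: the record is complete
            people.append(current)
            current = None
        state = next_state
    if state is not State.OUTSIDE:
        raise ValueError("input ended in the middle of a person record")
    return people
```

Here the state objects stay small (a pattern plus the state it leads to), while the parsed data lives outside them in `current` and `people`.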
I don't think this is the best use of an FSM.
This looks an awful lot like JSON to me. A few changes and you're there. It could easily be XML too; you wouldn't have to write a parser.
But, if you insist, your FSM will start by reading a line.
If the line contains "Person", you'll save the name value. (Recommendation: add a "Name" line after "Person".)
If the line contains "Age", you'll save the age value.
If the line contains "Location", you'll save the location value.
If the line contains "End", you'll instantiate a new Person, add it to a data structure, and read the next line.
If the line is null, you've reached the end of the file; transition to the end state and close the file.
You don't say whether or not you allow any of the attributes out of order.
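The steps above could be sketched roughly as follows in Python. This is an illustration under stated assumptions, not a definitive implementation: `read_people` and `KNOWN_KEYS` are hypothetical names, attributes are allowed in any order within a record, and every record is assumed to carry all three fields (a missing one raises `KeyError`). Values are saved as they are read, and the Person is only instantiated when the "End person" line arrives:

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: int
    location: str


KNOWN_KEYS = {"person", "age", "location"}


def read_people(lines):
    people = []
    fields = {}                      # values saved for the record being read
    for raw in lines:
        line = raw.strip()
        if not line:
            continue                 # skip blank lines
        if line.lower() == "end person":
            # All values are in hand: build the object and start afresh.
            people.append(
                Person(fields["person"], int(fields["age"]), fields["location"])
            )
            fields = {}
            continue
        key, sep, value = line.partition(":")
        key = key.strip().lower()
        if not sep or key not in KNOWN_KEYS:
            raise ValueError(f"unrecognised line: {line!r}")
        fields[key] = value.strip()  # save the value until the record ends
    return people
```

Because the values are collected in a dictionary, this version does not care whether Age comes before Location; enforcing a strict order would need one state per field instead.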
The standard way of building state in an FSM is to construct a tree as you read in tokens. The state of the FSM is determined by what kind of node you're currently under. For example, you'd start by parsing the word 'Person' and so you'd know to build a new 'Person' node in the tree. Then, everything you read past that, until you reach the 'End Person' tokens, creates nodes under that 'Person'.
As an academic exercise, this sounds good for an FSM. But for practical purposes, this does look like JSON, so I would definitely look for existing ways of parsing it.
Also, yacc (or bison) is the classic way of generating table-driven parsers (strictly speaking, LALR parsers, which are pushdown automata rather than plain FSMs). It spits out C code given a formally defined grammar. I've never looked into it, but there is probably something similar out there for Java.
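The tree-building approach described in this answer might look something like the Python sketch below. The `Node` class and `build_tree` function are invented for illustration, and the sketch assumes only one level of nesting (attributes directly under a Person). The "state" is simply the node the parser is currently under:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    kind: str
    value: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def build_tree(lines):
    root = Node("root")
    current = root                   # the node we are under; this is the state
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if current is root:
            # At top level the only legal line opens a new Person node.
            key, sep, value = line.partition(":")
            if not sep or key.strip() != "Person":
                raise ValueError(f"expected 'Person: ...', got {line!r}")
            person = Node("Person", value.strip())
            root.children.append(person)
            current = person
        elif line == "End person":
            current = root           # close the Person node
        else:
            # Anything else becomes a child of the open Person node.
            key, sep, value = line.partition(":")
            if not sep:
                raise ValueError(f"expected 'Key: value', got {line!r}")
            current.children.append(Node(key.strip(), value.strip()))
    if current is not root:
        raise ValueError("input ended inside a Person")
    return root
```

Turning each Person node into a Person object is then a separate walk over `root.children`.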