Parsing a file using a finite state machine

Posted on 2024-12-18 09:25:19

I am implementing my own FSM to parse a file. I am new to the FSM pattern, so I am trying to learn about it.

My FSM class takes a stream of the file being parsed, along with the current state and a collection of all accepting states.

Now I am confused about a couple of things.

  1. How does the FSM move through states and keep track of what has been parsed so far?

  2. What information should a state object store? Right now each state has a pattern that it matches against the line to decide whether the FSM can move to that state.

Example:

File to parse:

Person:  bob smith
        Age: 33
        Location: new York
End person
Person:  Jane smith
        Age: 66
        Location: Chicago
End person

So I have a state for person start, age, location, and end person. Each state object has a pattern (a regex) to check whether it accepts a given line.

But I am stuck on how to construct a Person object while parsing this file with the FSM.

Comments (3)

您的好友蓝忘机已上羡 2024-12-25 09:25:19

Have a list of persons (empty initially). Have a currentPerson variable.

  • When the state is "person start", initialize the currentPerson variable to a new Person.
  • When the state is "age", set the age into the currentPerson.
  • Same for the "location" state.
  • When the state is "end person", add the currentPerson to the list of persons.

When you reach the end of the file, the list of persons contains all your persons.
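
The steps above could be sketched roughly as follows. This is only a minimal illustration, not a definitive implementation: the `PersonFsm` class, the `Person` fields, the `State` enum, and the exact regexes are all assumptions made for the example, since the question does not show its own classes. Each state carries the pattern that decides whether a line moves the FSM into it, and the "what has been parsed so far" lives in `current` and `people`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PersonFsm {
    // Hypothetical Person type; the question does not show its fields.
    static final class Person {
        String name;
        int age;
        String location;
    }

    // Each state owns the regex that decides whether a line enters it.
    enum State {
        PERSON_START("Person:\\s*(.+)"),
        AGE("Age:\\s*(\\d+)"),
        LOCATION("Location:\\s*(.+)"),
        PERSON_END("End person");

        final Pattern pattern;

        State(String regex) {
            this.pattern = Pattern.compile(regex);
        }
    }

    static List<Person> parse(List<String> lines) {
        List<Person> people = new ArrayList<>(); // finished persons
        Person current = null;                   // person being built
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            boolean matched = false;
            for (State state : State.values()) {
                Matcher m = state.pattern.matcher(line);
                if (!m.matches()) {
                    continue;
                }
                matched = true;
                if (state != State.PERSON_START && current == null) {
                    throw new IllegalArgumentException("Line outside a person: " + line);
                }
                switch (state) {
                    case PERSON_START:
                        current = new Person();
                        current.name = m.group(1);
                        break;
                    case AGE:
                        current.age = Integer.parseInt(m.group(1));
                        break;
                    case LOCATION:
                        current.location = m.group(1);
                        break;
                    case PERSON_END:
                        people.add(current);
                        current = null;
                        break;
                }
                break; // stop trying other states for this line
            }
            if (!matched) {
                throw new IllegalArgumentException("Unrecognized line: " + line);
            }
        }
        return people;
    }
}
```

Calling `PersonFsm.parse` with the lines of the sample file should give a list of two persons. Note that this sketch does not enforce any ordering between the states; it only accumulates values as lines are recognized.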

二智少女猫性小仙女 2024-12-25 09:25:19

I don't think this is the best use of an FSM.

This looks an awful lot like JSON to me. A few changes and you're there. It could easily be XML too; you wouldn't have to write a parser.

But, if you insist, your FSM will start by reading a line.

If the line contains "Person", you'll save the name value. (Recommendation: add a "Name" line after "Person".)

If the line contains "Age", you'll save the age value.

If the line contains "Location", you'll save the location value.

If the line contains "End", you'll instantiate a new Person, add it to a data structure, and read the next line.

If the line is null, you've reached the end; transition to the end state and close the file.

You don't say whether or not you allow any of the attributes out of order.
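
On the ordering question, one way to make the allowed order explicit is a transition table. The sketch below is an assumption-laden illustration, not the asker's design: the `PersonTransitions` class, the extra `START` state, and the choice to let "Age" and "Location" appear in either order are all made up for the example. It also does not check that each attribute appears exactly once.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class PersonTransitions {
    enum State { START, PERSON_START, AGE, LOCATION, PERSON_END }

    // For each state, the set of states the FSM may enter next.
    // Assumption: Age and Location may come in either order.
    static final Map<State, Set<State>> NEXT = new EnumMap<>(State.class);
    static {
        NEXT.put(State.START, EnumSet.of(State.PERSON_START));
        NEXT.put(State.PERSON_START, EnumSet.of(State.AGE, State.LOCATION));
        NEXT.put(State.AGE, EnumSet.of(State.LOCATION, State.PERSON_END));
        NEXT.put(State.LOCATION, EnumSet.of(State.AGE, State.PERSON_END));
        NEXT.put(State.PERSON_END, EnumSet.of(State.PERSON_START));
    }

    // The input is well formed only if it ends in one of these states.
    static final Set<State> ACCEPTING = EnumSet.of(State.START, State.PERSON_END);

    // Map a line to the state it would move the FSM into, or null if unknown.
    static State classify(String line) {
        String t = line.trim();
        if (t.startsWith("Person:")) return State.PERSON_START;
        if (t.startsWith("Age:")) return State.AGE;
        if (t.startsWith("Location:")) return State.LOCATION;
        if (t.equals("End person")) return State.PERSON_END;
        return null;
    }

    // Run the FSM over the lines and report whether the sequence is accepted.
    static boolean accepts(List<String> lines) {
        State state = State.START;
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue;
            }
            State next = classify(line);
            if (next == null || !NEXT.get(state).contains(next)) {
                return false; // unknown line or disallowed transition
            }
            state = next;
        }
        return ACCEPTING.contains(state);
    }
}
```

To require a fixed order instead, the table could be narrowed, for example by letting `AGE` lead only to `LOCATION` and `LOCATION` only to `PERSON_END`.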

负佳期 2024-12-25 09:25:19

The standard way of building state in an FSM is to construct a tree as you read in tokens. The state of the FSM is determined by what kind of node you're currently under. For example, you'd start by parsing the word 'Person', so you'd know to build a new 'Person' node in the tree. Then everything you read past that, until you reach the 'End person' tokens, creates nodes under that 'Person'.
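
A rough sketch of that tree-building idea, for the sample file only: the `TreeBuilder` class, the generic `Node` type, and the synthetic "root" node are assumptions made for illustration. The stack of open nodes plays the role of "what kind of node you're currently under".

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class TreeBuilder {
    // Hypothetical generic tree node: a name, a value, and child nodes.
    static final class Node {
        final String name;
        final String value;
        final List<Node> children = new ArrayList<>();

        Node(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    static Node parse(List<String> lines) {
        Node root = new Node("root", "");
        Deque<Node> open = new ArrayDeque<>(); // nodes we are currently inside
        open.push(root);
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.equals("End person")) {
                if (open.size() == 1) {
                    throw new IllegalArgumentException("Unmatched: " + line);
                }
                open.pop(); // leave the current Person node
                continue;
            }
            int colon = line.indexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("Unrecognized line: " + line);
            }
            Node node = new Node(line.substring(0, colon).trim(),
                                 line.substring(colon + 1).trim());
            open.peek().children.add(node); // attach under the current node
            if (node.name.equals("Person")) {
                open.push(node); // later lines become children of this Person
            }
        }
        if (open.size() != 1) {
            throw new IllegalArgumentException("Missing 'End person'");
        }
        return root;
    }
}
```

With the sample file, the root should end up with two "Person" children, each holding an "Age" and a "Location" child. Turning those nodes into Person objects would be a separate pass over the tree.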

As an academic exercise, this sounds good for an FSM. But for practical purposes, this does look like JSON, so I would definitely look for existing ways of parsing it.

Also, yacc (or bison) is the definitive way of building FSM parsers. It spits out C code given a formally defined grammar. I've never looked into it, but there is probably something similar out there for Java.
