实现简单 FSM 的 Pythonic 方法是什么？

发布于 2024-09-02 22:26:25 字数 1030 浏览 7 评论 0原文

昨天我必须解析一个非常简单的二进制数据文件 - 规则是，在一行中查找两个字节都是 0xAA，然后下一个字节将是一个长度字节，然后跳过 9 个字节并输出给定数量的数据那里。重复到文件末尾。

我的解决方案确实有效，并且很快就组合在一起了（尽管我本质上是一名 C 程序员，但我仍然认为用 Python 编写这个比用 C 编写要快） - 但是，很明显根本不是Pythonic，它读起来就像一个C程序（而且不是一个很好的程序！）

有什么更好/更Pythonic的方法来解决这个问题？像这样的简单 FSM 在 Python 中仍然是正确的选择吗？

我的解决方案：

#! /usr/bin/python

import sys

f = open(sys.argv[1], "rb")

state = 0

if f:
    for byte in f.read():
        a = ord(byte)       
        if state == 0:
            if a == 0xAA:
                state = 1
        elif state == 1:
            if a  == 0xAA:
                state = 2
            else: 
                state = 0
        elif state == 2:
            count = a;
            skip = 9
            state = 3
        elif state == 3:
            skip = skip -1
            if skip == 0:
                state = 4
        elif state == 4:
             print "%02x" %a
             count = count -1 
             if count == 0:
                 state = 0
                 print "\r\n"

原文

Yesterday I had to parse a very simple binary data file - the rule is, look for two bytes in a row that are both 0xAA, then the next byte will be a length byte, then skip 9 bytes and output the given amount of data from there. Repeat to the end of the file.

My solution did work, and was very quick to put together (even though I am a C programmer at heart, I still think it was quicker for me to write this in Python than it would have been in C) - BUT, it is clearly not at all Pythonic and it reads like a C program (and not a very good one at that!)

What would be a better / more Pythonic approach to this? Is a simple FSM like this even still the right choice in Python?

My solution:

#! /usr/bin/python

import sys

f = open(sys.argv[1], "rb")

state = 0

if f:
    for byte in f.read():
        a = ord(byte)       
        if state == 0:
            if a == 0xAA:
                state = 1
        elif state == 1:
            if a  == 0xAA:
                state = 2
            else: 
                state = 0
        elif state == 2:
            count = a;
            skip = 9
            state = 3
        elif state == 3:
            skip = skip -1
            if skip == 0:
                state = 4
        elif state == 4:
             print "%02x" %a
             count = count -1 
             if count == 0:
                 state = 0
                 print "\r\n"

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

紙鸢 2024-09-09 22:26:25

我见过的在 Python 中实现 FSM 的最酷的方法是通过生成器和协程。有关示例，请参阅此迷人的 Python 帖子。 Eli Bendersky 还拥有出色的受试者的治疗。

如果协程不是您熟悉的领域，David Beazley 的关于协程和并发的好奇课程是一门明星介绍。

回复收藏 0 原文

扎心 2024-09-09 22:26:25

您可以为状态指定常量名称，而不是使用 0、1、2 等，以提高可读性。

您可以使用字典来映射 (current_state, input) -> (next_state)，但这并不能真正让您在转换期间进行任何额外的处理。除非您也包含一些“转换函数”来进行额外的处理。

或者您可以采用非 FSM 方法。我认为只要 0xAA 0xAA 仅在指示“开始”时出现（不出现在数据中），这就会起作用。

with open(sys.argv[1], 'rb') as f:
    contents = f.read()
    for chunk in contents.split('\xaa\xaa')[1:]:
        length = ord(chunk[0])
        data = chunk[10:10+length]
        print data

如果它确实出现在数据中，您可以使用 string.find('\xaa\xaa', start) 扫描字符串，并将 start 参数设置为 begin查看最后一个数据块的结束位置。重复直到返回-1。

You could give your states constant names instead of using 0, 1, 2, etc. for improved readability.

You could use a dictionary to map (current_state, input) -> (next_state), but that doesn't really let you do any additional processing during the transitions. Unless you include some "transition function" too to do extra processing.

Or you could do a non-FSM approach. I think this will work as long as 0xAA 0xAA only appears when it indicates a "start" (doesn't appear in data).

with open(sys.argv[1], 'rb') as f:
    contents = f.read()
    for chunk in contents.split('\xaa\xaa')[1:]:
        length = ord(chunk[0])
        data = chunk[10:10+length]
        print data

If it does appear in data, you can instead use string.find('\xaa\xaa', start) to scan through the string, setting the start argument to begin looking where the last data block ended. Repeat until it returns -1.

回复收藏 0 原文

九歌凝 2024-09-09 22:26:25

我有点担心告诉任何人什么是 Pythonic，但这里是。首先，请记住，在 python 中，函数只是对象。可以使用以 (input, current_state) 作为键、以元组 (next_state, action) 作为值的字典来定义转换。动作只是一个函数，它执行从当前状态转换到下一个状态所需的任何操作。

http://code.activestate 有一个很好看的示例.com/recipes/146262-finite-state-machine-fsm。我没有使用过它，但快速阅读一下，它似乎涵盖了所有内容。

几个月前，有人在这里提出/回答了类似的问题：Python 状态机设计。您可能会发现查看这些回复也很有用。

回复收藏 0 原文

千柳 2024-09-09 22:26:25

我认为您的解决方案看起来不错，除了您应该将 count = count - 1 替换为 count -= 1 。

这是那些花哨的代码炫耀会提出用字典将状态映射到可调用的方法，并带有一个小的驱动程序函数的时代之一，但它并不是更好，只是更奇特，并且使用了更晦涩的语言功能。

回复收藏 0 原文

傾旎 2024-09-09 22:26:25

我建议查看 David Mertz 撰写的 Python 文本处理第 4 章。他用Python实现了一个非常优雅的状态机类。

回复收藏 0 原文

百变从容 2024-09-09 22:26:25

我认为最Pythonic的方式就像FogleBird建议的那样，但是从（当前状态，输入）映射到一个可以处理处理和转换的函数。

回复收藏 0 原文

温柔戏命师 2024-09-09 22:26:25

您可以使用正则表达式。像这样的代码将找到第一个数据块。那么就只是从上一次匹配之后开始下一次搜索的情况。

find_header = re.compile('\xaa\xaa(.).{9}', re.DOTALL)
m = find_header.search(input_text)
if m:
    length = chr(find_header.group(1))
    data = input_text[m.end():m.end() + length]

You can use regexps. Something like this code will find the first block of data. Then it's just a case of starting the next search from after the previous match.

find_header = re.compile('\xaa\xaa(.).{9}', re.DOTALL)
m = find_header.search(input_text)
if m:
    length = chr(find_header.group(1))
    data = input_text[m.end():m.end() + length]

回复收藏 0 原文

~没有更多了~