有限状态机的速度 - OO 与过程式
大家好,
我正在设计一个程序,该程序将从输入中接受一系列令牌并将它们提供给我设计的有限状态机。我设计了一个面向对象风格的测试有限状态机,具有机器本身的结构和转换等,但我正在编写的应用程序是速度非常重要的应用程序。
到目前为止,使用机器、添加新状态等,已被证明很容易,而且不是很复杂。它很容易理解,离开一个月再回来看代码也不会很迷失方向。然而,我不确定当前的面向对象方法的速度权衡是什么。对象的分配、数据的存储等是否会降低使用一堆标签和 goto 语句所获得的速度?
Hey all,
I am designing a program that will accept from input a series of tokens and feed them to a finite state machine which I have designed. I have designed a test finite state machine in object-oriented style, with structs for the machine itself and transitions, etc. but the application I am writing is one where speed is very important.
So far, using the machine, adding new states and the like, has proven to be easy and not very complicated. It is easy to comprehend and leaving for a month and coming back to the code will not be very disorienting. However, I am not sure what the speed trade off is with the current OO approach. Will the allocation of the objects, storage of data, etc. take away much of the speed that would be had by using a bunch of labels and goto
statements?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
不要从 OO 比函数式或过程式编程慢的角度考虑,而是从操作的角度考虑。调用函数、分支、获取、存储等......都需要一些时间,并且要了解每个解决方案的性能,您需要计算每个解决方案需要执行的工作量。
执行此操作的最佳方法是使用面向对象的测试解决方案并查看它是否足够快。如果没有,那就进行分析,找出哪些分支机构或商店花费最多,看看是否可以避免或简化它们。慢慢地重新构建您的解决方案,直到它满足您所追求的性能目标。在某些情况下,您可能必须采用更具功能性或程序性的风格。
最后,如果您编写了一个由数百个堆栈变量组成的函数并转到,那么您就错了。代码必须始终可读且可维护。
额外思考:您能否再花 5,000 美元购买更好的硬件,以避免 50,000 美元的开发成本优化?
Rather than thinking about it in terms of OO being slower than functional or procedural programming think about it in terms of operations. Calling a function, branching, fetching, storing etc... all take some time and to get an idea of the performance of each solution you'd need to tally up how much of each of these you'll need to do.
The best way to do this is to use your OO test solution and see if it's fast enough. If not then profile it, work out which branches or stores are costing you the most and see if you can avoid or streamline them. Slowly re-architect your solution until it meets the performance goals you're after. It may be that in some cases you'll have to adopt a more functional or procedural style.
Lastly if you do code up a single function that consists of hundreds of stack variables and goto you've done it wrong. Code must always be readable and maintainable.
Extra thought: Can you just spend another $5,000 on better hardware to avoid $50,000 in development costs optimising?
我更喜欢
结构化组织良好、可维护、可理解(即使离开一个月后,这也是相当好的质量)的代码,而不是“一堆标签和goto
语句”。话虽这么说,我还会在代码上运行分析器来检测瓶颈在进行速度分析时。I would prefer well
structuredorganized, maintainable, comprehendable (even after leaving for a month, thats quite a quality) code instead of a "bunch of labels andgoto
statements." Being said that, I would also run a profiler on the code to detect the bottlenecks when doing a speed analysis.简而言之,处理器在表查找上进行表查找的速度比分支更快。只要表易于处理并且足够小以适合缓存,您所描述的 OO 风格就会更快。 (不过,我肯定不会说这两种范式与这两种实现相关联。)
技术性更强一些:处理器具有 32-64 K 的 L1 缓存和几到几十兆字节 L2-L3。分支缓存最多可达几千字节。
Simply put, the processor is faster at doing table lookups upon table lookups than branches. What you describe as the OO style will be faster, as long as the table is tractable and small enough to be fit into cache. (I wouldn't certainly say that either paradigm is associated with either implementation, though.)
To be very moderately more technical: The processor has 32-64 K of L1 cache and a few to a couple dozen megabytes of L2-L3. The branch caches amount to a few kilobytes at most.
状态机多久改变一次?如果它在一段时间内保持不变,我所做的就是用我喜欢的任何语言将其翻译成硬编码例程。
状态机的转移图从何而来?它来自正则表达式吗?您知道您可以将其直接转换为结构化代码,而无需首先构建弧集。事实上,一开始编写该代码可能会更容易。
然后我要么只制作应用程序的这一部分,要么动态编译并将其链接到 dll 中并动态加载它。后者只需几秒钟。
有限状态机非常简单,基本上它们的运行时间只是加载 I/O 缓冲区所需时间的一小部分。他们不应该做任何不必要的内存分配、数据结构构建、任何面向对象的胡言乱语。
请记住,表示为一组状态和弧元组的有限状态机是一个理论模型。它的存在,就像图灵机一样,是为了证明它的定理。就像图灵机一样,在代码上不一定是好的实现技术。
只是为了向您展示我的意思,请考虑十进制整数的正则表达式:
其中“d”表示数字。作为有限状态机(元组),它会是:
作为代码,它会是:
看到这段代码与正则表达式的相似之处了吗?
不要将
c = getc()
视为“获取下一个字符 c”。将其视为“接受当前字符 c”。
我希望您能看到代码实际上没有比这更快的了。
如果您真的从 arc-set 开始,您可以生成以下代码:
这是意大利面条代码,但不比 arc-set 多,而且它将与结构化代码一样快。
(事实上,这或多或少就是编译器将结构化代码转换成的内容。)
How often does the state machine change? If it stays constant for a while, what I do is translate it into a hard-coded routine in whatever language I prefer.
Where does the transition diagram of the state machine come from? Does it come from a regular expression? You know you can translate that directly into structured code without first building an arc-set. In fact, it might just be easier to write that code to begin with.
Then I either just make that part of the app, or I dynamically compile and link it into a dll and dynamically load it. The latter takes just a couple seconds.
Finite state machines are so simple that they should basically run in a small fraction of the time it takes to load the I/O buffer. They shouldn't be doing any unnecessary memory allocation, data structure building, any of that OO hoo-haw.
Remember, a finite state machine represented as a set of states and arc tuples is a theoretical model. It exists, like Turing machines, in order to prove theorems about it. Just like a Turing machine, it is not necessarily a good implementation technique in code.
Just to show you what I mean, consider a regular expression for decimal integers:
where 'd' means digit. As a finite-state machine (tuples) it would be:
as code it would be:
See the similarity of this code with the regular expression?
Don't think of
c = getc()
as "get the next character c".Think of it as "accept the current character c".
I hope you can see the code can't really get much faster than this.
If you're really starting from an arc-set, you could generate this:
which is spaghetti code, but not any more than the arc-set is, and it will be just as fast as the structured code.
(In fact, that's more or less what the compiler converts your structured code into.)
我是一名应用程序性能工程师,以下是我对有限状态机性能的看法。请注意,这些只是意见,不是我测量的事实,但我认为这对某些人有用:
这些是一般建议。现在,根据您的具体情况,您还有一些建议:
I work as an application performance engineer, and here is my opinion about performance of finite state machines. Please note that these are opinions, not facts that I measured, but I think it would be useful for some people:
These are general recommendations. Now depending on your specific case you have a few more recommendations: