Equivalent number of instructions
I have a question (just like me)...
Suppose I have chosen an algorithm written in C, C++, or whatever language you like. With the compiler fixed, I can determine the number of instructions, but these instructions are not all alike: x ADD, y MUL, z MOV, f FADD, t FMUL (F stands for floating point)... Is there a methodology, an equation, or something else that lets me express the instruction count as a number of "equivalent instructions", so that I can compare different algorithms? Does anybody here use this kind of metric? Is it rubbish?
Thanks
Marco
Part 2:
I know that in general it depends on the microprocessor (uP) and the architecture. My problem is to determine the execution time of different algorithms implemented on different soft-core architectures. On the y-axis I have to put time, on the x-axis the number of instructions, and the points of the graph are parametrized by the type of architecture (excuse my English). But on the x-axis I think it would be better to use something like the number of "equivalent instructions"...
Is it a rubbish idea?
Comments (3)
You don't quite understand the problem. Execution speed depends not only on the instructions themselves but also on the dependencies between them. Microprocessors can execute several instructions at the same time, provided these instructions don't depend on each other. The ability to execute several instructions at a time differs from one processor family to another. That's why this task is really hardware-specific; it can't be solved once and for all.
All you can do is graph an execution timeline of instructions and processor cycles. Processor cycles could be the y-axis, instructions the x-axis. You'll have problems predicting cache hits and misses, and the execution time of many instructions will vary greatly depending on whether they hit or miss the cache. Be ready to spend a lot of time with processor manuals.
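To make the dependency point concrete, here is a minimal sketch, not a model of any real core. The latencies, the instruction names, and the "unlimited issue width" assumption are all invented for illustration. Both programs have the same instruction mix, yet the idealised cycle count differs because of dependencies alone.

```python
def serial_cycles(instrs):
    """Cycles if every instruction runs strictly one after another."""
    return sum(latency for _, latency, _ in instrs)


def critical_path_cycles(instrs):
    """Cycles on an idealised core with unlimited issue width:
    each instruction starts as soon as all of its inputs are ready."""
    finish = {}
    for name, latency, deps in instrs:
        start = max((finish[d] for d in deps), default=0)
        finish[name] = start + latency
    return max(finish.values())


# (name, assumed latency in cycles, names of instructions it depends on)
chain = [("a", 1, []), ("b", 3, ["a"]), ("c", 1, ["b"]), ("d", 3, ["c"])]
independent = [("a", 1, []), ("b", 3, []), ("c", 1, []), ("d", 3, [])]

for label, prog in (("dependent chain", chain), ("independent", independent)):
    print(label, serial_cycles(prog), critical_path_cycles(prog))
# prints:
# dependent chain 8 8
# independent 8 3
```

A real processor sits somewhere between the two numbers, and where exactly depends on its issue width and pipeline, which is why a single architecture-independent weight per instruction does not work.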
It would have to take into account pipelining and all kinds of other intricacies, many of which will vary by processor. In other words, I can't see it being particularly useful even if it's feasible.
There are also things the algorithm wouldn't be able to tell you, such as how many cache misses there will be. These could be much more important than the raw instruction count.
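A rough sketch of why misses can outweigh the instruction count, assuming a simple additive model (base cycles plus a fixed penalty per miss). The function name and every number below are hypothetical, picked only to show the effect.

```python
def estimated_cycles(instr_count, base_cpi, misses, miss_penalty):
    """Base execution cycles plus memory stall cycles (additive model)."""
    return instr_count * base_cpi + misses * miss_penalty


# Two hypothetical algorithms on the same core, assumed 100-cycle miss penalty.
# B executes 50% more instructions than A but misses the cache far less often.
a = estimated_cycles(instr_count=1000, base_cpi=1, misses=30, miss_penalty=100)
b = estimated_cycles(instr_count=1500, base_cpi=1, misses=3, miss_penalty=100)
print(a, b)  # prints: 4000 1800
```

Under these made-up numbers the algorithm with more instructions finishes in less than half the cycles, and the miss counts themselves depend on cache size and access pattern, not on the instruction listing.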
It's not rubbish, it's just vague. To go from algorithm to source code to object code to core... there are so many details to nail down, each of which can have significant performance implications.
Have a look at Hennessy & Patterson's "Computer Architecture: A Quantitative Approach".
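For reference, the closest thing to an "equivalent instruction" count in that textbook tradition is the processor performance equation, written here with symbols of my own choosing: $IC_i$ is the number of executed instructions of class $i$ (ADD, MUL, FADD, ...), $CPI_i$ is the average cycles per instruction for that class on one particular core, and $T_{clk}$ is the clock period.

```latex
\[
  T_{\text{exec}} = IC \times CPI \times T_{clk},
  \qquad
  CPI = \frac{\sum_i IC_i \times CPI_i}{IC},
  \qquad
  IC = \sum_i IC_i
\]
```

The weighted sum $\sum_i IC_i \times CPI_i$ is simply the total cycle count. The weights $CPI_i$ are measured averages that already fold in stalls and overlap for a given core and workload, so they have to be re-derived for every soft-core architecture being compared.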