What is a good profiling tool when the source code is not available?
I have a big problem. My boss told me that he wants two "magic black boxes":
1- something that takes a microprocessor as input and returns, as output, its MIPS and/or MFLOPS.
2- something that takes C code as input and returns, as output, something that characterizes the code in terms of performance (for example, the MIPS that a uP must deliver to execute the code within a given time).
For the first "black box", I think an EEMBC or SPEC benchmark could work: different uPs, same benchmark, which returns the MIPS/MFLOPS of each uP. So the first problem is OK (I hope).
But the second black box is my nightmare. The only thing I have found is to use a profiling tool, but I need a particular kind of profiling tool.
Does anybody know a profiling tool that takes simple C code as input and gives me, as output, the performance characteristics of my C code (or the number of times each assembly instruction is executed)?
The real problem is that we must choose the right uP for a certain piece of C code, but we want a uP tailored to that code. So if we know the MIPS (and the architecture, memory structure, ...) of each uP and what our code needs, we can match the two.
Thanks to everyone
Comments (3)
I have to agree with Adam, though I would be a little more gracious about it. Compiler optimizations only matter in hotspot code, i.e. tight loops that a) don't call functions, and b) take a large percentage of time.
On a positive note, here's what I would suggest:
You could use a profiler for this. The simple method I prefer is to just run the program under a debugger, manually halt it some number of times (say 10), and write down the call stack each time. Suppose something in the code is taking a good percentage of the time, like 50%. If so, you will see it doing that thing on roughly that percentage of samples, so you won't have to guess what it is.
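As a rough illustration of tallying those halts, here is a minimal Python sketch. The call stacks, the function names in them, and the helper `stack_sample_shares` are all made up for the example; in practice you would type in whatever the debugger showed you at each halt.

```python
from collections import Counter

def stack_sample_shares(samples):
    """Return {function: fraction of samples whose call stack contains it}."""
    counts = Counter()
    for stack in samples:
        # Count a function once per sample, even if it appears several times.
        counts.update(set(stack))
    total = len(samples)
    return {name: hits / total for name, hits in counts.items()}

# Hypothetical call stacks written down at 10 manual halts (outermost first).
samples = [
    ["main", "process_frame", "fir_filter"],
    ["main", "process_frame", "fir_filter"],
    ["main", "process_frame", "fir_filter"],
    ["main", "process_frame", "fir_filter"],
    ["main", "process_frame", "fir_filter"],
    ["main", "log_result", "format_number"],
    ["main", "log_result", "format_number"],
    ["main", "log_result", "format_number"],
    ["main", "process_frame", "alloc_buffer"],
    ["main", "read_input"],
]

shares = stack_sample_shares(samples)
for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name:15s} {share:.0%}")
```

With so few samples the percentages are coarse, but anything that shows up on several of the 10 stacks is worth a closer look.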
It is important not to guess. If you say "I think this needs a DSP chip", or "I think it needs a multi-core chip", that is a guess. The guess might be right, but probably not. It is probably the case that what takes the most time is something you never would guess, like memory management or I/O formatting. Performance issues are very good at hiding from you.
No. If someone had made a tool that could analyse (non-trivial) source code and tell you its performance characteristics, it would be commonplace, i.e. everyone would be using it.
Until source code is compiled for a particular target architecture, you will not be able to determine its overall performance. For instance, a parallelising compiler targeting n processors might conceivably be able to change an O(n^2) algorithm to one of O(n).
You won't find a tool to do what you want.
Your only option is to cross-compile the code and profile it on an emulator for the architecture you're targeting. The problem with profiling high-level code is that the compiler makes a stack of non-trivial optimizations, and you'd need to know how the particular compiler did that.
It sounds dumb, but why do you want to fit your code to a uP and a uP to your code? If you're writing signal processing code, buy a DSP. If you're building a SCADA box, then look into Atmel or ARM stuff. Are you building a general-purpose appliance with a user interface? Look into PPC or x86-compatible stuff.
Simply put, choose a bloody architecture that's suitable and provides the features you need. Optimization before choosing the processor is premature (very roughly paraphrasing Knuth).
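To make that concrete, here is a small Python sketch of what you could do with the numbers once an emulator run has given you per-class instruction counts. Everything in it is an assumption for illustration: the instruction classes, the counts, the cycles-per-instruction table, and the 10 ms deadline are invented, and the model ignores pipeline stalls, cache misses, and memory wait states.

```python
def required_mips(instr_counts, deadline_s):
    """Millions of instructions per second needed to finish within deadline_s."""
    total_instructions = sum(instr_counts.values())
    return total_instructions / deadline_s / 1e6

def required_clock_mhz(instr_counts, cycles_per_instr, deadline_s):
    """Clock rate in MHz needed, given a fixed cycle cost per instruction class."""
    total_cycles = sum(
        count * cycles_per_instr[kind] for kind, count in instr_counts.items()
    )
    return total_cycles / deadline_s / 1e6

# Hypothetical dynamic instruction counts for one run of the routine.
instr_counts = {
    "alu": 600_000,
    "load_store": 300_000,
    "multiply": 80_000,
    "branch": 20_000,
}

# Hypothetical cycle costs for one candidate core.
cycles_per_instr = {"alu": 1, "load_store": 2, "multiply": 3, "branch": 2}

deadline_s = 0.010  # assumed: the routine must finish within 10 ms

print(f"required MIPS:  {required_mips(instr_counts, deadline_s):.1f}")
print(f"required clock: {required_clock_mhz(instr_counts, cycles_per_instr, deadline_s):.1f} MHz")
```

The counts depend on the compiler and target, so they would have to be collected again for each architecture you are considering.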
Fix the architecture at something roughly appropriate, work out roughly the processing requirements (you can scratch up an estimate by hand, which will always be too high when you work from the C code), and buy a uP to match.
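A back-of-the-envelope version of that estimate might look like the Python sketch below. The operation count per sample, the sample rate, the headroom factor, and the part names with their throughput figures are all hypothetical placeholders; substitute your own hand count and the figures you get from whatever benchmark you run on each candidate.

```python
def estimate_mips(ops_per_sample, samples_per_second, headroom=2.0):
    """Rough MIPS requirement: operations per second times a safety factor."""
    return ops_per_sample * samples_per_second * headroom / 1e6

def candidates_meeting(required, rated_mips):
    """Names of the parts whose rated throughput covers the requirement."""
    return sorted(name for name, mips in rated_mips.items() if mips >= required)

# Hypothetical hand estimate: about 500 operations per input sample at 48 kHz.
required = estimate_mips(ops_per_sample=500, samples_per_second=48_000)

# Hypothetical throughput figures for three candidate parts.
rated_mips = {"part_a": 20, "part_b": 60, "part_c": 200}

print(f"estimated requirement: {required:.0f} MIPS")
print("candidates:", candidates_meeting(required, rated_mips))
```

The headroom factor is a guess on top of an estimate, so treat the result as a way to rule parts out, not as a guarantee that a part will be fast enough.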