How can I speed up a large case statement? (C++)
I am running through a file and dealing with 30 or so different fragment types. Every time, I read in a fragment and compare its type (in hex) with those of the fragments I know. Is this fast, or is there a quicker way to do it?
Here is a sample of the code I am using:
// Iterate through the fragments and address them individually
for (int i = 0; i < header.fragmentCount; i++)
{
    // Read in memory for the current fragment
    memcpy(&frag, (wld + file_pos), sizeof(struct_wld_basic_frag));

    // Deal with each frag type
    switch (frag.id)
    {
    // Texture Bitmap Name(s)
    case 0x03:
        errorLog.OutputSuccess("[%i] 0x03 - Texture Bitmap Name", i);
        break;
    // Texture Bitmap Info
    case 0x04:
        errorLog.OutputSuccess("[%i] 0x04 - Texture Bitmap Info", i);
        break;
    // Texture Bitmap Reference Info
    case 0x05:
        errorLog.OutputSuccess("[%i] 0x05 - Texture Bitmap Reference Info", i);
        break;
    // Two-dimensional Object
    case 0x06:
        errorLog.OutputSuccess("[%i] 0x06 - Two-dimensional Object", i);
        break;
    // ... the remaining cases follow the same pattern ...
    }
}
It runs through about 30 of these and when there are thousands of fragments, it can chug a bit. How would one recommend I speed this process up?
Thank you!
Comments (8)
If all of these cases are the same except for the format string, consider using an array of format strings indexed by the fragment id, with no case statement at all.
This should be less expensive than a switch, since a table lookup does not risk a branch misprediction penalty. That said, the cost of this switch is probably less than the cost of actually formatting the output string, so your optimization efforts may be a bit misplaced.
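A minimal sketch of such a format-string table. The names `kFragFormats`, `FormatForFragment` and `DescribeFragment` are hypothetical, only ids 0x03 to 0x06 come from the question, and `std::snprintf` stands in for `errorLog.OutputSuccess`, whose real signature is not shown:

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical table: one printf-style format string per fragment id.
// Only ids 0x03..0x06 come from the question; nullptr marks ids not shown there.
static const char* const kFragFormats[] = {
    nullptr,                                      // 0x00
    nullptr,                                      // 0x01
    nullptr,                                      // 0x02
    "[%i] 0x03 - Texture Bitmap Name",            // 0x03
    "[%i] 0x04 - Texture Bitmap Info",            // 0x04
    "[%i] 0x05 - Texture Bitmap Reference Info",  // 0x05
    "[%i] 0x06 - Two-dimensional Object",         // 0x06
};
static const std::size_t kFragFormatCount =
    sizeof(kFragFormats) / sizeof(kFragFormats[0]);

// Returns the format string for `id`, or nullptr if the id is unknown.
const char* FormatForFragment(unsigned id) {
    return id < kFragFormatCount ? kFragFormats[id] : nullptr;
}

// Stand-in for errorLog.OutputSuccess: formats the message into a std::string.
// Unknown ids produce an empty string.
std::string DescribeFragment(unsigned id, int index) {
    const char* fmt = FormatForFragment(id);
    if (fmt == nullptr) return std::string();
    char buf[128];
    std::snprintf(buf, sizeof(buf), fmt, index);
    return std::string(buf);
}
```

Inside the loop, the whole switch would then reduce to one lookup plus one logging call, with the bounds check guarding against ids that are missing from the table.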
The case statement should be very fast, because when your code is optimized (and sometimes even when it isn't) the compiler typically implements a switch over closely spaced values as a jump table. Go into the debugger, put a breakpoint on the switch, and check the disassembly to confirm that this is what happens here.
I think the memcpy is probably causing a lot of the overhead. Maybe switch directly on the data at (wld + file_pos) instead of copying it first.
I'm skeptical that the 30 case statements are the issue. That's just not very much code compared to whatever your memcpy and errorLog methods are doing. First verify that your speed is limited by CPU time and not by disk access. If you really are CPU bound, examine the code in a profiler.
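As a coarse first check before reaching for a profiler, one could compare wall-clock time with CPU time for the whole loop. This is only a sketch: `Timing` and `MeasureOnce` are hypothetical names, and the comparison is meaningful only where `std::clock` reports processor time (some runtimes, notably Microsoft's, report elapsed time instead).

```cpp
#include <chrono>
#include <ctime>

// Result of one measurement.
struct Timing {
    double wall_ms;  // elapsed real time
    double cpu_ms;   // processor time charged to this process
};

// Runs `work` once and reports both wall-clock and CPU time.
template <typename Work>
Timing MeasureOnce(Work&& work) {
    const auto wall_start = std::chrono::steady_clock::now();
    const std::clock_t cpu_start = std::clock();
    work();
    const std::clock_t cpu_stop = std::clock();
    const auto wall_stop = std::chrono::steady_clock::now();

    Timing t;
    t.wall_ms = std::chrono::duration<double, std::milli>(
                    wall_stop - wall_start).count();
    t.cpu_ms = 1000.0 * static_cast<double>(cpu_stop - cpu_start) /
               CLOCKS_PER_SEC;
    return t;
}
```

If `wall_ms` is much larger than `cpu_ms` when the fragment loop is wrapped in `MeasureOnce`, the process is mostly waiting (for example on disk or on log output) and the switch is unlikely to be the bottleneck.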
If your fragment identifiers aren't too sparse, you can create an array of fragment type names and use it as a lookup table.
If your log statements are always strings of the form "[%i] 0xdd - message..." and frag.id is always an integer between 0 and 30, you could instead declare an array of message strings indexed by frag.id, then replace the switch statement with a single log call that looks the message up in that array.
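A sketch of that approach, assuming the 0 to 30 id range stated above. `kFragNames` and `FragmentLogLine` are hypothetical names, the messages for 0x03 to 0x06 are taken from the question, and every other entry is a placeholder:

```cpp
#include <cstdio>
#include <string>

// Hypothetical message table indexed by frag.id (0..30, per the assumption above).
constexpr unsigned kMaxFragId = 30;
static const char* const kFragNames[kMaxFragId + 1] = {
    "Unknown", "Unknown", "Unknown",  // 0x00..0x02: not shown in the question
    "Texture Bitmap Name",            // 0x03
    "Texture Bitmap Info",            // 0x04
    "Texture Bitmap Reference Info",  // 0x05
    "Two-dimensional Object",         // 0x06
    // Entries that are not listed are initialized to nullptr.
};

// Builds the "[index] 0xdd - message" line that each case used to spell out.
std::string FragmentLogLine(int index, unsigned id) {
    const char* name = (id <= kMaxFragId && kFragNames[id] != nullptr)
                           ? kFragNames[id]
                           : "Unknown";
    char buf[128];
    std::snprintf(buf, sizeof(buf), "[%i] 0x%02x - %s", index, id, name);
    return std::string(buf);
}
```

Because the id is formatted with `%02x`, the table only has to hold the descriptive part of each message, and an out-of-range id falls back to "Unknown" instead of indexing past the array.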
If the possible fragment type values are all contiguous, and you don't want to do anything much more complex than printing a string upon matching, you can just index into an array.
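When the contiguous run does not start at zero, the index can be offset by the first id so the table holds no unused slots. A sketch under that assumption; `kFirstFragId`, `kContiguousNames` and `NameForFragment` are hypothetical, and the table stops at 0x06 because the question shows no further ids:

```cpp
#include <cstddef>

// Assumption: the known ids form one contiguous run starting at kFirstFragId.
constexpr unsigned kFirstFragId = 0x03;
static const char* const kContiguousNames[] = {
    "Texture Bitmap Name",            // 0x03
    "Texture Bitmap Info",            // 0x04
    "Texture Bitmap Reference Info",  // 0x05
    "Two-dimensional Object",         // 0x06
};
constexpr std::size_t kContiguousCount =
    sizeof(kContiguousNames) / sizeof(kContiguousNames[0]);

// Maps an id to its name; ids outside the run yield nullptr.
const char* NameForFragment(unsigned id) {
    // Unsigned subtraction wraps around for id < kFirstFragId, so a single
    // comparison rejects ids on both sides of the run.
    const unsigned offset = id - kFirstFragId;
    return offset < kContiguousCount ? kContiguousNames[offset] : nullptr;
}
```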
It's impossible to say for sure without seeing more, but it appears that you can avoid the memcpy and instead use a pointer to walk through the data.
For the moment, I've assumed an array of strings for the different fragment types, as recommended by @Chris and @Ates. Even at worst, that will improve readability and maintainability without hurting speed. At best, it might (for example) improve cache usage and give a major speed improvement: one copy of the code to call errorLog.OutputSuccess instead of 30 separate copies could make room for a lot of other "stuff" in the cache.
Avoiding copying the data every time is a lot more likely to do real good, though. At the same time, I should add that this can cause a problem: if the data isn't correctly aligned in the original buffer, attempting to use the pointer won't work.
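A sketch of walking the buffer with a pointer instead of copying each fragment. The real layout of struct_wld_basic_frag is not shown in the question, so `FragHeader` (a payload size followed by an id) and `CollectFragmentIds` are purely hypothetical stand-ins:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical header layout; the real struct_wld_basic_frag is not shown.
struct FragHeader {
    std::uint32_t size;  // assumed: number of payload bytes after this header
    std::uint32_t id;    // fragment type, e.g. 0x03
};

// Walks up to `count` fragments in place and collects their ids without
// copying each header out of the buffer.
// Precondition: `data` is suitably aligned for FragHeader and every payload
// size keeps the next header aligned.
std::vector<std::uint32_t> CollectFragmentIds(const unsigned char* data,
                                              std::size_t length,
                                              std::size_t count) {
    std::vector<std::uint32_t> ids;
    std::size_t pos = 0;
    for (std::size_t i = 0; i < count; ++i) {
        if (length - pos < sizeof(FragHeader)) break;  // truncated header
        const FragHeader* frag =
            reinterpret_cast<const FragHeader*>(data + pos);
        ids.push_back(frag->id);
        pos += sizeof(FragHeader);
        if (frag->size > length - pos) break;  // payload runs past the end
        pos += frag->size;                     // skip to the next header
    }
    return ids;
}
```

If the alignment precondition cannot be guaranteed for a given file, copying only the small header with memcpy remains the portable fallback, and it is still far cheaper than copying whole fragments.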