Can this code be optimized?

Posted on 2024-11-30 20:55:13 · 573 words · 4 views · 0 comments

I think I'm getting obsessed with optimization, so I wonder whether the following code can be "optimized":

Let's say I have a C language linked list, and when creating a new element I use this code:

log_event_list_cur->next = 
    (struct log_event_list *)malloc(sizeof(struct log_event_list));

log_event_list_cur = log_event_list_cur->next;

I wonder if the following code would be ok:

log_event_list_cur = 
    log_event_list_cur->next = 
        (struct log_event_list *) malloc(sizeof(struct log_event_list));

or:

log_event_list_cur->next = 
    log_event_list_cur=(struct log_event_list *) malloc....

Regards!

Comments (5)

叹倦 2024-12-07 20:55:13

Yes, the first alternative is fine (it is correct code that behaves the same as your original), but neither is an optimization. They will most likely be compiled into the same machine code. Do what you find most readable.

The second alternative behaves differently, since log_event_list_cur gets assigned the new list entry before next is set, so the old entry is not guaranteed to be linked to the new one.
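To make the comparison concrete, here is a minimal sketch of the original two-statement form next to the first (chained) alternative. The struct layout, the id field, and the function names append_two_statements and append_chained are assumptions for illustration; the question does not show the real definitions. The second alternative is deliberately left out: it reads and modifies log_event_list_cur in the same expression without any ordering between the two, which, as far as I can tell, makes its behavior undefined in standard C rather than merely different.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical minimal node type; the real struct is not shown in the question. */
struct log_event_list {
    int id;                          /* illustrative payload only */
    struct log_event_list *next;
};

/* Original form: link a new node after cur, then advance to it. */
struct log_event_list *append_two_statements(struct log_event_list *cur)
{
    assert(cur != NULL);
    cur->next = malloc(sizeof *cur->next);
    cur = cur->next;
    if (cur != NULL)
        cur->next = NULL;            /* terminate the list at the new tail */
    return cur;                      /* NULL if malloc failed */
}

/* Chained form: cur->next is assigned first, then its value is stored in cur. */
struct log_event_list *append_chained(struct log_event_list *cur)
{
    assert(cur != NULL);
    cur = cur->next = malloc(sizeof *cur->next);
    if (cur != NULL)
        cur->next = NULL;
    return cur;
}
```

Both functions leave the old node pointing at the new one and return the new tail, so the choice between them comes down to readability.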

森罗 2024-12-07 20:55:13

It's not going to make any difference. Simple assignments like these will be optimized away by the compiler anyway.

Learn at the very least how to get the "gist" of assembly code, and how to dump assembly output from your executable. On Linux, objdump -S -d will show your source code interleaved with the disassembly (the source lines only appear if the binary was built with debug information, e.g. with -g).

心奴独伤 2024-12-07 20:55:13

As already pointed out, the compiler will likely emit the same code for all three versions.

Instead, if you really want to make it faster, implement a free-list, i.e. a second list that holds the currently unused list items. This way, "allocating" a new member means just popping an item out of the free-list (similarly "freeing" means simply pushing the item on the free-list). This way you don't have the malloc/free overhead for every single new "allocation". Obviously, if the free-list is empty and you need to allocate a new member, you'll have to call malloc anyway, but hopefully this will happen rarely.
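The free-list idea described above could look roughly like the following sketch. The names free_list, node_acquire, and node_release, as well as the struct layout, are made up for illustration and are not from the question. The sketch is single-threaded and never hands memory back to the system, which may or may not be acceptable for your program.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical minimal node type; the real struct is not shown in the question. */
struct log_event_list {
    int id;                          /* illustrative payload only */
    struct log_event_list *next;
};

/* Head of the free list: nodes that are allocated but currently unused. */
static struct log_event_list *free_list = NULL;

/* "Allocate": pop a recycled node if one exists, otherwise fall back to malloc.
   Returns NULL only if malloc itself fails. */
struct log_event_list *node_acquire(void)
{
    struct log_event_list *node = free_list;
    if (node != NULL)
        free_list = node->next;      /* pop from the free list */
    else
        node = malloc(sizeof *node); /* free list empty: real allocation */
    if (node != NULL)
        node->next = NULL;
    return node;
}

/* "Free": push the node onto the free list instead of calling free(). */
void node_release(struct log_event_list *node)
{
    assert(node != NULL);
    node->next = free_list;
    free_list = node;
}
```

With this in place, the append code would call node_acquire() where it currently calls malloc, and code that removes entries would call node_release() instead of free. Whether this is actually faster than the system allocator is something only a profiler can tell you.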

BTW, I hope you just omitted the check for the return value of malloc. Otherwise, if malloc returns NULL, you'll likely get a crash soon afterward...
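For completeness, here is one way the append could check the malloc result before touching the list. This is only a sketch: the function name log_list_append, the 0/-1 return convention, and the struct layout are assumptions for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical minimal node type; the real struct is not shown in the question. */
struct log_event_list {
    int id;                          /* illustrative payload only */
    struct log_event_list *next;
};

/* Append a new node after *cur_ptr and advance *cur_ptr to it.
   Returns 0 on success, or -1 if malloc fails (the list is left unchanged). */
int log_list_append(struct log_event_list **cur_ptr)
{
    assert(cur_ptr != NULL && *cur_ptr != NULL);

    struct log_event_list *node = malloc(sizeof *node);
    if (node == NULL)
        return -1;                   /* caller decides how to handle the failure */

    node->id = 0;
    node->next = NULL;
    (*cur_ptr)->next = node;         /* link the old tail to the new node */
    *cur_ptr = node;                 /* advance the cursor */
    return 0;
}
```

Allocating into a temporary first means a failed malloc never overwrites log_event_list_cur, so the existing list stays intact and can still be traversed or freed.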

拥有 2024-12-07 20:55:13

Stop being obsessed with optimization, start being obsessed with readability. Premature optimization is the root of all evil.

The first code fragment is OK. It is clear what's going on. The linked list gets a new entry after the current one (presumably the last one), and then the current pointer is moved forward to the new last entry.

It is much harder to understand what's going on in the second fragment. It is ultimately the same thing as the first example, but one needs to apply mental effort to make sure it really is.

The third fragment is all wrong, and it's the perfect example of why you should not even start to think about starting to think about optimizations before it's time to do so. Make it right, then make it fast, and only do the latter if you have cold, hard profiler data before your very own eyes.

[浮城] 2024-12-07 20:55:13

As mentioned in the other answers, the compiler should be able to optimize away any differences like the ones you show. I would probably go with the first version for readability, and, also for readability, I'd probably set up a #define (or a const in later versions of C) for the value of your sizeof call, plus a typedef for your struct to shorten things a bit.

[edit: as per a comment on the question, the cast isn't even necessary, removed.]

[edit: as per another comment on the question, sizeof(log_node) as a const no longer makes any sense at all; it (sort of) did when it was c_log_node_size vs. sizeof(struct log_event_list), but now it's just totally silly. (There are other good reasons not to do it, as noted in the comment. Maybe sizeof_c_log_node could be OK? No no no.) :D]

typedef struct log_event_list log_node;

and then it becomes:

log_event_list_cur->next = malloc(sizeof(log_node));
log_event_list_cur = log_event_list_cur->next;

If you want to optimize this particular part of your code, I would first suggest profiling your system to make sure it is actually a bottleneck. If it isn't causing problems, there's no need to optimize it: any time spent on it would be better spent elsewhere, since it wouldn't make a difference anyway. There is probably something that could use optimization; it just might not be this. That said, given this particular block of code, the only thing that comes to mind is optimizing how your blocks are allocated.

I started to write some code for that, but I didn't keep it, because it occurred to me that there's no need to reinvent the wheel on this one, other than that it's interesting. If you find that it would be valuable to optimize that bit of code, here's one possibility for tuning your allocations: vmalloc. I wouldn't mess with it unless you could really demonstrate that it was a bottleneck, however. It's kinda neat to think about, though. :)
