Null-terminated strings and their length

Posted 2024-10-06 04:02:58 · 406 words · 3 views · 0 comments

I have legacy code that receives some proprietary information, parses it, and creates a bunch of static char arrays (embedded in the class representing the message) to hold null-terminated strings. Afterwards, pointers to these strings are passed all around and finally serialized to some buffer.

Profiling shows that the str*() functions take a lot of time.

Therefore I would like to use memcpy() wherever possible. To achieve this, I need a way to associate a length with a pointer to a null-terminated string. I thought about:

  • Using std::string, which looks less efficient since it requires memory allocation and thread synchronization.

  • Using std::pair<pointer to string, length>. But in this case I need to maintain the length "manually".

What do you think?

Comments (5)

﹂绝世的画 2024-10-13 04:02:58

Use std::string.

神仙妹妹 2024-10-13 04:02:58

Profiling shows that the str*() functions
take a lot of time

Sure they do ... operating on any array takes a lot of time.

Therefore I would like to use memcpy()
wherever possible. To achieve this, I
need a way to associate a length with
a pointer to a null-terminated string. I
thought about:

memcpy is not really any faster than strcpy. In fact, if you have to call strlen first to find out how much to memcpy, then strcpy is almost certainly faster.

Using std::string looks less
efficient, since it requires memory
allocation and thread synchronization

It may look less efficient, but a lot of better minds than yours or mine have worked on it.

I can use std::pair<pointer to string, length>. But in this case I need to
maintain length "manually".

That's one way to save yourself time on the length calculation. Obviously you need to maintain the length manually. This is effectively how Windows BSTRs work (though there the length is stored in memory immediately before the actual string data). std::string, for example, already does this ...
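To make the pointer-plus-length idea concrete, here is a minimal sketch. The names CountedStr, make_counted and append are made up for illustration and are not from the original code; the assumption is that the length is measured once when the message is parsed and then reused at serialization time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper type: a non-owning pointer to a null-terminated
// string together with its length, measured once and then reused.
struct CountedStr {
    const char* data;
    std::size_t length;  // number of chars, excluding the terminating '\0'
};

// Measure the string once, e.g. right after parsing the message.
inline CountedStr make_counted(const char* s) {
    assert(s != nullptr);
    return CountedStr{s, std::strlen(s)};
}

// Copy the string's characters (without the terminator) to `out` without
// rescanning it. Returns the new write position, or nullptr if the
// remaining space [out, end) is too small.
inline char* append(char* out, const char* end, const CountedStr& s) {
    assert(out != nullptr && end >= out);
    if (static_cast<std::size_t>(end - out) < s.length) return nullptr;
    std::memcpy(out, s.data, s.length);
    return out + s.length;
}
```

The catch is the one already mentioned: if anything modifies the underlying char array after make_counted, the stored length goes stale and nobody will tell you.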

What do you think?

I think your question is asked terribly. There is no real question asked, which makes answering next to impossible. I advise you to ask specific questions in the future.

七月上 2024-10-13 04:02:58

Use std::string. This advice has already been given, but let me explain why:

One, it uses a custom memory allocation scheme. Your char* strings are probably malloc'ed. That means they are worst-case aligned, which really isn't needed for a char[]. std::string doesn't suffer from needless alignment. Furthermore, common implementations of std::string use the "Small String Optimization", which eliminates the heap allocation altogether for short strings and improves locality of reference. The string size will be on the same cache line as the char[] itself.

Two, it keeps the string length, which is indeed a speed optimization. Most str* functions are slower because they don't have this information up front.
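As a rough illustration of that second point, here is a sketch of serializing a std::string field with a single memcpy. write_field is a hypothetical helper name, and the layout (the field followed by its terminator) is an assumption about what the serializer wants.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: copy `s` plus its terminating '\0' into `out`.
// Returns the number of bytes written, or 0 if `capacity` is too small.
inline std::size_t write_field(char* out, std::size_t capacity,
                               const std::string& s) {
    assert(out != nullptr);
    const std::size_t n = s.size() + 1;  // size() is constant time; +1 for '\0'
    if (n > capacity) return 0;
    std::memcpy(out, s.c_str(), n);      // c_str() is always null-terminated
    return n;
}
```

No strlen or strcpy is involved here: the string already knows its length, so the copy is one memcpy of a known size.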

A second option would be a rope class, e.g. the one from SGI. This can be more efficient by eliminating some string copies.

听闻余生 2024-10-13 04:02:58

Your post doesn't explain where the str*() function calls are coming from; passing around char * certainly doesn't invoke them. Identify the sites that actually do the string manipulation and then try to find out if they're doing so inefficiently. One common pitfall is that strcat first needs to scan the destination string for the terminating 0 character. If you call strcat several times in a row, you can end up with an O(N^2) algorithm, so be careful about this.
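A sketch of how to avoid the strcat pitfall, assuming the pieces are plain null-terminated strings: keep a cursor at the end of what has been written so far, so that each append costs only the length of the new piece. The function name join is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Quadratic pattern: every strcat call rescans `dst` from the start.
//   for (i = 0; i < count; ++i) std::strcat(dst, parts[i]);

// Linear alternative: remember where the previous append ended.
// Returns false (leaving a terminated prefix in `dst`) if the result
// plus its '\0' does not fit into `capacity` bytes.
inline bool join(char* dst, std::size_t capacity,
                 const char* const* parts, std::size_t count) {
    assert(dst != nullptr && capacity > 0);
    char* cursor = dst;
    std::size_t remaining = capacity - 1;  // reserve room for '\0'
    for (std::size_t i = 0; i < count; ++i) {
        const std::size_t n = std::strlen(parts[i]);
        if (n > remaining) {
            *cursor = '\0';
            return false;
        }
        std::memcpy(cursor, parts[i], n);
        cursor += n;
        remaining -= n;
    }
    *cursor = '\0';
    return true;
}
```

Each source string is still scanned once by strlen, but the destination is never rescanned, which is where the quadratic cost comes from.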

Replacing strcpy with memcpy doesn't make any significant difference; strcpy doesn't do an extra pass to find the length of the string, it's simply (conceptually!) a character-by-character copy that stops when it encounters the terminating 0. This is not much more expensive than memcpy, and always cheaper than strlen followed by memcpy.

The way to gain performance on string operations is to avoid copies where possible; don't worry about making the copying faster, instead try to copy less! And this holds for all string (and array) implementations, whether it be char *, std::string, std::vector<char>, or some custom string / array class.
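One possible way to "copy less", sketched under the assumption that C++17 is available: hand out std::string_view objects that point into the original message buffer instead of copying each field into its own array. split_fields and the ';' separator are invented for this example; the real message format is unknown.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Split `message` on `sep` without copying any characters. Each returned
// view points into the original buffer, which must outlive the result.
inline std::vector<std::string_view> split_fields(std::string_view message,
                                                  char sep) {
    std::vector<std::string_view> fields;
    std::size_t start = 0;
    while (true) {
        assert(start <= message.size());
        const std::size_t pos = message.find(sep, start);
        if (pos == std::string_view::npos) {
            fields.push_back(message.substr(start));  // last (maybe empty) field
            break;
        }
        fields.push_back(message.substr(start, pos - start));
        start = pos + 1;
    }
    return fields;
}
```

Note that the views are not null-terminated, so they cannot be passed to str*() functions directly; the only copy left is the final one into the output buffer, using data() and size().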

如若梦似彩虹 2024-10-13 04:02:58

What do I think? I think that you should do what everyone else obsessed with pre-optimization does. You should find the most obscure, unmaintainable, yet intuitively (to you anyway) high-performance way you can and do it that way. Sounds like you're onto something with your pair<char*,len> with malloc/memcpy idea there.

Whatever you do, do NOT use pre-existing, optimized wheels that make maintenance easier. Being maintainable is simply the least important thing imaginable when you're obsessed with intuitively measured performance gains. Further, as you well know, you're quite a bit smarter than those who wrote your compiler and its standard library implementation. So much so that you'd be seriously silly to trust their judgment on anything; you should really consider rewriting the entire thing yourself because it would perform better.

And ... the very LAST thing you'll want to do is use a profiler to test your intuition. That would be too scientific and methodical, and we all know that science is a bunch of bunk that's never gotten us anything; we also know that personal intuition and revelation are never, ever wrong. Why waste the time measuring with an objective tool when you've already intuitively grasped the situation's seemingliness?

Keep in mind that I'm being 100% honest in my opinion here. I don't have a sarcastic bone in my body.
