How does Boost find_first work? / Defining a range

Posted 2025-01-07 11:28:13 · 1080 characters · 0 views · 0 comments

I have a buffer (e.g. char buffer[1024]) which gets filled with some data. Now I want to search for a substring in this buffer. Since it should be a case-insensitive search, I am using boost::algorithm::ifind_first.

So I call the function like this:

boost::iterator_range<char*> buf_iterator;
buf_iterator = boost::algorithm::ifind_first(buffer, "substring");

This actually works fine. But my concern is the following:

I pass the function just a char pointer, so ifind_first should have no idea where my buffer ends, yet it still works.

Now my first idea was that the function searches until a string-termination character. But in the Boost Documentation the function is defined like this:

template<typename Range1T, typename Range2T> 
  iterator_range< typename range_iterator< Range1T >::type > 
  find_first(Range1T & Input, const Range2T & Search);

Since it works with template parameters, I doubt that it relies on null termination.

So my question is: how does ifind_first know where to stop?
Or, to be more precise, how can I give it a range? As already mentioned, it works just fine with a char*, but I'm not sure whether I was just lucky. In the worst case the function is called, doesn't know where to stop, and runs into undefined memory...

Edit:

An answer now mentions that it depends on the type I pass to the function. Would this mean that if I work with a char buffer, I always have to make sure it's 0-terminated?

So要识趣 2025-01-14 11:28:13

It uses a technique where the length of an array is a template argument, i.e.:

template< typename T, size_t L >
void foo( T (&arr)[L] )
{
}

Since a string literal has a known length, L can be deduced: foo( "test" ) becomes foo< const char, 5 >(). I bet there's an overload for const char* where the argument is assumed to be a C string, so strlen() can be used to determine the length.
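The deduction described above can be tried out with standard C++ alone. This is only a sketch: array_extent and c_string_length are hypothetical names made up for illustration, not anything from Boost, and they are kept under separate names so the example does not depend on overload-resolution rules between array references and pointers.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: N is deduced from the argument's array type,
// so the result is the full extent, including any trailing '\0'.
template <typename T, std::size_t N>
std::size_t array_extent(T (&)[N])
{
    return N;
}

// Hypothetical helper: once only a pointer is left, the length can
// only be found by scanning for the terminating '\0'.
inline std::size_t c_string_length(const char* s)
{
    return std::strlen(s);
}
```

For a char buffer[1024] holding a short string, the first helper reports the size of the whole array while the second reports only the characters before the first terminator, which is exactly the difference the question is worried about.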

EDIT: A better explanation of how ifind_first can fail, and why it won't if you're careful

What decides whether ifind_first will fail in this case is whether either the subject or the search decays into a char*. Here you've passed a string literal directly as the search, so ifind_first will conclude that it's a const char[ 10 ] (the length of "substring" + 1 for the NULL terminator). For the search it does not matter, though: even if it decays to const char*, ifind_first will guess that it's a NULL-terminated C string, and a string literal is a NULL-terminated C string, so it works fine.

What you're really asking about is char buffer[1024], and in your case it does not decay to char*. But if you had instead written, say, char* buffer = new char[1024]; then the type of buffer would be char*, and it is not guaranteed to be NULL-terminated. In that case ifind_first will fail in mysterious ways, depending on what comes after the area you've filled.

So, to conclude: since the type of buffer is char[1024] in your case, it will not touch memory past the end of buffer. BUT it also will not care whether there is a NULL terminator in there (it doesn't look for one; since you passed it a char[1024], it knows the length at compile time). So if, say, you fill buffer with 12 characters followed by NULL, it will still search the whole buffer.
