Acceptable fix for majority of signed/unsigned warnings?

Posted 2024-07-08 06:39:04

I myself am convinced that, in the project I'm working on, signed integers are the best choice in the majority of cases, even when the value they hold can never be negative. (Simpler reverse for loops, less chance of bugs, etc., in particular for integers that can only hold values between 0 and, say, 20 anyway.)

Most of the places where this goes wrong are simple iterations over a std::vector; often this used to be an array and was changed to a std::vector later. So these loops generally look like this:

for (int i = 0; i < someVector.size(); ++i) { /* do stuff */ }

Because this pattern is used so often, the compiler warning spam about the comparison between signed and unsigned types tends to hide more useful warnings. Note that we definitely do not have vectors with more than INT_MAX elements, and that until now we have used two ways to fix the compiler warning:

for (unsigned i = 0; i < someVector.size(); ++i) { /*do stuff*/ }

This usually works but might silently break if the loop contains any code like 'if (i-1 >= 0) ...', etc.
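To make the failure mode concrete, here is a minimal sketch of what `i - 1` evaluates to for each index type. The helper names `previous_unsigned` and `previous_signed` are made up for this illustration and are not part of the original code.

```cpp
#include <limits>

// What `i - 1` evaluates to when i is an unsigned loop index.
// For i == 0 the result wraps around to the largest unsigned value, and an
// unsigned value is never negative, so a guard such as `i - 1 >= 0` is
// always true.
inline unsigned previous_unsigned(unsigned i) { return i - 1; }

// The same expression with a signed index: for i == 0 the result is -1,
// so a guard such as `i - 1 >= 0` correctly rejects it.
inline int previous_signed(int i) { return i - 1; }
```

With the unsigned version the guard can never skip the first element, which is why the loop breaks silently instead of failing loudly.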

for (int i = 0; i < static_cast<int>(someVector.size()); ++i) { /*do stuff*/ }

This change does not have any side effects, but it does make the loop a lot less readable. (And it's more typing.)

So I came up with the following idea:

template <typename T> struct vector : public std::vector<T>
{
    typedef std::vector<T> base;

    int size() const     { return base::size(); }
    int max_size() const { return base::max_size(); }
    int capacity() const { return base::capacity(); }

    vector()                  : base() {}
    vector(int n)             : base(n) {}
    vector(int n, const T& t) : base(n, t) {}
    vector(const base& other) : base(other) {}
};

template <typename Key, typename Data> struct map : public std::map<Key, Data>
{
    typedef std::map<Key, Data> base;
    typedef typename base::key_compare key_compare;

    int size() const     { return base::size(); }
    int max_size() const { return base::max_size(); }

    int erase(const Key& k) { return base::erase(k); }
    int count(const Key& k) { return base::count(k); }

    map() : base() {}
    map(const key_compare& comp) : base(comp) {}
    template <class InputIterator> map(InputIterator f, InputIterator l) : base(f, l) {}
    template <class InputIterator> map(InputIterator f, InputIterator l, const key_compare& comp) : base(f, l, comp) {}
    map(const base& other) : base(other) {}
};

// TODO: similar code for other container types

What you see is basically the STL classes with the methods which return size_type overridden to return just 'int'. The constructors are needed because these aren't inherited.

What would you think of this as a developer, if you saw a solution like this in an existing codebase?

Would you think 'whaa, they're redefining the STL, what a huge WTF!', or would you think this is a nice simple solution to prevent bugs and increase readability? Or maybe you'd rather see that we had spent (half) a day or so on changing all these loops to use std::vector<>::iterator?

(In particular if this solution was combined with banning the use of unsigned types for anything but raw data (e.g. unsigned char) and bit masks.)

Comments (8)

扬花落满肩 2024-07-15 06:39:04

Don't derive publicly from STL containers. They have nonvirtual destructors, so deleting one of your objects through a pointer to base invokes undefined behaviour. If you must derive from, e.g., a vector, do it privately and expose the parts you need with using declarations.
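A rough sketch of the private-derivation approach might look like this. The class name `int_sized_vector` and the particular set of re-exported members are assumptions chosen for illustration; a real wrapper would expose whatever the codebase actually uses.

```cpp
#include <vector>

// Hypothetical wrapper: inherits std::vector<T> privately, so outside code
// cannot convert a pointer to it into std::vector<T>* and delete through it.
template <typename T>
class int_sized_vector : private std::vector<T> {
    typedef std::vector<T> base;

public:
    int_sized_vector() : base() {}
    explicit int_sized_vector(int n)
        : base(static_cast<typename base::size_type>(n)) {}

    // Re-export only the members callers are meant to use.
    using base::push_back;
    using base::operator[];
    using base::begin;
    using base::end;
    using base::empty;

    // Signed size with an explicit cast (assumes the size fits in an int).
    int size() const { return static_cast<int>(base::size()); }
};
```

A loop such as `for (int i = 0; i < v.size(); ++i)` then compiles without a signed/unsigned warning when `v` is an `int_sized_vector`.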

Here, I'd just use a size_t as the loop variable. It's simple and readable. The poster who commented that using an int index exposes you as a n00b is correct. However, using an iterator to loop over a vector exposes you as a slightly more experienced n00b - one who doesn't realize that the subscript operator for vector is constant time. (vector<T>::size_type is accurate, but needlessly verbose IMO).

梦回旧景 2024-07-15 06:39:04

While I don't think "use iterators, otherwise you look n00b" is a good solution to the problem, deriving from std::vector appears much worse than that.

First, developers do expect vector to be std::vector, and map to be std::map. Second, your solution does not scale to other containers, or to other classes/libraries that interact with containers.

Yes, iterators are ugly, iterator loops are not very well readable, and typedefs only cover up the mess. But at least, they do scale, and they are the canonical solution.

My solution? An stl-for-each macro. That is not without problems (mainly, it is a macro, yuck), but it gets the meaning across. It is not as advanced as e.g. this one, but it does the job.
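The macro itself is not shown in the answer, so the following is only a guess at what such a helper could look like. The name `STL_FOR_EACH` is hypothetical, and the sketch assumes a compiler with `auto`; a pre-C++11 version would need the iterator type passed in.

```cpp
#include <vector>

// Hypothetical for-each macro: declares iterator `it` and walks `container`.
// Caveat: the container expression is evaluated more than once, so it should
// be a plain variable rather than a function call.
#define STL_FOR_EACH(it, container) \
    for (auto it = (container).begin(); it != (container).end(); ++it)

// Example use: sum all elements without writing out the iterator loop.
inline int sum_all(const std::vector<int>& v) {
    int total = 0;
    STL_FOR_EACH(it, v) {
        total += *it;
    }
    return total;
}
```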

尽揽少女心 2024-07-15 06:39:04

I made this community wiki... Please edit it. I don't agree with the advice against "int" anymore. I now see it as not bad.

Yes, I agree with Richard. You should never use 'int' as the counting variable in a loop like those. The following is how you might want to do various loops using indices (although there is little reason to, occasionally this can be useful).

Forward

for(std::vector<int>::size_type i = 0; i < someVector.size(); i++) {
    /* ... */
}

Backward

You can do this, which is perfectly defined behavior:

for(std::vector<int>::size_type i = someVector.size() - 1; 
    i != (std::vector<int>::size_type) -1; i--) {
    /* ... */
}

Soon, with c++1x (next C++ version) coming along nicely, you can do it like this:

for(auto i = someVector.size() - 1; i != (decltype(i)) -1; i--) {
    /* ... */
}

Decrementing below 0 will cause i to wrap around, because it is unsigned.

But unsigned will make bugs slurp in

That should never be an argument for doing it the wrong way (using 'int').

Why not use std::size_t above?

The C++ Standard (23.1 p5, Container Requirements) defines T::size_type, for T being some Container, as an implementation-defined unsigned integral type. Using std::size_t for i above would let bugs slurp in silently: if T::size_type is narrower or wider than std::size_t, then i will either overflow or, when someVector.size() == 0, never even reach (std::size_t)-1. Either way, the condition of the loop would be broken completely.
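The mismatch can be simulated without an exotic container. In this sketch a 16-bit type stands in for a size_type narrower than std::size_t; `narrow_size_type` and both function names are invented for the illustration, and it assumes std::size_t is wider than 16 bits. On mainstream implementations std::vector's size_type is std::size_t, so this models the hypothetical case only.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in for a container size_type that is narrower than std::size_t.
typedef std::uint16_t narrow_size_type;

// Start index of the backward loop (`size() - 1`), computed in the narrow
// size_type and then stored in a std::size_t loop variable.
inline std::size_t start_index_in_size_t(narrow_size_type size) {
    // For size == 0 this wraps to 65535 in the 16-bit type.
    narrow_size_type last = static_cast<narrow_size_type>(size - 1);
    return last;  // widening keeps 65535; it does not become (std::size_t)-1
}

// True if the start index equals the sentinel the loop compares against.
inline bool sentinel_matches(narrow_size_type size) {
    return start_index_in_size_t(size) == static_cast<std::size_t>(-1);
}
```

For an empty container the start index is 65535 rather than the sentinel, so a loop written with `i != (std::size_t) -1` would begin iterating instead of stopping immediately.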

相权↑美人 2024-07-15 06:39:04

Definitely use an iterator. Soon you will be able to use the 'auto' type, for better readability (one of your concerns) like this:

for (auto i = someVector.begin();
     i != someVector.end();
     ++i) { /* use *i */ }

手心的海 2024-07-15 06:39:04

Skip the index

The easiest approach is to sidestep the problem by using iterators, range-based for loops, or algorithms:

for (auto it = begin(v); it != end(v); ++it) { ... }
for (const auto &x : v) { ... }
std::for_each(v.begin(), v.end(), ...);

This is a nice solution if you don't actually need the index value. It also handles reverse loops easily.
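As a small sketch of the reverse case, reverse iterators walk the container back to front without any index arithmetic. The function name `reversed_copy` is just for illustration.

```cpp
#include <vector>

// Builds a back-to-front copy of v using reverse iterators, so there is no
// index and therefore no signed/unsigned question.
inline std::vector<int> reversed_copy(const std::vector<int>& v) {
    std::vector<int> out;
    out.reserve(v.size());
    for (auto it = v.rbegin(); it != v.rend(); ++it) {
        out.push_back(*it);
    }
    return out;
}
```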

Use an appropriate unsigned type

Another approach is to use the container's size type.

for (std::vector<T>::size_type i = 0; i < v.size(); ++i) { ... }

You can also use std::size_t (from <cstddef>). There are those who (correctly) point out that std::size_t may not be the same type as std::vector<T>::size_type (though it usually is). You can, however, be assured that the container's size_type will fit in a std::size_t. So everything is fine, unless you use certain styles for reverse loops. My preferred style for a reverse loop is this:

for (std::size_t i = v.size(); i-- > 0; ) { ... }

With this style, you can safely use std::size_t, even if it's a larger type than std::vector<T>::size_type. The style of reverse loop shown in some of the other answers requires casting a -1 to exactly the right type and thus cannot use the easier-to-type std::size_t.
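To see why this style is safe, the sketch below records the indices the loop visits. The helper name `visited_indices_reverse` is made up for the example.

```cpp
#include <cstddef>
#include <vector>

// Records the indices visited by the `i-- > 0` reverse loop.
// The test `i-- > 0` compares the old value of i and then decrements, so the
// body sees size-1 down to 0, and an empty vector never enters the body.
inline std::vector<std::size_t> visited_indices_reverse(const std::vector<int>& v) {
    std::vector<std::size_t> visited;
    for (std::size_t i = v.size(); i-- > 0; ) {
        visited.push_back(i);
    }
    return visited;
}
```

After the loop ends, i has wrapped around, but it goes out of scope without being used, so no sentinel comparison against -1 is ever needed.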

Use a signed type (carefully!)

If you really want to use a signed type (or if your style guide practically demands one), like int, then you can use this tiny function template that checks the underlying assumption in debug builds and makes the conversion explicit so that you don't get the compiler warning message:

#include <cassert>
#include <cstddef>
#include <limits>

template <typename ContainerType>
constexpr int size_as_int(const ContainerType &c) {
    const auto size = c.size();  // if no auto, use `typename ContainerType::size_type`
    assert(size <= static_cast<std::size_t>(std::numeric_limits<int>::max()));
    return static_cast<int>(size);
}

Now you can write:

for (int i = 0; i < size_as_int(v); ++i) { ... }

Or reverse loops in the traditional manner:

for (int i = size_as_int(v) - 1; i >= 0; --i) { ... }

The size_as_int trick takes only slightly more typing than the loops with implicit conversions. The underlying assumption is checked at runtime in debug builds, and the explicit cast silences the compiler warning. Non-debug builds run at the same speed as before because the function will almost certainly be inlined, and the optimized object code shouldn't be any larger because the template doesn't do anything the compiler wasn't already doing implicitly.

放我走吧 2024-07-15 06:39:04

You're overthinking the problem.

Using a size_t variable is preferable, but if you don't trust your programmers to use unsigned correctly, go with the cast and just deal with the ugliness. Get an intern to change them all and don't worry about it after that. Turn on warnings as errors and no new ones will creep in. Your loops may be "ugly" now, but you can understand that as the consequences of your religious stance on signed versus unsigned.

早乙女 2024-07-15 06:39:04

vector.size() returns a size_t var, so just change int to size_t and it should be fine.

Richard's answer is more correct, except that it's a lot of work for a simple loop.

亢潮 2024-07-15 06:39:04

I notice that people have very different opinions on this subject. I also have an opinion that does not convince others, so it makes sense to look for support from some gurus, and I found the C++ Core Guidelines:

https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines

maintained by Bjarne Stroustrup and Herb Sutter, and their last update, upon which I base the information below, is of April 10, 2022.

Please take a look at the following code rules:

  • ES.100: Don’t mix signed and unsigned arithmetic
  • ES.101: Use unsigned types for bit manipulation
  • ES.102: Use signed types for arithmetic
  • ES.107: Don’t use unsigned for subscripts, prefer gsl::index

So, supposing that we want to index in a for loop and for some reason the range-based for loop is not the appropriate solution, then using an unsigned type is also not the preferred solution. The suggested solution is using gsl::index.

But in case you don’t have gsl around and you don’t want to introduce it, what then?

In that case I would suggest a utility function template as suggested by Adrian McCarthy: size_as_int.
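For comparison, here is a sketch of a gsl::index-style loop that does not pull in the GSL. The GSL defines gsl::index as std::ptrdiff_t; the namespace `util`, the helper `signed_size`, and the function `last_index_of` are assumptions made up for this example.

```cpp
#include <cstddef>
#include <vector>

namespace util {
// Local stand-in for gsl::index (a signed type wide enough for any index).
using index = std::ptrdiff_t;

// Container size as a signed index, with the conversion made explicit.
template <typename Container>
index signed_size(const Container& c) {
    return static_cast<index>(c.size());
}
}  // namespace util

// Returns the last position of value in v, or -1 if it does not occur.
// The signed index lets the reverse loop use the plain `i >= 0` condition.
inline util::index last_index_of(const std::vector<int>& v, int value) {
    for (util::index i = util::signed_size(v) - 1; i >= 0; --i) {
        if (v[static_cast<std::size_t>(i)] == value) {
            return i;
        }
    }
    return -1;
}
```

The cast back to std::size_t at the subscript keeps the signed-to-unsigned conversion visible in one place rather than spread across every comparison.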
