Enumerable.Average and OverflowException
Perhaps a useless question:
    public static double Average<TSource>(
        this IEnumerable<TSource> source,
        Func<TSource, int> selector
    )
One of the exceptions thrown by the above method is OverflowException: "The sum of the elements in the sequence is larger than Int64.MaxValue."
I assume the reason for this exception is that the sum of the averaged values is computed using a variable S of type long? But since the return value is of type double, why didn't the designers choose to make S also of type double?
Thank you
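For context, here is a minimal sketch of how an overload like this could accumulate its sum in a long. It is an illustration under assumptions, not the framework's actual source: the class name AverageSketch and the method name AverageWithLongSum are hypothetical.

```csharp
using System;
using System.Collections.Generic;

static class AverageSketch
{
    // Hypothetical stand-in for the overload in the question; not the framework source.
    public static double AverageWithLongSum<TSource>(
        this IEnumerable<TSource> source,
        Func<TSource, int> selector)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (selector == null) throw new ArgumentNullException(nameof(selector));

        long sum = 0;    // the variable "S" from the question
        long count = 0;
        checked
        {
            foreach (TSource item in source)
            {
                // The checked addition throws OverflowException once the
                // running total would leave the range of a long.
                sum += selector(item);
                count++;
            }
        }
        if (count == 0) throw new InvalidOperationException("Sequence contains no elements");

        // The only floating point operation is this final division.
        return (double)sum / count;
    }
}
```

With this shape, every intermediate sum is an exact integer, and the conversion to double happens once at the end.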
Comments (2)
Because this particular overload knows that you're starting out with int values, it knows you're not using decimal values. Converting each of your values to a double and then adding the double values together would probably be less efficient, and would definitely open you up to the possibility of floating point imprecision issues if you had a large enough collection of values.
Update
I just did a quick benchmark, and it takes over twice as long to average doubles as it does to average ints.
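The benchmark mentioned in the update above is not shown. A rough sketch of how such a comparison might be run follows, assuming Stopwatch-based timing over in-memory arrays. The class AverageBenchmark and the helper Time are hypothetical names, and the ratio you observe will depend on the runtime, the hardware, and the array size, so it may not match the figure quoted.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

static class AverageBenchmark
{
    // Hypothetical helper: runs the action a number of times and returns the total elapsed time.
    public static TimeSpan Time(Action action, int repetitions)
    {
        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < repetitions; i++) action();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    public static void Main()
    {
        int[] ints = Enumerable.Range(0, 10000000).ToArray();
        double[] doubles = ints.Select(i => (double)i).ToArray();

        // Warm-up runs so that JIT compilation is not included in the timings.
        Time(() => ints.Average(), 1);
        Time(() => doubles.Average(), 1);

        Console.WriteLine("int:    " + Time(() => ints.Average(), 10));
        Console.WriteLine("double: " + Time(() => doubles.Average(), 10));
    }
}
```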
First off, I note that the exception does not arise until you have exceeded the bounds of a long. How are you going to do that? Each int can be at most about two billion, and the top of a long is about nine billion billion, so you would have to be taking the average of more than four billion ints at minimum in order to trigger the exception. Is that the sort of problem you regularly have to solve?
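The "more than four billion" figure can be checked with integer arithmetic. The sketch below assumes the worst case in which every element equals int.MaxValue; the class OverflowThreshold and its method names are hypothetical.

```csharp
using System;

static class OverflowThreshold
{
    // Largest number of int.MaxValue terms whose sum still fits in a long.
    public static long MaxTermsThatFit() => long.MaxValue / int.MaxValue;

    // True if summing 'terms' copies of int.MaxValue would overflow a long.
    public static bool SumOverflows(long terms)
    {
        try
        {
            _ = checked(terms * int.MaxValue);
            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }

    public static void Main()
    {
        long fits = MaxTermsThatFit();
        Console.WriteLine(fits);                    // 4294967298
        Console.WriteLine(SumOverflows(fits));      // False
        Console.WriteLine(SumOverflows(fits + 1));  // True
    }
}
```

So under that assumption a sequence needs at least 4,294,967,299 elements, each equal to int.MaxValue, before the sum can exceed Int64.MaxValue.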
Suppose for the sake of argument it is. Doing the math in doubles loses precision because double arithmetic is rounded off to about fifteen significant decimal digits. Watch:
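A minimal sketch of such a demonstration, built from the description of the two series. The names AverageDemo, DoubleAverage, BillionsFirst and OnesFirst are hypothetical and not necessarily those of the original code, and Enumerable.Repeat and Concat are used here for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class AverageDemo
{
    // Hypothetical helper: averages ints using a double accumulator
    // instead of a long one. Assumes a non-empty sequence.
    public static double DoubleAverage(this IEnumerable<int> source)
    {
        double sum = 0.0;
        long count = 0;
        foreach (int item in source)
        {
            sum += item;   // each addition may be rounded
            count++;
        }
        return sum / count;
    }

    // Ten million billions followed by ninety million ones.
    public static IEnumerable<int> BillionsFirst() =>
        Enumerable.Repeat(1000000000, 10000000).Concat(Enumerable.Repeat(1, 90000000));

    // The same values with the ones first and the billions last.
    public static IEnumerable<int> OnesFirst() =>
        Enumerable.Repeat(1, 90000000).Concat(Enumerable.Repeat(1000000000, 10000000));

    public static void Main()
    {
        // Billions first: once the sum reaches 10^16, adjacent doubles are
        // 2 apart, so each added 1 is rounded away.
        Console.WriteLine(BillionsFirst().DoubleAverage());
        // Ones first: the small values are absorbed while the sum is still small.
        Console.WriteLine(OnesFirst().DoubleAverage());

        // The built-in int overload gives the same result in either order.
        Console.WriteLine(BillionsFirst().Average() == OnesFirst().Average());
    }
}
```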
Here we average two series with double arithmetic: one that is {a billion, a billion, a billion ... ten million times ... a billion, one, one, one ... ninety million times} and one that is the same sequence with the ones first and the billions last. If you run the code, you get different results. Not hugely different, but different, and the difference will become larger and larger the longer the sequences get. Long arithmetic is exact; double arithmetic potentially rounds off for every calculation, and that means that massive error can accrue over time.
It seems very unexpected to do an operation solely on ints that results in an accumulation of floating point rounding error. That's the sort of thing one expects when doing an operation on floats, but not when doing it on ints.