Best method to calculate accuracy and display meaningful results
My current method allows me to determine the most accurate array but I cannot figure out a good way to display informative results.
Here’s my situation …
I compare X integer arrays to a static integer array. For each position in an array, I calculate that position's accuracy result by comparing it to the equivalent position in the static array. Once the accuracy result for the array's last position has been determined, I store the sum of all accuracy results for that array for later comparison.
Once the sum of accuracy results has been saved for every array, the sums are compared to one another. The array with the lowest sum is deemed the most accurate.
Pseudo code …
foreach (ComparableArray as SingleArray) {
    for (i = 0; i < count(SingleArray); i++) {
        AccuracyResults[SingleArray] += |StaticArray[i] - SingleArray[i]| / CONSTANT;
    }
}
BestArray = AscendingSort(AccuracyResults)[0];
Accuracy is determined by taking the absolute value of the difference between the SingleArray value and the StaticArray value and dividing by some constant. A result of 0 is perfect, a result <= 1 is deemed accurate, and a result > 1 is inaccurate.
Here's a scenario ... let's use two arrays for simplicity
S = [ 56, 53, 50, 64 ]
A = [ 56, 54, 52, 64 ]
B = [ 54, 52, 51, 63 ]
Looping through each array starting with A.
Compare position [1] of A(56) and S(56) for accuracy.
Determine accuracy (I'll use two for my constant)
|56-56|=0, 0 / 2 = 0; Perfect accuracy
Continue to compare each position and compute accuracy
|53-54|=1, 1 / 2 = 0.5; Accurate because <= 1
|50-52|=2, 2 / 2 = 1; Accurate
|64-64| = 0; Perfect
Now compute the sum of all accuracy results for array A
0 + 0.5 + 1 + 0 = 1.5
If we do the same operations for array B the final result will be
1 + 0.5 + 0.5 + 0.5 = 2.5
Now if we compare array A to B we can see that array A is more accurate than B because the sum is lower.
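The worked example above can be reproduced with a short Python sketch. The function name `accuracy_sum` and the dictionary layout are illustrative choices, not part of the original pseudo code; the arrays and the constant 2 are the ones from the scenario.

```python
def accuracy_sum(static, candidate, constant):
    """Sum of |static[i] - candidate[i]| / constant over all positions."""
    if len(static) != len(candidate):
        raise ValueError("arrays must have the same length")
    return sum(abs(s - c) / constant for s, c in zip(static, candidate))


S = [56, 53, 50, 64]
candidates = {"A": [56, 54, 52, 64], "B": [54, 52, 51, 63]}
CONSTANT = 2  # the constant used in the scenario

# One error sum per candidate array; the lowest sum is the most accurate.
sums = {name: accuracy_sum(S, arr, CONSTANT) for name, arr in candidates.items()}
best = min(sums, key=sums.get)

print(sums)  # {'A': 1.5, 'B': 2.5}
print(best)  # A
```

Using `min` with a key avoids sorting the whole result set when only the best array is needed.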
The problem is that 1.5 and 2.5 are not very meaningful when trying to display how much more accurate A is than B.
What would be the best method to display these results?
I thought about displaying percentages … such as A is 17% better than B. Or the BestArray is 6% better than average.
How would I compute those results?
Do you see any logic problems in my way of computing accuracy or know of a better way?
Thanks for any insight you can provide!
Comments (4)
I tend to agree with @Martin that using numerical values to quantify the difference between qualitative measurements is a bit dodgy. However, people do it all the time, so if you want to carry on doing it, go right ahead!
Now, what I really wanted to write is that your pseudo-code is not terribly pseudo- at all. Here's the pseudo-code that I would write:
which specifies the same calculation as your version. Now, you may or may not recognise this to be a valid Mathematica statement, but that's beside the point. The point is that you have hit upon one of a myriad functions for measuring the distance between two vectors. Other distance measures include the Euclidean distance, and the Chessboard distance.
You could also use any one of a number of vector norms for measuring the distance between your vectors. For example, Mathematica gives the result sqrt(5) for the calculation:
So, if you do want to indulge in some dodgy pseudo-statistics, Google around for some definitions of vector distances and norms. I guess you'll find code or at least imperative algorithms too.
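As a rough Python sketch (rather than Mathematica) of the distance measures named above, the three functions below compute the Manhattan, Euclidean and Chessboard distances between two vectors. The function names are my own, and the vectors are S and A from the question. The question's accuracy sum is the Manhattan distance divided by the constant, and for S and A the Euclidean distance works out to sqrt(5).

```python
import math


def manhattan(u, v):
    """Sum of absolute differences (the question's sum before dividing by the constant)."""
    return sum(abs(a - b) for a, b in zip(u, v))


def euclidean(u, v):
    """Square root of the sum of squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def chessboard(u, v):
    """Largest absolute difference at any single position."""
    return max(abs(a - b) for a, b in zip(u, v))


S = [56, 53, 50, 64]
A = [56, 54, 52, 64]

print(manhattan(S, A))   # 3
print(euclidean(S, A))   # sqrt(5), about 2.236
print(chessboard(S, A))  # 2
```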
Regards
Mark
PS Don't tell anyone I helped you with pseudo-science :-)
Relative percentages are a bad idea, because people are very bad at judging what that means in practice - for more explanation, see the book Bad Science.
Just display the sums in order from most accurate to least and explain the rating system. I don't think turning them into any sort of percentage is helpful, but it would be a good idea to give some guide figures or banding (say by colouring the text or background) of what good, middling and poor accuracy would be.
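A minimal sketch of the banding idea, in Python. The cut-offs `GOOD_MAX` and `MIDDLING_MAX` are assumptions chosen only for illustration (an average of 0.5 and 1 per position over four positions); real thresholds would have to come from whatever counts as acceptable in the application.

```python
def band(error_sum, good_max, middling_max):
    """Map an error sum to a coarse label; lower sums are better."""
    if error_sum <= good_max:
        return "good"
    if error_sum <= middling_max:
        return "middling"
    return "poor"


# Error sums from the question's example.
sums = {"A": 1.5, "B": 2.5}

# Hypothetical thresholds, not taken from the question.
GOOD_MAX = 2.0
MIDDLING_MAX = 4.0

# Most accurate (lowest sum) first, each with its band label.
report = [
    (name, total, band(total, GOOD_MAX, MIDDLING_MAX))
    for name, total in sorted(sums.items(), key=lambda item: item[1])
]

for name, total, label in report:
    print(f"{name}: {total} ({label})")
```

The label could then drive the text or background colour when the list is displayed.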
Finally, your question is very specific to your own program and, the way it is phrased, is unlikely to be of use to many other people. Here we prefer questions to be specific in technical topic but generally applicable to other problems, so if you phrase your problems more generally next time it makes for a better resource.
Your "position accuracy" is just an error which, if normally distributed (as one would expect), can be modeled with a Gaussian distribution. If so, since sums of independent Gaussian random variables are themselves Gaussian, your "sum of all accuracy" number is also a Gaussian-distributed random variable. You can compute the mean and variance of these error sums, model your system with a Gaussian PDF (probability density function), and use it to answer questions like "that last clunky vector should be bright red because it had an error sum larger than 95% of all such vectors". Or "wow, that last vector was A+ because it had an error less than 1% of all other such vectors".
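One way this could look in Python, assuming the error sums really are roughly normal, is to fit a normal distribution to previously observed sums and read a new sum's position off the cumulative distribution. The `observed` list below is made-up illustrative data, not results from the question, and `percentile_of` is a hypothetical helper name. `NormalDist` is in the standard library from Python 3.8.

```python
from statistics import NormalDist, mean, stdev


def percentile_of(error_sum, observed_sums):
    """Fraction of the fitted normal distribution lying below error_sum."""
    model = NormalDist(mean(observed_sums), stdev(observed_sums))
    return model.cdf(error_sum)


# Hypothetical error sums from earlier comparisons (illustration only).
observed = [1.5, 2.5, 2.0, 3.0, 1.0, 2.5, 2.0, 3.5]

for candidate_sum in (1.0, 2.25, 3.5):
    p = percentile_of(candidate_sum, observed)
    # A value near 1 means the error is larger than almost all such vectors;
    # a value near 0 means it is smaller than almost all of them.
    print(f"sum {candidate_sum}: larger than about {p:.0%} of vectors")
```

A sum with `percentile_of(...) > 0.95` would be the "bright red" case, and one below 0.01 the "A+" case.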
This wiki post may help too.
Paul
Mean Squared Error is often used in engineering circles to quantify error between a solution and an estimate of the solution.
To avoid problems with a large variance in the values, consider using log(error). Of course, this has its own issues: log(0) is -infinity, and for 0 < error < 1 the log is negative.
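A small Python sketch of both ideas, using the arrays from the question. The function names are illustrative. Note that `log_error` uses `log(1 + error)` via `math.log1p`, which is a substitution on my part to sidestep the log(0) and negative-value issues mentioned above, not something the answer prescribes.

```python
import math


def mean_squared_error(static, candidate):
    """Average of squared differences; assumes non-empty, equal-length inputs."""
    return sum((s - c) ** 2 for s, c in zip(static, candidate)) / len(static)


def log_error(error):
    """log(1 + error): an assumed workaround, 0 for a perfect match and never negative."""
    return math.log1p(error)


S = [56, 53, 50, 64]
A = [56, 54, 52, 64]
B = [54, 52, 51, 63]

print(mean_squared_error(S, A))  # 1.25
print(mean_squared_error(S, B))  # 1.75
print(log_error(0))              # 0.0
```

Squaring penalises one large miss more heavily than several small ones, which is a different ranking rule from the plain sum of absolute differences.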