Discounted cumulative gain: dcg_score in sklearn

Posted 2025-02-11 20:59:16

from sklearn.metrics import ndcg_score, dcg_score
import numpy as np
  
actual = [3, 2, 0, 0, 1]
ideal = sorted(actual, reverse=True)

# Convert the lists to 2D arrays of shape (1, n_items)
actualarr = np.asarray([actual])
idealarr = np.asarray([ideal])
print("actual score as array", actualarr)
print("ideal score as array", idealarr)

# Discounted Cumulative Gain
dcg = dcg_score(idealarr, actualarr)
print("DCG: ", dcg)

I don't understand why dcg_score takes y_score as a parameter. When I work out DCG longhand (the sum of relevance/log2(i+1) over positions i), I get the same answer, ~4.6, using only the true scores [3,2,0,0,1]. So why does the function also require the ideal scores [3,2,1,0,0]?

Comments (1)

鸢与 2025-02-18 20:59:16

As I understand it, sklearn.metrics.dcg_score computes its sum over the values in y_true, taken as if they had been reordered according to y_score.

As explained inside the code: "Sum the true scores ranked in the order induced by the predicted scores"

This means the metric is computed on the induced ranking, using true relevance values.
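The induced-ranking idea can be sketched by hand with NumPy. This is a minimal sketch rather than sklearn's implementation: the helper name dcg_by_induced_ranking is hypothetical, it assumes a linear gain with a log2 discount, and it assumes there are no tied scores.

```python
import numpy as np

def dcg_by_induced_ranking(y_true, y_score):
    """Hypothetical helper: DCG of y_true in the order induced by y_score.

    Assumes linear gain, a log2 discount, and no tied scores.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_score = np.asarray(y_score, dtype=float)
    # Indices that sort the predictions from highest to lowest score.
    order = np.argsort(-y_score, kind="stable")
    # True relevance values, read off in the predicted order.
    ranked_gains = y_true[order]
    # The item at 0-based position i is discounted by log2(i + 2).
    discounts = np.log2(np.arange(len(ranked_gains)) + 2)
    return float(np.sum(ranked_gains / discounts))

# Same ordering, different score magnitudes: the result does not change.
a = dcg_by_induced_ranking([1, 0], [0, 1])
b = dcg_by_induced_ranking([1, 0], [0.1, 0.2])
print(f"{a:.2f} {b:.2f}")
```

Only the ordering of y_score enters the sum here; its magnitudes never do.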

A small example:

import numpy as np
from sklearn.metrics import dcg_score

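def naive_dcg(y_score):
    # Naive version: treats the values in y_score themselves as gains,
    # instead of using them only to rank the true relevance values.
    score = 0
    for i, n in enumerate(y_score[0]):
        num = 2**n - 1          # exponential gain of the item at position i
        den = np.log2(i + 2)    # log2 discount for 0-based position i
        score += num / den
    return score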

y_true = [[1,0]]
y_score = [[0,1]]

print(f'sklearn: {dcg_score(y_true,y_score):.2}, naive: {naive_dcg(y_score):.2}')

y_score = [[0.1,0.2]]

print(f'sklearn: {dcg_score(y_true,y_score):.2}, naive: {naive_dcg(y_score):.2}')

outputs:

sklearn: 0.63, naive: 0.63
sklearn: 0.63, naive: 0.17

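which shows that naive_dcg produces a different value for the same ranking order: it depends on the magnitudes in y_score, whereas dcg_score depends only on the ordering they induce.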
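Applied to the question's data, this suggests passing the actual relevance values as y_true and letting y_score describe the ranking being evaluated. The sketch below assumes the list [3,2,0,0,1] is already in ranked order, so a made-up, strictly decreasing y_score (the variable position_scores, which is not from the original post) should reproduce the longhand sum. The ideal ordering is then only needed for normalization, which is what ndcg_score is for.

```python
import numpy as np
from sklearn.metrics import dcg_score, ndcg_score

# True relevance of each item, listed in the order the items were ranked.
actual = np.asarray([[3, 2, 0, 0, 1]])

# Assumed scores: strictly decreasing, so the induced ranking is the list order.
position_scores = np.asarray([[5, 4, 3, 2, 1]])

# Longhand DCG: sum of relevance / log2(position + 1), positions starting at 1.
positions = np.arange(1, actual.shape[1] + 1)
longhand = float(np.sum(actual[0] / np.log2(positions + 1)))

dcg = dcg_score(actual, position_scores)
ndcg = ndcg_score(actual, position_scores)

print(f"longhand: {longhand:.4f}, dcg_score: {dcg:.4f}, ndcg_score: {ndcg:.4f}")
```

With real model output, position_scores would be replaced by the predicted scores, and the sorted "ideal" list would not need to be built by hand.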
