LightFM model: internal scores and the sigmoid function

Published 2025-02-12 17:05:52

I have two questions related to the LightFM model:

  1. I read the article about the model and I see that it uses a sigmoid f(.) function. I also checked the library's Cython code and I see that the function is implemented there as well. However, the model is applicable to ranking items in the rating setting (ratings from 1 to 5). Why isn't the sigmoid harming the ranking? It returns values between 0 and 1, so why does the model still work for ratings?
  2. Am I correct that the score the model returns is q_u * p_i + b_u + b_i (see the article)? If not, how can I calculate the scores myself? Where do they come from, and why is their magnitude so high? I get scores ranging roughly from -100000 to +100000.

UPD1: I followed the comments and found out the following function:

cdef inline flt compute_prediction_from_repr(flt *user_repr,
                                             flt *item_repr,
                                             int no_components) nogil:

    cdef int i
    cdef flt result

    # Biases
    result = user_repr[no_components] + item_repr[no_components]

    # Latent factor dot product
    for i in range(no_components):
        result += user_repr[i] * item_repr[i]

    return result

It seems the scores are indeed given by the formula above, but it would be helpful if someone could also take a look - I'm not very good with Cython.
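For anyone who would rather not read Cython, the function above translates directly to plain Python/NumPy. This is just an illustrative re-implementation of the arithmetic, not LightFM code; in the real library the user/item representations can be retrieved with `get_user_representations()` / `get_item_representations()`. The convention below (latent factors first, bias as the last entry) mirrors the Cython function:

```python
import numpy as np

def compute_prediction_from_repr(user_repr, item_repr, no_components):
    """Python translation of LightFM's Cython scoring function.

    user_repr / item_repr are arrays of length no_components + 1:
    the first no_components entries are the latent factors,
    the last entry is the bias term.
    """
    # Biases: b_u + b_i
    result = user_repr[no_components] + item_repr[no_components]
    # Latent factor dot product: q_u . p_i
    for i in range(no_components):
        result += user_repr[i] * item_repr[i]
    return result

def score(user_repr, item_repr):
    """Vectorized equivalent: q_u . p_i + b_u + b_i."""
    return user_repr[:-1] @ item_repr[:-1] + user_repr[-1] + item_repr[-1]

rng = np.random.default_rng(0)
u = rng.normal(size=11)  # 10 latent components + 1 bias
v = rng.normal(size=11)
assert np.isclose(compute_prediction_from_repr(u, v, 10), score(u, v))
```

Note that nothing bounds this raw score, which is consistent with the large magnitudes mentioned in the question.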

UPD2: the sigmoid is used only for the logistic variant of the model. It's not used if you train with WARP.
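A quick illustration of why this distinction doesn't affect ranking (plain Python, not LightFM code): the sigmoid is strictly monotonic, so even in the logistic variant it only squashes the unbounded raw score into (0, 1) and never changes the ordering of items:

```python
import math

def sigmoid(x):
    """Squash an unbounded raw score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Monotonicity: a higher raw score always maps to a higher probability,
# so rankings induced by raw scores and by sigmoid(scores) are identical.
raw_scores = [-100.0, -1.0, 0.0, 1.0, 100.0]
squashed = [sigmoid(s) for s in raw_scores]
assert squashed == sorted(squashed)
assert sigmoid(0.0) == 0.5
```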


Answers (1)

梦境 · answered 2025-02-19 17:05:52


The model still works for ratings despite the sigmoid because LightFM binarizes the recommendation problem.

For ratings from 1 to 5, with 5 being the highest:

  • ratings 4 and 5 indicate the user is interested in the item -> positive
  • ratings from 1 to 3 indicate the user is not interested in the item -> negative

This is the reason model performance is reported using an AUC score.
For an individual user, AUC corresponds to the probability that a randomly chosen positive item will be ranked higher than a randomly chosen negative item.
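That per-user definition of AUC can be sketched directly as a pairwise comparison. This is a plain NumPy illustration of the definition, not the implementation LightFM's `auc_score` uses:

```python
import numpy as np

def per_user_auc(pos_scores, neg_scores):
    """Probability that a randomly chosen positive item outscores
    a randomly chosen negative item (ties count as half a win)."""
    pos = np.asarray(pos_scores, dtype=float)[:, None]  # column vector
    neg = np.asarray(neg_scores, dtype=float)[None, :]  # row vector
    # Broadcasting yields every (positive, negative) pair.
    wins = (pos > neg).sum() + 0.5 * (pos == neg).sum()
    return wins / (pos.size * neg.size)

# Every positive outscores every negative -> perfect AUC of 1.0
assert per_user_auc([3.0, 2.0], [1.0, 0.0]) == 1.0
# Indistinguishable scores -> chance-level AUC of 0.5
assert per_user_auc([1.0], [1.0]) == 0.5
```

Because only pairwise orderings matter, the unbounded magnitude of the raw scores is irrelevant to this metric.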

In my case I applied the WARP loss and use the WARP score as an indicator of how close an item is to the user in feature space, i.e. how likely it is to be liked by the user. For probabilistic scores or ratings prediction, other, more sophisticated models may be considered.
