LightFM model: the internal scores and the sigmoid function
I have two questions related to the LightFM model:
- I read the article about the model and saw that it uses a sigmoid function f(.). I also checked the library's Cython code and the function is implemented there as well. However, the model can be used to rank items in a rating setting (ratings from 1 to 5). Why doesn't the sigmoid harm the ranking? It returns a value between 0 and 1, so why does the model still work for ratings?
- Am I correct that the scores the model returns are q_u * p_i + b_u + b_i (see the article)? If not, how can I calculate the scores myself? Where do they come from, and why is their magnitude so large? I get scores ranging from approximately -100000 to +100000.
UPD1: I followed the comments and found the following function:
    cdef inline flt compute_prediction_from_repr(flt *user_repr,
                                                 flt *item_repr,
                                                 int no_components) nogil:
        cdef int i
        cdef flt result
        # Biases
        result = user_repr[no_components] + item_repr[no_components]
        # Latent factor dot product
        for i in range(no_components):
            result += user_repr[i] * item_repr[i]
        return result
It seems the scores are indeed given by the formula above, but it would be helpful if someone could double-check; I'm not very good with Cython.
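For anyone who reads Python more easily than Cython, here is a minimal pure-Python sketch of the same computation. The vectors are made up for illustration, and the layout (latent factors first, bias in the slot at index `no_components`) is taken from the Cython function above. In the Python API the biases and embeddings should be available through `model.get_user_representations()` and `model.get_item_representations()`, but this sketch does not depend on LightFM being installed.

```python
def compute_prediction(user_repr, item_repr, no_components):
    """Pure-Python rendering of compute_prediction_from_repr."""
    # Biases: stored in the slot right after the latent factors.
    result = user_repr[no_components] + item_repr[no_components]
    # Latent factor dot product.
    for i in range(no_components):
        result += user_repr[i] * item_repr[i]
    return result


# Hypothetical representations: two latent factors followed by one bias.
user_repr = [0.5, -1.0, 0.25]
item_repr = [2.0, 0.5, 0.5]

score = compute_prediction(user_repr, item_repr, no_components=2)
print(score)  # 1.25 = (0.25 + 0.5) + (0.5 * 2.0) + (-1.0 * 0.5)
```

Nothing in this function bounds the result, so if the learned factors or biases grow large, the score can become very large as well.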
UPD2: sigmoid is used only for the logistic variant of the model. It's not used if you try WARP.
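A small check of why a sigmoid would not hurt ranking even where it is applied: the function is strictly increasing, so sorting by the raw score and sorting by the sigmoid of the score give the same order. The item names and scores below are invented for illustration.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


# Hypothetical raw scores for four items and a single user.
raw_scores = {"item_a": 2.3, "item_b": -0.7, "item_c": 0.4, "item_d": -5.1}

by_raw = sorted(raw_scores, key=raw_scores.get, reverse=True)
by_sigmoid = sorted(raw_scores, key=lambda k: sigmoid(raw_scores[k]), reverse=True)

print(by_raw)
print(by_raw == by_sigmoid)
```

One caveat: in floating point the sigmoid saturates at 0 or 1 for scores of very large magnitude, so distinct raw scores can end up tied after the transform. Ranking on the raw score avoids that.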
Comments (1)
The model works for ratings even with the sigmoid because LightFM binarizes the recommendation problem.
For ratings between 1 to 5 with 5 being the highest,
This is why model performance is reported using an AUC score.
For an individual user, AUC corresponds to the probability that a randomly chosen positive item will be ranked higher than a randomly chosen negative item.
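To make that definition concrete, here is a minimal sketch that computes the per-user AUC by brute force over all positive/negative pairs. The scores are made up, and counting a tie as half a win is a common convention that I am assuming here. LightFM ships its own `lightfm.evaluation.auc_score`, which is what you would normally use; this is only meant to illustrate the definition.

```python
def user_auc(positive_scores, negative_scores):
    """Fraction of (positive, negative) pairs where the positive item scores higher."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # assumption: a tie counts as half a win
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical model scores for one user.
positives = [3.1, 0.2]          # items the user interacted with
negatives = [1.5, -0.4, -2.0]   # items the user did not interact with

auc = user_auc(positives, negatives)
print(auc)  # 5 of the 6 pairs are ordered correctly
```

Only the order of the scores matters here, which is why their absolute magnitude does not affect the metric.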
In my case I applied the WARP loss and use the WARP score as an indicator of how close an item is to the user in feature space, and therefore how likely the user is to like it. For a probabilistic score or rating prediction, other, more sophisticated models may be considered.