Cosine Similarity Code (non-term vectors)
I am trying to find the cosine similarity between two vectors (lists of x,y points), and I am making some silly error that I cannot pin down. Pardon me, I am a newbie, and sorry if the mistake is a very simple one (which it very likely is).
Thanks for your help.
public static double GetCosineSimilarity(List<Point> V1, List<Point> V2)
{
    double sim = 0.0d;
    int N = 0;
    N = ((V2.Count < V1.Count) ? V2.Count : V1.Count);
    double dotX = 0.0d; double dotY = 0.0d;
    double magX = 0.0d; double magY = 0.0d;
    for (int n = 0; n < N; n++)
    {
        dotX += V1[n].X * V2[n].X;
        dotY += V1[n].Y * V2[n].Y;
        magX += Math.Pow(V1[n].X, 2);
        magY += Math.Pow(V1[n].Y, 2);
    }
    return (dotX + dotY)/(Math.Sqrt(magX) * Math.Sqrt(magY));
}
Edit: Apart from the syntax, my question is also about the logical construct, given that I am dealing with vectors of differing lengths. Also, how can the above be generalized to vectors of m dimensions? Thanks.
If you are in 2 dimensions, then you can have the vectors represented as (V1.X, V1.Y) and (V2.X, V2.Y), then use the standard cosine similarity formula: the dot product of the two vectors divided by the product of their magnitudes.
If you are in higher dimensions, then you can represent each vector as a List<double>. So, in 4 dimensions the first vector would have components V1 = (V1[0], V1[1], V1[2], V1[3]).
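The m-dimensional case described above can be sketched roughly as follows. This is only an illustration, not the answerer's original code: the class name VectorMath and the parameter names are made up, and rejecting vectors of differing lengths (rather than truncating to the shorter one) is an assumption, made because cosine similarity is only defined for vectors of the same dimension.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; the names are illustrative only.
public static class VectorMath
{
    // Cosine similarity of two m-dimensional vectors given as component lists.
    public static double CosineSimilarity(IReadOnlyList<double> v1, IReadOnlyList<double> v2)
    {
        // Assumption: vectors of differing dimension are treated as an error.
        if (v1.Count != v2.Count)
            throw new ArgumentException("Vectors must have the same number of components.");

        double dot = 0.0, mag1 = 0.0, mag2 = 0.0;
        for (int i = 0; i < v1.Count; i++)
        {
            dot += v1[i] * v2[i];   // running dot product
            mag1 += v1[i] * v1[i];  // squared magnitude of v1
            mag2 += v2[i] * v2[i];  // squared magnitude of v2
        }

        // The ratio is undefined when either vector has zero length.
        if (mag1 == 0.0 || mag2 == 0.0)
            throw new ArgumentException("Cosine similarity is undefined for a zero vector.");

        return dot / (Math.Sqrt(mag1) * Math.Sqrt(mag2));
    }
}
```

A 2-dimensional vector is then just a two-element list, for example VectorMath.CosineSimilarity(new double[] { 1.0, 0.0 }, new double[] { 0.0, 1.0 }) for two perpendicular vectors. Note that each magnitude is accumulated from its own vector's components.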
Update Dec 30, 2023
As pointed out by Bellarmine Head, the latest version of the Microsoft.SemanticKernel.Core NuGet package (version 1.0.1) no longer has a CosineSimilarity function. You can use the static method TensorPrimitives.CosineSimilarity from Microsoft's System.Numerics.Tensors NuGet package instead. It accepts two read-only spans (ReadOnlySpan<float> x, ReadOnlySpan<float> y).
Using float arrays, the usage looks something like this:
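A minimal sketch of such a call, assuming the System.Numerics.Tensors NuGet package is referenced by the project. The class name and the sample values are hypothetical and only there for illustration:

```csharp
using System;
using System.Numerics.Tensors; // from the System.Numerics.Tensors NuGet package

// Hypothetical wrapper class; only TensorPrimitives.CosineSimilarity comes from the package.
public static class TensorCosineExample
{
    public static float Compare()
    {
        // Made-up sample vectors of equal length; b is a scaled copy of a.
        float[] a = { 1f, 2f, 3f };
        float[] b = { 2f, 4f, 6f };

        // float[] converts implicitly to ReadOnlySpan<float>.
        // Because the vectors point in the same direction, the result is close to 1.
        return TensorPrimitives.CosineSimilarity(a, b);
    }
}
```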
The closer the result is to 1, the more similar the two items are.
Older post
Microsoft has an extension method called CosineSimilarity in the Microsoft.SemanticKernel.Core NuGet package (at version 1.0.0-beta1 at the time of writing). It has three overloads:
Using float arrays, the usage looks something like this:
The closer the result is to 1, the more similar the two items are.
The last line should be