LevenshteinDistance - Commons Lang 3.0 API
With the Commons Lang API I can calculate the similarity between two strings using the Levenshtein distance (StringUtils.getLevenshteinDistance). The result is the number of single-character changes needed to turn one string into the other. I would like the result to fall in the range 0 to 1 instead, which would make it easier to judge how similar the strings are: the closer the result is to 0, the more similar the strings. Is that possible?
Below is the example I'm using:
import org.apache.commons.lang3.StringUtils;

public class TesteLevenstein {
    public static void main(String[] args) {
        // Raw edit distances: the number of single-character edits between each pair
        int distance1 = StringUtils.getLevenshteinDistance("Boat", "Coat");
        int distance2 = StringUtils.getLevenshteinDistance("Remember", "Alamo");
        int distance3 = StringUtils.getLevenshteinDistance("Steve", "Stereo");
        System.out.println("distance(Boat, Coat): " + distance1);
        System.out.println("distance(Remember, Alamo): " + distance2);
        System.out.println("distance(Steve, Stereo): " + distance3);
    }
}
Thanks!
Comments (1)
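Just divide by some number. The question is which number? The natural choice is the maximum possible distance for the given pair of strings, which is the length of the longer string (i.e. every character of the shorter string is different, and the remaining characters of the longer one have to be added). Dividing the distance by that length gives a value between 0 and 1, where 0 means the strings are identical.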
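As a rough sketch of that idea, the snippet below divides the edit distance by the length of the longer string. To keep it runnable without any dependency, it uses a small hand-written levenshtein helper as a stand-in for StringUtils.getLevenshteinDistance; the class name NormalizedLevenshtein, the method normalizedDistance, and the choice to return 0.0 for two empty strings are assumptions made for illustration, not part of Commons Lang.

```java
public class NormalizedLevenshtein {

    // Hypothetical stand-in for StringUtils.getLevenshteinDistance:
    // a plain two-row dynamic-programming edit distance.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j chars of b takes j inserts
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i chars of a into "" takes i deletes
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows instead of allocating
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Distance scaled into [0, 1]: 0 = identical, 1 = nothing in common.
    static double normalizedDistance(String a, String b) {
        int maxLen = Math.max(a.length(), b.length());
        if (maxLen == 0) {
            return 0.0; // assumption: two empty strings count as identical
        }
        return (double) levenshtein(a, b) / maxLen;
    }

    public static void main(String[] args) {
        System.out.println("normalized(Boat, Coat): " + normalizedDistance("Boat", "Coat")); // 1 edit / 4 chars = 0.25
        System.out.println("normalized(Remember, Alamo): " + normalizedDistance("Remember", "Alamo"));
        System.out.println("normalized(Steve, Stereo): " + normalizedDistance("Steve", "Stereo"));
    }
}
```

With Commons Lang on the classpath, the helper could presumably be swapped for StringUtils.getLevenshteinDistance(a, b) and the division kept as is. If a similarity score is preferred (1 = identical), 1.0 - normalizedDistance(a, b) would give that.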