Jaro-Winkler string comparison function in SAS
Is there an implementation of the Jaro-Winkler string comparison in SAS?
It looks like Link King has Jaro-Winkler, but I'd prefer the flexibility of calling the function myself.
Thanks!
There is no built-in function for Jaro-Winkler distance that I am aware of. @Itzy has already referenced the only ones that I know of. You can roll your own functions with proc fcmp, though, if you feel up to it. I'll even give you a head start with the code below. I just tried to follow the Wikipedia article on it. It certainly isn't close to being a perfect representation of Bill Winkler's strcmp.c file by any means and likely has lots of bugs.
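For anyone who wants something to check a proc fcmp port against, here is a minimal reference sketch of the basic Jaro-Winkler similarity as it is usually described (matching window, transposition count, common-prefix bonus). It is written in Python rather than SAS, and it is neither the code from this answer nor a port of Winkler's strcmp.c, which applies further adjustments that are not implemented here. The names jaro, jaro_winkler, prefix_scale and max_prefix are illustrative; the 0.1 scaling factor and 4-character prefix cap are assumed as the commonly used defaults.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means the strings are identical."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Two characters can match only if their positions differ by at most `window`.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Matched characters that line up in a different order are transpositions;
    # each out-of-order pair counts once, hence the division by two.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2

    return (matches / len1
            + matches / len2
            + (matches - transpositions) / matches) / 3.0


def jaro_winkler(s1: str, s2: str,
                 prefix_scale: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for strings that share a common prefix."""
    sim = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return sim + prefix * prefix_scale * (1.0 - sim)


if __name__ == "__main__":
    print(round(jaro_winkler("MARTHA", "MARHTA"), 3))   # 0.961
    print(round(jaro_winkler("DWAYNE", "DUANE"), 3))    # 0.84
    print(round(jaro_winkler("DIXON", "DICKSONX"), 3))  # 0.813
```

The three pairs in the usage lines are the textbook examples for the basic algorithm. A SAS port should reproduce them before you move on to real data. Note that the comparison is case-sensitive and does not trim blanks, so in SAS you would probably want to strip and upcase the arguments first.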
I don't think so. It can do the Levenshtein distance (the complev function) or a generalized edit distance (compged), but I haven't seen any other edit distance functions. If you're dead set on doing this in SAS, you could write a program in PROC IML.
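To make the contrast with Jaro-Winkler concrete, here is a minimal sketch of plain Levenshtein distance: the minimum number of single-character insertions, deletions and substitutions needed to turn one string into the other. It is in Python, not SAS, and it makes no claim about how complev is implemented or which options it accepts; compged's weighted operation costs are not shown. The function name levenshtein is illustrative.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning `a` into `b`."""
    # prev[j] holds the distance between the processed prefix of `a`
    # and the first j characters of `b`.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute, or keep if equal
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))  # 3
    print(levenshtein("MARTHA", "MARHTA"))   # 2
```

Levenshtein returns an unbounded edit count, where Jaro-Winkler returns a similarity between 0 and 1. A swapped pair of letters such as TH/HT costs two edits here, but Jaro-Winkler penalizes it only lightly, which is one reason it is popular for matching names.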
I modified and corrected cmjohns' code. Thanks to him/her for starting me off. Winkler published some examples in his paper: Winkler, W. E. (2006). "Overview of Record Linkage and Current Research Directions". Research Report Series, RRS (see Table 6). I used the examples to test my code.