CoffeeScript 中的编辑距离公式?
我正在尝试创建或查找 Levenshtein Distance 公式(又名编辑距离)的 CoffeeScript 实现。这是我到目前为止所拥有的,任何帮助将不胜感激。
levenshtein = (s1,s2) ->
n = s1.length
m = s2.length
if n < m
return levenshtein(s2, s1)
if not s1
return s2.length
previous_row = [s2.length + 1]
for c1, i in s1
current_row = [i + 1]
for c2, j in s2
insertions = previous_row[j + 1] + 1
deletions = current_row[j] + 1
substitutions = previous_row[j] # is this unnescessary?-> (c1 != c2)
current_row.push(Math.min(insertions,deletions,substitutions))
previous_row = current_row
return previous_row[previous_row.length-1]
#End Levenshetein Function
顺便说一句:我知道这段代码在很多层面上都是错误的,我很高兴收到任何和所有建设性的批评。只是想改进,并找出这个公式!
CodeEdit1:修补了 Trevor 指出的错误,上面的当前代码包括这些更改
更新:我要问的问题是 - 我们如何在 CoffeeScript 中进行 Levenshtein?
以下是编辑距离算法的“步骤”,可帮助您了解我想要实现的目标。
步骤 1
将 n 设为 s 的长度。 将 m 设置为 t 的长度。 如果 n = 0,则返回 m 并退出。 如果 m = 0,则返回 n 并退出。 构造一个包含 0..m 行和 0..n 列的矩阵。
2
将第一行初始化为 0..n。 将第一列初始化为 0..m。
3 检查 s 的每个字符(i 从 1 到 n)。
4 检查 t(j 从 1 到 m)的每个字符。
5 如果 s[i] 等于 t[j],则成本为 0。 如果 s[i] 不等于 t[j],则成本为 1。
6 将矩阵的单元格 d[i,j] 设置为等于以下最小值: 一个。紧邻上方的单元格加 1:d[i-1,j] + 1。 b.紧邻左侧的单元格加 1:d[i,j-1] + 1。 c.左上方对角的单元格加上成本:d[i-1,j-1] + 成本。
7 迭代步骤 (3, 4, 5, 6) 完成后,在单元格 d[n,m] 中找到距离。
来源:http://www.merriampark.com/ld.htm
I am trying to create or find a CoffeeScript implementation of the Levenshtein Distance formula, aka Edit Distance. Here is what I have so far, any help at all would be much appreciated.
levenshtein = (s1,s2) ->
n = s1.length
m = s2.length
if n < m
return levenshtein(s2, s1)
if not s1
return s2.length
previous_row = [s2.length + 1]
for c1, i in s1
current_row = [i + 1]
for c2, j in s2
insertions = previous_row[j + 1] + 1
deletions = current_row[j] + 1
substitutions = previous_row[j] # is this unnescessary?-> (c1 != c2)
current_row.push(Math.min(insertions,deletions,substitutions))
previous_row = current_row
return previous_row[previous_row.length-1]
#End Levenshetein Function
Btw: I know this code is wrong on many levels, I am happy to receive any and all constructive criticism. Just looking to improve, and figure out this formula!
CodeEdit1: Patched up the errors Trevor pointed out, current code above includes those changes
Update: The question I am asking is - how do we do Levenshtein in CoffeeScript?
Here is the 'steps' for the Levenshtein Distance Algorithm to help you see what I am trying to accomplish.
Steps
1
Set n to be the length of s.
Set m to be the length of t.
If n = 0, return m and exit.
If m = 0, return n and exit.
Construct a matrix containing 0..m rows and 0..n columns.
2
Initialize the first row to 0..n.
Initialize the first column to 0..m.
3 Examine each character of s (i from 1 to n).
4 Examine each character of t (j from 1 to m).
5 If s[i] equals t[j], the cost is 0.
If s[i] doesn't equal t[j], the cost is 1.
6 Set cell d[i,j] of the matrix equal to the minimum of:
a. The cell immediately above plus 1: d[i-1,j] + 1.
b. The cell immediately to the left plus 1: d[i,j-1] + 1.
c. The cell diagonally above and to the left plus the cost: d[i-1,j-1] + cost.
7 After the iteration steps (3, 4, 5, 6) are complete, the distance is found in cell d[n,m].
source:http://www.merriampark.com/ld.htm
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
此页面(从您提到的资源链接到)提供了 Levenshtein 距离算法的 JavaScript 实现。基于此和您发布的代码,这是我的 CoffeeScript 版本:
它似乎可以通过简单测试,但如果有任何问题,请告诉我。
This page (linked to from the resource you mentioned) offers a JavaScript implementation of the Levenshtein distance algorithm. Based on both that and the code you posted, here's my CoffeeScript version:
It seems to hold up to light testing, but let me know if there are any problems.