详细字间距

发布于 2024-10-21 03:39:03 字数 159 浏览 7 评论 0原文

我将如何显示单词之间的详细距离。 例如,程序的输出可能是:

Words are "car" and "cure":
Replace "a" with "u".
Add "e".

Levenshtein 距离不能满足我的需求(我认为)。

How would I go about displaying detailed distance between words.
For example, the output of the program could be:

Words are "car" and "cure":
Replace "a" with "u".
Add "e".

The Levenshtein distance does not fulfill my needs (I think).

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

ι不睡觉的鱼゛ 2024-10-28 03:39:03

尝试以下操作。该算法大致遵循维基百科(Levenshtein 距离)。下面使用的语言以ruby

为例,将s改为t的情况如下:

s = 'Sunday'
t = 'Saturday'

首先,s和t转为数组,并在开头插入一个空字符串。 m 最终将是算法中使用的矩阵。

s = ['', *s.split('')]
t = ['', *t.split('')]
m = Array.new(s.length){[]}

然而,这里的 m 与维基百科中的算法给出的矩阵不同,因为每个单元不仅包含 Levenshtein 距离,还包含(非)操作(开始不执行任何操作删除插入替换)用于从相邻(左、上或左上)单元格到达该单元格。它还可能包括描述操作参数的字符串。即每个单元格的格式为:

[Levenshtein distance, operation(, string)]

这里是主例程。它按照算法填充 m 的单元格:

s.each_with_index{|a, i| t.each_with_index{|b, j|
    m[i][j] =
    if i.zero?
        [j, "started"]
    elsif j.zero?
        [i, "started"]
    elsif a == b
        [m[i-1][j-1][0], "did nothing"]
    else
        del, ins, subs = m[i-1][j][0], m[i][j-1][0], m[i-1][j-1][0]
        case [del, ins, subs].min
        when del
            [del+1, "deleted", "'#{a}' at position #{i-1}"]
        when ins
            [ins+1, "inserted", "'#{b}' at position #{j-1}"]
        when subs
            [subs+1, "substituted", "'#{a}' at position #{i-1} with '#{b}'"]
        end
    end
}}

现在,我们将 ij 设置为 m 的右下角 并按照步骤向后执行,将单元格的内容取消移动到名为 steps 的数组中,直到到达开头。

i, j = s.length-1, t.length-1
steps = []
loop do
    case m[i][j][1]
    when "started"
        break
    when "did nothing", "substituted"
        steps.unshift(m[i-=1][j-=1])
    when "deleted"
        steps.unshift(m[i-=1][j])
    when "inserted"
        steps.unshift(m[i][j-=1])
    end
end

然后我们打印操作和每个步骤的字符串,除非那是非操作。

steps.each do |d, op, str=''|
    puts "#{op} #{str}" unless op == "did nothing" or op == "started"
end

对于这个特定的例子,它将输出:

inserted 'a' at position 1
inserted 't' at position 2
substituted 'n' at position 2 with 'r'

Try the following. The algorithm is roughly following Wikipedia (Levenshtein distance). The language used below is ruby

Use as an example, the case of changing s into t as follows:

s = 'Sunday'
t = 'Saturday'

First, s and t are turned into arrays, and an empty string is inserted at the beginning. m will eventually be the matrix used in the argorithm.

s = ['', *s.split('')]
t = ['', *t.split('')]
m = Array.new(s.length){[]}

m here, however, is different from the matrix given if the algorithm in wikipedia for the fact that each cell includes not only the Levenshtein distance, but also the (non-)operation (starting, doing nothing, deletion, insertion, or substitution) that was used to get to that cell from an adjacent (left, up, or upper-left) cell. It may also include a string describing the parameters of the operation. That is, the format of each cell is:

[Levenshtein distance, operation(, string)]

Here is the main routine. It fills in the cells of m following the algorithm:

s.each_with_index{|a, i| t.each_with_index{|b, j|
    m[i][j] =
    if i.zero?
        [j, "started"]
    elsif j.zero?
        [i, "started"]
    elsif a == b
        [m[i-1][j-1][0], "did nothing"]
    else
        del, ins, subs = m[i-1][j][0], m[i][j-1][0], m[i-1][j-1][0]
        case [del, ins, subs].min
        when del
            [del+1, "deleted", "'#{a}' at position #{i-1}"]
        when ins
            [ins+1, "inserted", "'#{b}' at position #{j-1}"]
        when subs
            [subs+1, "substituted", "'#{a}' at position #{i-1} with '#{b}'"]
        end
    end
}}

Now, we set i, j to the bottom-right corner of m and follow the steps backwards as we unshift the contents of the cell into an array called steps, until we reach the start.

i, j = s.length-1, t.length-1
steps = []
loop do
    case m[i][j][1]
    when "started"
        break
    when "did nothing", "substituted"
        steps.unshift(m[i-=1][j-=1])
    when "deleted"
        steps.unshift(m[i-=1][j])
    when "inserted"
        steps.unshift(m[i][j-=1])
    end
end

Then we print the operation and the string of each step unless that is a non-operation.

steps.each do |d, op, str=''|
    puts "#{op} #{str}" unless op == "did nothing" or op == "started"
end

With this particular example, it will output:

inserted 'a' at position 1
inserted 't' at position 2
substituted 'n' at position 2 with 'r'
同尘 2024-10-28 03:39:03
class Solution:
   def solve(self, text, word0, word1):
      word_list = text.split()
      ans = len(word_list)
      L = None
      for R in range(len(word_list)):
         if word_list[R] == word0 or word_list[R] == word1:
            if L is not None and word_list[R] != word_list[L]:
               ans = min(ans, R - L - 1)
               L = R
      return -1 if ans == len(word_list) else ans
ob = Solution()
text = "cat dog abcd dog cat cat abcd dog wxyz"
word0 = "abcd"
word1 = "wxyz"
print(ob.solve(text, word0, word1))
class Solution:
   def solve(self, text, word0, word1):
      word_list = text.split()
      ans = len(word_list)
      L = None
      for R in range(len(word_list)):
         if word_list[R] == word0 or word_list[R] == word1:
            if L is not None and word_list[R] != word_list[L]:
               ans = min(ans, R - L - 1)
               L = R
      return -1 if ans == len(word_list) else ans
ob = Solution()
text = "cat dog abcd dog cat cat abcd dog wxyz"
word0 = "abcd"
word1 = "wxyz"
print(ob.solve(text, word0, word1))
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文