Algorithm to determine whether to insert, replace, or delete a character (similar to Levenshtein)

Posted 2025-01-24 10:58:57, 184 words, 2 views, 0 comments

I want to write a function that keeps track of the transformations needed to make one string identical to another.
Example:
A = batyu
B = beauty
diff(A, B) has to return:
[[1, "Insert", "e"], [5, "Delete"], [3, "Insert", "u"]]

I have used Levenshtein.editops, but I want to code the function that does this myself.

Comments (2)

凑诗 2025-01-31 10:58:57

The Wikipedia article on Levenshtein distance gives you the function it uses. Now it's your turn to implement it in Python.

If you have code that does not do what you expect, feel free to post another question detailing what you tried, what you expected, and why it didn't work.

If you can read C you can also check out the implementation of editops.
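
As a starting point, here is a minimal sketch of that approach: fill the dynamic-programming table described in the Wikipedia article, then walk back through it to recover one optimal edit script. The function name edit_ops and the (operation, index in a, index in b) tuple layout are assumptions made for illustration, loosely modeled on what Levenshtein.editops reports; this is not the library's actual implementation.

```python
def edit_ops(a, b):
    """Return one minimal edit script as (operation, index_in_a, index_in_b) tuples.

    Hypothetical helper: a Levenshtein distance table plus a backtrace.
    """
    m, n = len(a), len(b)
    # dist[i][j] holds the edit distance between a[:i] and b[:j].
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dist[i][0] = i  # delete every character of a[:i]
    for j in range(n + 1):
        dist[0][j] = j  # insert every character of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # delete a[i-1]
                dist[i][j - 1] + 1,         # insert b[j-1]
                dist[i - 1][j - 1] + cost,  # keep or replace
            )

    # Walk back from the bottom-right corner, recording each non-equal step.
    ops = []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and a[i - 1] == b[j - 1] and dist[i][j] == dist[i - 1][j - 1]:
            i, j = i - 1, j - 1  # characters match, nothing to record
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            ops.append(("replace", i - 1, j - 1))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("delete", i - 1, j))
            i -= 1
        else:
            ops.append(("insert", i, j - 1))
            j -= 1
    ops.reverse()  # the backtrace collects operations from right to left
    return ops


if __name__ == "__main__":
    for op in edit_ops("batyu", "beauty"):
        print(op)
```

For "batyu" and "beauty" this yields three operations, which matches the edit distance between the two strings. When several optimal scripts exist, the order of the checks in the backtrace decides which one you get, so the result may differ from what other tools report.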

七色彩虹 2025-01-31 10:58:57

You can use the output from the example in the documentation (https://docs.python.org/3/library/difflib.html#difflib.SequenceMatcher.get_opcodes):

import difflib

a = 'adela'
b = 'adella'
dif = difflib.SequenceMatcher(None, a, b)
opcodes = dif.get_opcodes()
# Each opcode describes how a[i1:i2] turns into b[j1:j2].
for tag, i1, i2, j1, j2 in opcodes:
    print('{:7}   a[{}:{}] --> b[{}:{}] {!r:>8} --> {!r}'.format(
        tag, i1, i2, j1, j2, a[i1:i2], b[j1:j2]))

So get your SequenceMatcher object, then iterate over the opcodes and store them however you want. I came across this while searching for a quick link to the editops documentation. For my purposes, I used this as a measure of how close the strings were:

# Count the opcodes that are not 'equal'.
print(len([x for x in opcodes if x[0] != 'equal']))
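
To get closer to the list format in the question, the opcodes can be unrolled into per-character operations. The sketch below is one possible way to do that, not a definitive one. The function name diff and the [position, "Insert" | "Delete" | "Replace", character] layout are assumptions, and each position refers to the string as it looks after the earlier operations in the list have been applied, which may not be the exact numbering the question intended. Also keep in mind that SequenceMatcher does not promise a minimal edit script.

```python
import difflib


def diff(a, b):
    """Unroll difflib opcodes into per-character edit operations.

    Hypothetical helper. Positions index the working string, assuming the
    operations are applied one after another from left to right.
    """
    ops = []
    # autojunk=False turns off the popularity heuristic for long sequences.
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        old, new = a[i1:i2], b[j1:j2]
        common = min(len(old), len(new))
        pos = j1  # everything left of pos already equals b[:pos]
        # A 'replace' block may have unequal sides: pair up what overlaps.
        for k in range(common):
            ops.append([pos, "Replace", new[k]])
            pos += 1
        # Extra characters on the b side become insertions.
        for ch in new[common:]:
            ops.append([pos, "Insert", ch])
            pos += 1
        # Extra characters on the a side are deleted; each deletion pulls
        # the next character into the same position.
        for _ in old[common:]:
            ops.append([pos, "Delete"])
    return ops


if __name__ == "__main__":
    print(diff("batyu", "beauty"))
```

Applying the returned operations in order to the first string should reproduce the second string, which is an easy way to check the result.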