java 是否有支持 Annotate/Blame 的 Diff 库?

发布于 2024-10-07 10:49:19 字数 623 浏览 8 评论 0原文

我正在挖掘 Google 的免费(开源)Java diff 库的结果,似乎有很多这样的库(其中一些甚至可以使用通用对象,而不仅仅是字符串)。

在我挖掘大量搜索结果但找不到我要搜索的内容之前,我首先会问:

这些 diff 库是否支持 cvs annotate 或 svnblame 等功能。我想

  • 将当前的 String[] 传递给函数,
  • 继续将旧版本的 String[] 传递给函数,直到我用完所有版本,或者该库告诉我,没有原始行被注释掉(最后一件事并不是真正必须的,但非常有用,因为检索 String[] 的旧版本非常昂贵,所以我想尽早停止尽可能)
  • 调用一个函数,它给我一个 ìnt[] ,它告诉我当前版本的每一行,最后一次更改的版本或是否根本没有更改(即最后更改在第一个版本中)。

支持非 String 对象固然很好,但不是必须的。如果 API 不完全是这样,我想我可以接受它。

如果没有,任何人都可以建议一个可扩展的差异库,可以轻松添加该功能,最好是愿意接收该功能作为贡献的库(并且在接受贡献之前不需要填写大量的文书工作,例如GNU 项目)?那么我会自愿(至少尝试)将其添加到那里。

I'm digging through Google's results for free (open source) Java diff libaries, and there seem quite a bunch of those (some of them even working with generic Objects and not only with Strings).

Before I'm digging through tons of search results and not finding what I'm searching, I'll ask here first:

Does any of those diff libraries support a feature like cvs annotate or svn blame. I want to

  • pass the current String[] to a function
  • continue passing older versions of the String[] to a function, until either I have used up all of them, or the library tells me that no original line was left unannotated (the last thing is not really a must but very useful since retrieving older versions of the String[] is expensive so I'd like to stop as early as possible)
  • call a function which gives me an ìnt[] that tells me for every line of the current version, in which version it was last changed or whether it was not changed at all (i. e. last changed in the very first version).

Having support for objects that are not Strings is nice, but no must. And if the API is not exactly that way, I guess I could live with it.

If there is none, can anybody suggest an extensible diff library where that feature can be added easily, preferrably one that would like to receive that feature as a contribution (and does not require tons of paperwork to be filled before they accept contributions, like the GNU project)? I'd volunteer to (at least try to) add it there, then.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

悲欢浪云 2024-10-14 10:49:19

我决定自己为 Dmitry Naumenko 的 java-diff-utils 库实现它:

/*
   Copyright 2010 Michael Schierl ([email protected])

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
 */
package difflib.annotate;

import java.util.*;

import difflib.*;

/**
 * Generates an annotated version of a revision based on a list of older
 * revisions, like <tt>cvs annotate</tt> or <tt>svn blame</tt>.
 * 
 * @author <a href="[email protected]">Michael Schierl</a>
 * 
 * @param <R>
 *            Type of the revision metadata
 */
public class Annotate<R> {

    private final List<R> revisions;
    private final int[] lineNumbers;
    private R currentRevision;
    private final List<Object> currentLines;
    private final List<Integer> currentLineMap;

    /**
     * Creates a new annotation generator.
     * 
     * @param revision
     *            Revision metadata for the revision to be annotated
     * @param targetLines
     *            Lines of the revision to be annotated
     */
    public Annotate(R revision, List<?> targetLines) {
        revisions = new ArrayList<R>();
        lineNumbers = new int[targetLines.size()];
        currentRevision = revision;
        currentLines = new ArrayList<Object>(targetLines);
        currentLineMap = new ArrayList<Integer>();
        for (int i = 0; i < lineNumbers.length; i++) {
            lineNumbers[i] = -1;
            revisions.add(null);
            currentLineMap.add(i);
        }
    }

    /**
     * Check whether there are still lines that are unannotated. In that case,
     * more older revisions should be retrieved and passed to the function. Note
     * that as soon as you pass an empty revision, all lines will be annotated
     * (with a later revision), therefore if you do not have any more revisions,
     * pass an empty revision to annotate the rest of the lines.
     */
    public boolean areLinesUnannotated() {
        for (int i = 0; i < lineNumbers.length; i++) {
            if (lineNumbers[i] == -1 || revisions.get(i) == null)
                return true;
        }
        return false;
    }

    /**
     * Add the previous revision and update annotation info.
     * 
     * @param revision
     *            Revision metadata for this revision
     * @param lines
     *            Lines of this revision
     */
    public void addRevision(R revision, List<?> lines) {
        Patch patch = DiffUtils.diff(currentLines, lines);
        int lineOffset = 0; // remember number of already deleted/added lines
        for (Delta d : patch.getDeltas()) {
            Chunk original = d.getOriginal();
            Chunk revised = d.getRevised();
            int pos = original.getPosition() + lineOffset;
            // delete lines
            for (int i = 0; i < original.getSize(); i++) {
                int origLine = currentLineMap.remove(pos);
                currentLines.remove(pos);
                if (origLine != -1) {
                    lineNumbers[origLine] = original.getPosition() + i;
                    revisions.set(origLine, currentRevision);
                }
            }
            for (int i = 0; i < revised.getSize(); i++) {
                currentLines.add(pos + i, revised.getLines().get(i));
                currentLineMap.add(pos + i, -1);
            }
            lineOffset += revised.getSize() - original.getSize();
        }

        currentRevision = revision;
        if (!currentLines.equals(lines))
            throw new RuntimeException("Patch application failed");
    }

    /**
     * Return the result of the annotation. It will be a List of the same length
     * as the target revision, for which every entry states the revision where
     * the line appeared last.
     */
    public List<R> getAnnotatedRevisions() {
        return Collections.unmodifiableList(revisions);
    }

    /**
     * Return the result of the annotation. It will be a List of the same length
     * as the target revision, for which every entry states the line number in
     * the revision where the line appeared last.
     */
    public int[] getAnnotatedLineNumbers() {
        return (int[]) lineNumbers.clone();
    }
}

我还将其发送给 Dmitry Naumenko(带有一些测试用例),以防他想将其包含在内。

I decided to implement it myself for Dmitry Naumenko's java-diff-utils library:

/*
   Copyright 2010 Michael Schierl ([email protected])

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
 */
package difflib.annotate;

import java.util.*;

import difflib.*;

/**
 * Generates an annotated version of a revision based on a list of older
 * revisions, like <tt>cvs annotate</tt> or <tt>svn blame</tt>.
 * 
 * @author <a href="[email protected]">Michael Schierl</a>
 * 
 * @param <R>
 *            Type of the revision metadata
 */
public class Annotate<R> {

    private final List<R> revisions;
    private final int[] lineNumbers;
    private R currentRevision;
    private final List<Object> currentLines;
    private final List<Integer> currentLineMap;

    /**
     * Creates a new annotation generator.
     * 
     * @param revision
     *            Revision metadata for the revision to be annotated
     * @param targetLines
     *            Lines of the revision to be annotated
     */
    public Annotate(R revision, List<?> targetLines) {
        revisions = new ArrayList<R>();
        lineNumbers = new int[targetLines.size()];
        currentRevision = revision;
        currentLines = new ArrayList<Object>(targetLines);
        currentLineMap = new ArrayList<Integer>();
        for (int i = 0; i < lineNumbers.length; i++) {
            lineNumbers[i] = -1;
            revisions.add(null);
            currentLineMap.add(i);
        }
    }

    /**
     * Check whether there are still lines that are unannotated. In that case,
     * more older revisions should be retrieved and passed to the function. Note
     * that as soon as you pass an empty revision, all lines will be annotated
     * (with a later revision), therefore if you do not have any more revisions,
     * pass an empty revision to annotate the rest of the lines.
     */
    public boolean areLinesUnannotated() {
        for (int i = 0; i < lineNumbers.length; i++) {
            if (lineNumbers[i] == -1 || revisions.get(i) == null)
                return true;
        }
        return false;
    }

    /**
     * Add the previous revision and update annotation info.
     * 
     * @param revision
     *            Revision metadata for this revision
     * @param lines
     *            Lines of this revision
     */
    public void addRevision(R revision, List<?> lines) {
        Patch patch = DiffUtils.diff(currentLines, lines);
        int lineOffset = 0; // remember number of already deleted/added lines
        for (Delta d : patch.getDeltas()) {
            Chunk original = d.getOriginal();
            Chunk revised = d.getRevised();
            int pos = original.getPosition() + lineOffset;
            // delete lines
            for (int i = 0; i < original.getSize(); i++) {
                int origLine = currentLineMap.remove(pos);
                currentLines.remove(pos);
                if (origLine != -1) {
                    lineNumbers[origLine] = original.getPosition() + i;
                    revisions.set(origLine, currentRevision);
                }
            }
            for (int i = 0; i < revised.getSize(); i++) {
                currentLines.add(pos + i, revised.getLines().get(i));
                currentLineMap.add(pos + i, -1);
            }
            lineOffset += revised.getSize() - original.getSize();
        }

        currentRevision = revision;
        if (!currentLines.equals(lines))
            throw new RuntimeException("Patch application failed");
    }

    /**
     * Return the result of the annotation. It will be a List of the same length
     * as the target revision, for which every entry states the revision where
     * the line appeared last.
     */
    public List<R> getAnnotatedRevisions() {
        return Collections.unmodifiableList(revisions);
    }

    /**
     * Return the result of the annotation. It will be a List of the same length
     * as the target revision, for which every entry states the line number in
     * the revision where the line appeared last.
     */
    public int[] getAnnotatedLineNumbers() {
        return (int[]) lineNumbers.clone();
    }
}

I also sent it to Dmitry Naumenko (with a few test cases) in case he wants to include it.

迷雾森÷林ヴ 2024-10-14 10:49:19

我可能是错的,但我认为注释/责备需要版本控制系统才能工作,因为它需要访问文件的历史记录。通用差异库无法做到这一点。
因此,如果这是您的目标,请检查与这些 VCS 配合使用的库,例如 svnkit
如果没有,这样的库可能是如何完成注释/责备的一个很好的起点,这通常涉及比较文件所有版本的链。

I might be wrong but I think annotate/blame needs a version control system to work, since it needs to access the history of the file. A generic diff-library can't do that.
So if this is your target check out the libraries that work with those VCS, like svnkit.
If not, such a library may be a good starting point on how annotate/blame is done, very often this involves diff'ing the chain of all versions of a file.

酒浓于脸红 2024-10-14 10:49:19

您可以使用
xwiki-commons-blame-api
它实际上使用此线程接受的答案中的代码感谢 Michael Schierl 在 StackOverflow 上分享此代码

您可以在 这是单元测试。< /a>

或者在 Scala 中:

import java.util
import org.xwiki.blame.AnnotatedContent
import org.xwiki.blame.internal.DefaultBlameManager

case class Revision(id: Int,
                    parentId: Option[Int] = None,
                    content: Option[String] = None)

def createAnnotation(revisions: Seq[Revision]): Option[AnnotatedContent[Revision, String]] = {
    val blameManager = new DefaultBlameManager()

    val annotatedContent = revisions.foldLeft(null.asInstanceOf[AnnotatedContent[Revision, String]]){
      (annotation, revision) =>
        blameManager.blame(annotation, revision, splitByWords(revision.content))
    }
    Option(annotatedContent)
}

def splitByWords(content: Option[String]): util.List[String] = {
    val array = content.fold(Array[String]())(_.split("[^\\pL_\\pN]+"))
    util.Arrays.asList(array:_*)
}

You can use
xwiki-commons-blame-api.
It actually uses code from this thread's accepted answer (Thanks to Michael Schierl for sharing this code on StackOverflow)

You can see how to use it in Java in it's unit tests.

Or in Scala like:

import java.util
import org.xwiki.blame.AnnotatedContent
import org.xwiki.blame.internal.DefaultBlameManager

case class Revision(id: Int,
                    parentId: Option[Int] = None,
                    content: Option[String] = None)

def createAnnotation(revisions: Seq[Revision]): Option[AnnotatedContent[Revision, String]] = {
    val blameManager = new DefaultBlameManager()

    val annotatedContent = revisions.foldLeft(null.asInstanceOf[AnnotatedContent[Revision, String]]){
      (annotation, revision) =>
        blameManager.blame(annotation, revision, splitByWords(revision.content))
    }
    Option(annotatedContent)
}

def splitByWords(content: Option[String]): util.List[String] = {
    val array = content.fold(Array[String]())(_.split("[^\\pL_\\pN]+"))
    util.Arrays.asList(array:_*)
}
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文