找到最长非重叠序列的算法

发布于 2024-10-10 22:02:12 字数 894 浏览 14 评论 0原文

我正在尝试找到解决以下问题的最佳方法。我所说的最好的方式是指不太复杂。

作为输入的元组列表 (start,length)，如下所示：

[(0,5),(0,1),(1,9),(5,5),(5,7),(10,1)]

每个元素通过其 start 和 length 表示一个序列，例如 (5,7) 相当于序列 (5,6,7,8,9,10,11) - 从 5 开始的 7 个元素的列表。可以假设元组按 start 排序> 元素。

输出应返回表示最长连续序列的不重叠的元组组合。这意味着，解决方案是范围的子集，没有重叠和间隙，并且是最长的可能 - 但可能有多个。

例如，对于给定的输入，解决方案是：

[(0,5),(5,7)] 相当于 (0,1,2,3,4,5,6, 7,8,9,10,11)

回溯是解决这个问题的最佳方法吗？

我对人们建议的任何不同方法感兴趣。

另外，如果有人知道这个问题的正式参考资料或另一个类似的问题，我想获得参考资料。

顺便说一句 - 这不是家庭作业。

编辑

预期行为的另一个示例

只是为了避免一些错误，这是[(0,1),(1,7),(3,20),(8,5 等输入的 )] 正确答案是 [(3,20)] 相当于长度为 20 的 (3,4,5,..,22)。收到的一些答案将给出 < code>[(0,1),(1,7),(8,5)] 相当于 (0,1,2,...,11,12) 作为正确答案。但最后一个答案不正确，因为它比 [(3,20)] 短。

原文

I am trying to find the best way to solve the following problem. By best way I mean less complex.

As an input a list of tuples (start,length) such:

[(0,5),(0,1),(1,9),(5,5),(5,7),(10,1)]

Each element represets a sequence by its start and length, for example (5,7) is equivalent to the sequence (5,6,7,8,9,10,11) - a list of 7 elements starting with 5. One can assume that the tuples are sorted by the start element.

The output should return a non-overlapping combination of tuples that represent the longest continuous sequences(s). This means that, a solution is a subset of ranges with no overlaps and no gaps and is the longest possible - there could be more than one though.

For example for the given input the solution is:

[(0,5),(5,7)] equivalent to (0,1,2,3,4,5,6,7,8,9,10,11)

is it backtracking the best approach to solve this problem ?

I'm interested in any different approaches that people could suggest.

Also if anyone knows a formal reference of this problem or another one that is similar I'd like to get references.

BTW - this is not homework.

Edit

Just to avoid some mistakes this is another example of expected behaviour

for an input like [(0,1),(1,7),(3,20),(8,5)] the right answer is [(3,20)] equivalent to (3,4,5,..,22) with length 20. Some of the answers received would give [(0,1),(1,7),(8,5)] equivalent to (0,1,2,...,11,12) as right answer. But this last answer is not correct because is shorter than [(3,20)].

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

安静 2024-10-17 22:02:12

使用给定的顺序（按起始元素）迭代元组列表，同时使用哈希图来跟踪某个索引上最长连续序列 ending 的长度。

伪代码，跳过诸如在哈希图中未找到的项之类的细节（假设如果未找到则返回 0）：

int bestEnd = 0;
hashmap<int,int> seq // seq[key] = length of the longest sequence ending on key-1, or 0 if not found
foreach (tuple in orderedTuples) {
    int seqLength = seq[tuple.start] + tuple.length
    int tupleEnd = tuple.start+tuple.length;
    seq[tupleEnd] = max(seq[tupleEnd], seqLength)
    if (seqLength > seq[bestEnd]) bestEnd = tupleEnd
}
return new tuple(bestEnd-seq[bestEnd], seq[bestEnd])

这是一个 O(N) 算法。

如果您需要组成此序列的实际元组，则还需要保留按结束索引散列的元组链接列表，每当更新此端点的最大长度时更新此列表。

更新：我对 python 的了解相当有限，但是根据您粘贴的 python 代码，我创建了这个返回实际序列而不仅仅是长度的代码：

def get_longest(arr):
    bestEnd = 0;
    seqLengths = dict() #seqLengths[key] = length of the longest sequence ending on key-1, or 0 if not found
    seqTuples = dict() #seqTuples[key] = the last tuple used in this longest sequence
    for t in arr:
        seqLength = seqLengths.get(t[0],0) + t[1]
        tupleEnd = t[0] + t[1]
        if (seqLength > seqLengths.get(tupleEnd,0)):
            seqLengths[tupleEnd] = seqLength
            seqTuples[tupleEnd] = t
            if seqLength > seqLengths.get(bestEnd,0):
                bestEnd = tupleEnd
    longestSeq = []
    while (bestEnd in seqTuples):
        longestSeq.append(seqTuples[bestEnd])
        bestEnd -= seqTuples[bestEnd][1]
    longestSeq.reverse()
    return longestSeq


if __name__ == "__main__":
    a = [(0,3),(1,4),(1,1),(1,8),(5,2),(5,5),(5,6),(10,2)]
    print(get_longest(a))

Iterate over the list of tuples using the given ordering (by start element), while using a hashmap to keep track of the length of the longest continuous sequence ending on a certain index.

pseudo-code, skipping details like items not found in a hashmap (assume 0 returned if not found):

int bestEnd = 0;
hashmap<int,int> seq // seq[key] = length of the longest sequence ending on key-1, or 0 if not found
foreach (tuple in orderedTuples) {
    int seqLength = seq[tuple.start] + tuple.length
    int tupleEnd = tuple.start+tuple.length;
    seq[tupleEnd] = max(seq[tupleEnd], seqLength)
    if (seqLength > seq[bestEnd]) bestEnd = tupleEnd
}
return new tuple(bestEnd-seq[bestEnd], seq[bestEnd])

This is an O(N) algorithm.

If you need the actual tuples making up this sequence, you'd need to keep a linked list of tuples hashed by end index as well, updating this whenever the max length is updated for this end-point.

UPDATE: My knowledge of python is rather limited, but based on the python code you pasted, I created this code that returns the actual sequence instead of just the length:

def get_longest(arr):
    bestEnd = 0;
    seqLengths = dict() #seqLengths[key] = length of the longest sequence ending on key-1, or 0 if not found
    seqTuples = dict() #seqTuples[key] = the last tuple used in this longest sequence
    for t in arr:
        seqLength = seqLengths.get(t[0],0) + t[1]
        tupleEnd = t[0] + t[1]
        if (seqLength > seqLengths.get(tupleEnd,0)):
            seqLengths[tupleEnd] = seqLength
            seqTuples[tupleEnd] = t
            if seqLength > seqLengths.get(bestEnd,0):
                bestEnd = tupleEnd
    longestSeq = []
    while (bestEnd in seqTuples):
        longestSeq.append(seqTuples[bestEnd])
        bestEnd -= seqTuples[bestEnd][1]
    longestSeq.reverse()
    return longestSeq


if __name__ == "__main__":
    a = [(0,3),(1,4),(1,1),(1,8),(5,2),(5,5),(5,6),(10,2)]
    print(get_longest(a))

回复收藏 0 原文

少女的英雄梦 2024-10-17 22:02:12

修改后的算法：

create a hashtable of start->list of tuples that start there
put all tuples in a queue of tupleSets
set the longestTupleSet to the first tuple
while the queue is not empty
    take tupleSet from the queue
    if any tuples start where the tupleSet ends
        foreach tuple that starts where the tupleSet ends
            enqueue new tupleSet of tupleSet + tuple
        continue

    if tupleSet is longer than longestTupleSet
        replace longestTupleSet with tupleSet

return longestTupleSet

c#实现

public static IList<Pair<int, int>> FindLongestNonOverlappingRangeSet(IList<Pair<int, int>> input)
{
    var rangeStarts = input.ToLookup(x => x.First, x => x);
    var adjacentTuples = new Queue<List<Pair<int, int>>>(
        input.Select(x => new List<Pair<int, int>>
            {
                x
            }));

    var longest = new List<Pair<int, int>>
        {
            input[0]
        };
    int longestLength = input[0].Second - input[0].First;

    while (adjacentTuples.Count > 0)
    {
        var tupleSet = adjacentTuples.Dequeue();
        var last = tupleSet.Last();
        int end = last.First + last.Second;
        var sameStart = rangeStarts[end];
        if (sameStart.Any())
        {
            foreach (var nextTuple in sameStart)
            {
                adjacentTuples.Enqueue(tupleSet.Concat(new[] { nextTuple }).ToList());
            }
            continue;
        }
        int length = end - tupleSet.First().First;
        if (length > longestLength)
        {
            longestLength = length;
            longest = tupleSet;
        }
    }

    return longest;
}

测试：

[Test]
public void Given_the_first_problem_sample()
{
    var input = new[]
        {
            new Pair<int, int>(0, 5),
            new Pair<int, int>(0, 1),
            new Pair<int, int>(1, 9),
            new Pair<int, int>(5, 5),
            new Pair<int, int>(5, 7),
            new Pair<int, int>(10, 1)
        };
    var result = FindLongestNonOverlappingRangeSet(input);
    result.Count.ShouldBeEqualTo(2);
    result.First().ShouldBeSameInstanceAs(input[0]);
    result.Last().ShouldBeSameInstanceAs(input[4]);
}

[Test]
public void Given_the_second_problem_sample()
{
    var input = new[]
        {
            new Pair<int, int>(0, 1),
            new Pair<int, int>(1, 7),
            new Pair<int, int>(3, 20),
            new Pair<int, int>(8, 5)
        };
    var result = FindLongestNonOverlappingRangeSet(input);
    result.Count.ShouldBeEqualTo(1);
    result.First().ShouldBeSameInstanceAs(input[2]);
}

Revised algorithm:

create a hashtable of start->list of tuples that start there
put all tuples in a queue of tupleSets
set the longestTupleSet to the first tuple
while the queue is not empty
    take tupleSet from the queue
    if any tuples start where the tupleSet ends
        foreach tuple that starts where the tupleSet ends
            enqueue new tupleSet of tupleSet + tuple
        continue

    if tupleSet is longer than longestTupleSet
        replace longestTupleSet with tupleSet

return longestTupleSet

c# implementation

public static IList<Pair<int, int>> FindLongestNonOverlappingRangeSet(IList<Pair<int, int>> input)
{
    var rangeStarts = input.ToLookup(x => x.First, x => x);
    var adjacentTuples = new Queue<List<Pair<int, int>>>(
        input.Select(x => new List<Pair<int, int>>
            {
                x
            }));

    var longest = new List<Pair<int, int>>
        {
            input[0]
        };
    int longestLength = input[0].Second - input[0].First;

    while (adjacentTuples.Count > 0)
    {
        var tupleSet = adjacentTuples.Dequeue();
        var last = tupleSet.Last();
        int end = last.First + last.Second;
        var sameStart = rangeStarts[end];
        if (sameStart.Any())
        {
            foreach (var nextTuple in sameStart)
            {
                adjacentTuples.Enqueue(tupleSet.Concat(new[] { nextTuple }).ToList());
            }
            continue;
        }
        int length = end - tupleSet.First().First;
        if (length > longestLength)
        {
            longestLength = length;
            longest = tupleSet;
        }
    }

    return longest;
}

tests:

[Test]
public void Given_the_first_problem_sample()
{
    var input = new[]
        {
            new Pair<int, int>(0, 5),
            new Pair<int, int>(0, 1),
            new Pair<int, int>(1, 9),
            new Pair<int, int>(5, 5),
            new Pair<int, int>(5, 7),
            new Pair<int, int>(10, 1)
        };
    var result = FindLongestNonOverlappingRangeSet(input);
    result.Count.ShouldBeEqualTo(2);
    result.First().ShouldBeSameInstanceAs(input[0]);
    result.Last().ShouldBeSameInstanceAs(input[4]);
}

[Test]
public void Given_the_second_problem_sample()
{
    var input = new[]
        {
            new Pair<int, int>(0, 1),
            new Pair<int, int>(1, 7),
            new Pair<int, int>(3, 20),
            new Pair<int, int>(8, 5)
        };
    var result = FindLongestNonOverlappingRangeSet(input);
    result.Count.ShouldBeEqualTo(1);
    result.First().ShouldBeSameInstanceAs(input[2]);
}

回复收藏 0 原文

空城缀染半城烟沙 2024-10-17 22:02:12

这是加权有向无环图的最长路径问题的特例。

图中的节点是序列中最后一个元素之后的起点和点，下一个序列可以从这里开始。

这个问题很特殊，因为无论路径如何，两个节点之间的距离必须相同。

回复收藏 0 原文

花海 2024-10-17 22:02:12

只从基本角度考虑算法，这行得通吗？

（对糟糕的语法表示歉意，但我试图在这里保持语言独立）

首先是最简单的形式：找到最长的连续对。

循环浏览每个成员并将其与具有较高起始位置的其他每个成员进行比较。如果第二个成员的 startpos 等于第一个成员的 startpos 和长度之和，则它们是连续的。如果是这样，则在新集合中形成一个新成员，并使用较低的起始位置和组合长度来表示这一点。

然后，将这些对中的每一个与具有较高起始位置的所有单个成员进行比较并重复，形成一组新的连续三元组（如果存在）。

继续这个模式，直到没有新的集合。

棘手的部分是你必须比较每个集合中每个成员的长度才能找到真正最长的链。

我很确定这不如其他方法有效，但我相信这是暴力破解此解决方案的可行方法。

如果您对此以及我可能忽略的任何错误提供反馈，我将不胜感激。

回复收藏 0 原文

池木 2024-10-17 22:02:12

编辑以用实际的 Python 代码替换伪代码

再次编辑以更改代码；原始算法是关于解决方案的，但我误解了对中的第二个值是什么！幸运的是，基本算法是相同的，并且我能够更改它。

这是一个以 O(N log N) 解决问题的想法，并且不使用哈希映射（因此没有隐藏时间）。对于内存，我们将使用 N * 2 个“事物”。

我们将向每个元组添加两个以上的值：（BackCount、BackLink）。在成功的组合中，BackLink 将从右到左从最右边的元组链接到最左边的元组。 BackCount 将是给定 BackLink 的累计计数值。

下面是一些 python 代码：

def FindTuplesStartingWith(tuples, frm):
    # The Log(N) algorithm is left as an excersise for the user
    ret=[]
    for i in range(len(tuples)):
        if (tuples[i][0]==frm): ret.append(i)
    return ret

def FindLongestSequence(tuples):

    # Prepare (BackCount, BackLink) array
    bb=[] # (BackCount, BackLink)
    for OneTuple in tuples: bb.append((-1,-1))

    # Prepare
    LongestSequenceLen=-1
    LongestSequenceTail=-1

    # Algorithm
    for i in range(len(tuples)):
        if (bb[i][0] == -1): bb[i] = (0, bb[i][1])
        # Is this single pair the longest possible pair all by itself?
        if (tuples[i][1] + bb[i][0]) > LongestSequenceLen:
            LongestSequenceLen = tuples[i][1] + bb[i][0]
            LongestSequenceTail = i
        # Find next segment
        for j in FindTuplesStartingWith(tuples, tuples[i][0] + tuples[i][1]):
            if ((bb[j][0] == -1) or (bb[j][0] < (bb[i][0] + tuples[i][1]))):
                # can be linked
                bb[j] = (bb[i][0] + tuples[i][1], i)
                if ((bb[j][0] + tuples[j][1]) > LongestSequenceLen):
                    LongestSequenceLen = bb[j][0] + tuples[j][1]
                    LongestSequenceTail=j

    # Done! I'll now build up the solution
    ret=[]
    while (LongestSequenceTail > -1):
        ret.insert(0, tuples[LongestSequenceTail])
        LongestSequenceTail = bb[LongestSequenceTail][1]
    return ret

# Call the algoritm
print FindLongestSequence([(0,5), (0,1), (1,9), (5,5), (5,7), (10,1)])
>>>>>> [(0, 5), (5, 7)]
print FindLongestSequence([(0,1), (1,7), (3,20), (8,5)])    
>>>>>> [(3, 20)]

整个算法的关键是代码中“THIS IS THE KEY”注释的位置。我们知道当前的 StartTuple 可以链接到 EndTuple。如果存在以 EndTuple.To 结尾的较长序列，则在我们到达此点时已找到它，因为它必须从较小的 StartTuple.From 开始，并且数组按“From”排序！

Edited to replace pseudocode with actual Python code

Edited AGAIN to change the code; The original algorithm was on the solution, but I missunderstood what the second value in the pairs was! Fortunatelly the basic algorithm is the same, and I was able to change it.

Here's an idea that solves the problem in O(N log N) and doesn't use a hash map (so no hidden times). For memory we're going to use N * 2 "things".

We're going to add two more values to each tuple: (BackCount, BackLink). In the successful combination BackLink will link from right to left from the right-most tuple to the left-most tuple. BackCount will be the value accumulated count for the given BackLink.

Here's some python code:

def FindTuplesStartingWith(tuples, frm):
    # The Log(N) algorithm is left as an excersise for the user
    ret=[]
    for i in range(len(tuples)):
        if (tuples[i][0]==frm): ret.append(i)
    return ret

def FindLongestSequence(tuples):

    # Prepare (BackCount, BackLink) array
    bb=[] # (BackCount, BackLink)
    for OneTuple in tuples: bb.append((-1,-1))

    # Prepare
    LongestSequenceLen=-1
    LongestSequenceTail=-1

    # Algorithm
    for i in range(len(tuples)):
        if (bb[i][0] == -1): bb[i] = (0, bb[i][1])
        # Is this single pair the longest possible pair all by itself?
        if (tuples[i][1] + bb[i][0]) > LongestSequenceLen:
            LongestSequenceLen = tuples[i][1] + bb[i][0]
            LongestSequenceTail = i
        # Find next segment
        for j in FindTuplesStartingWith(tuples, tuples[i][0] + tuples[i][1]):
            if ((bb[j][0] == -1) or (bb[j][0] < (bb[i][0] + tuples[i][1]))):
                # can be linked
                bb[j] = (bb[i][0] + tuples[i][1], i)
                if ((bb[j][0] + tuples[j][1]) > LongestSequenceLen):
                    LongestSequenceLen = bb[j][0] + tuples[j][1]
                    LongestSequenceTail=j

    # Done! I'll now build up the solution
    ret=[]
    while (LongestSequenceTail > -1):
        ret.insert(0, tuples[LongestSequenceTail])
        LongestSequenceTail = bb[LongestSequenceTail][1]
    return ret

# Call the algoritm
print FindLongestSequence([(0,5), (0,1), (1,9), (5,5), (5,7), (10,1)])
>>>>>> [(0, 5), (5, 7)]
print FindLongestSequence([(0,1), (1,7), (3,20), (8,5)])    
>>>>>> [(3, 20)]

The key for the whole algorithm is where the "THIS IS THE KEY" comment is in the code. We know our current StartTuple can be linked to EndTuple. If a longer sequence that ends at EndTuple.To exists, it was found by the time we got to this point, because it had to start at an smaller StartTuple.From, and the array is sorted on "From"!

回复收藏 0 原文

Oo萌小芽oO 2024-10-17 22:02:12

我删除了之前的解决方案，因为它未经测试。

问题是找到“加权有向无环图”中的最长路径，它可以在线性时间内解决：

http://en.wikipedia.org/wiki/Longest_path_problem#Weighted_directed_acircular_graphs

将一组{起始位置}并集{(起始位置+结束位置)}作为顶点。对于您的示例，

如果存在使 v0 + w = v1 的最终值 w，则顶点 v0、v1 的值为 {0, 1, 5, 10, 11, 12}，然后添加一条连接 v0 到 v1 的有向边，然后将w 为其重量。

现在按照维基百科页面中的伪代码进行操作。由于顶点数是最大值2xn（n是元组数），所以问题仍然可以在线性时间内解决。

回复收藏 0 原文

千紇 2024-10-17 22:02:12

这是一个简单的归约操作。给定一对连续的元组，它们要么可以组合，要么不能组合。因此，定义成对组合函数：

def combo(first,second):
    if first[0]+first[1] == second[0]:
        return [(first[0],first[1]+second[1])]
    else:
        return [first,second]

这仅返回组合两个参数的一个元素或原始两个元素的列表。

然后定义一个函数来迭代第一个列表并组合对：

def collapse(tupleList):
    first = tupleList.pop(0)
    newList = []
    for item in tupleList:
        collapsed = combo(first,item)
        if len(collapsed)==2:
            newList.append(collapsed[0])
        first = collapsed.pop()
    newList.append(first)
    return newList

这将保留第一个元素与列表中的当前项目进行比较（从第二个项目开始），当它无法组合它们时，它将第一个元素放入新列表并将 first 替换为两者中的第二个。

然后只需使用元组列表调用 collapse 即可：

>>> collapse( [(5, 7), (12, 3), (0, 5), (0, 7), (7, 2), (9, 3)] )
[(5, 10), (0, 5), (0, 12)]

[编辑] 最后，迭代结果以获得最长的序列。

def longest(seqs):
    collapsed = collapse(seqs)
    return max(collapsed, key=lambda x: x[1])

[/编辑]

复杂度 O(N)。对于奖励标记，请反向执行，以便初始 pop(0) 变为 pop()，并且您不必重新索引数组或移动迭代器反而。对于最高分，请使其作为成对 reduce 操作运行，以实现多线程的优点。

This is a simple reduce operation. Given a pair of consecutive tuples, they either can or can't be combined. So define the pairwise combination function:

def combo(first,second):
    if first[0]+first[1] == second[0]:
        return [(first[0],first[1]+second[1])]
    else:
        return [first,second]

This just returns a list of either one element combining the two arguments, or the original two elements.

Then define a function to iterate over the first list and combine pairs:

def collapse(tupleList):
    first = tupleList.pop(0)
    newList = []
    for item in tupleList:
        collapsed = combo(first,item)
        if len(collapsed)==2:
            newList.append(collapsed[0])
        first = collapsed.pop()
    newList.append(first)
    return newList

This keeps a first element to compare with the current item in the list (starting at the second item), and when it can't combine them it drops the first into a new list and replaces first with the second of the two.

Then just call collapse with the list of tuples:

>>> collapse( [(5, 7), (12, 3), (0, 5), (0, 7), (7, 2), (9, 3)] )
[(5, 10), (0, 5), (0, 12)]

[Edit] Finally, iterate over the result to get the longest sequence.

def longest(seqs):
    collapsed = collapse(seqs)
    return max(collapsed, key=lambda x: x[1])

[/Edit]

Complexity O(N). For bonus marks, do it in reverse so that the initial pop(0) becomes a pop() and you don't have to reindex the array, or move the iterator instead. For top marks make it run as a pairwise reduce operation for multithreaded goodness.

回复收藏 0 原文

不乱于心 2024-10-17 22:02:12

这听起来像是一个完美的“动态编程”问题...

最简单的程序是用蛮力（例如递归）来实现，但这具有指数复杂性。

通过动态编程，您可以设置长度为 n 的数组 a，其中 n 是问题的所有（起始+长度）值的最大值，其中 a[i] 表示最长的非重叠序列，直到 a[i]。然后，您可以逐步遍历所有元组，更新 a。该算法的复杂度为 O(n*k)，其中 k 是输入值的数量。

回复收藏 0 原文

橙味迷妹 2024-10-17 22:02:12

创建所有起点和终点的有序数组，并将它们全部初始化为 1
对于元组中的每个项目，将终点（起点和终点）与数组中的有序项目进行比较，如果它们之间有任何点（例如点数组中的值是 5，并且起始值是 2，长度是 4）将值更改为零。
完成循环后，开始在有序数组中移动，并在看到 1 时创建一个条带，当您看到 1 时，添加到现有条带（任意为零）、关闭条带等。
最后检查条带的长度

I认为复杂度大约为 O(4-5*N)