当前位置：文江博客话题详情

分段最小二乘动态规划算法

发布于 2024-09-30 09:15:02 字数 1097 浏览 8 评论 0原文

几天来我一直在尝试用 Python 实现这个算法。我不断地回到这个问题上，然后放弃并感到沮丧。我不知道发生了什么事。我没有可以向任何人求助，也没有地方可以寻求帮助，所以我来到了这里。

Mahesh Viswanathan - CS 473ug：算法 - 动态规划

PDF 警告：http://www.cs.uiuc.edu/class/sp08/cs473/Lectures/lec10.pdf

我认为它没有明确解释，我肯定不明白。

我对发生的事情的理解是这样的：

我们有一组点 (x1,y1), (x2,y2).. 我们想要找到一些最适合这些数据的线。我们可以有多条直线，这些直线来自 a 和 b (y = ax +b) 的给定论坛。

现在，算法从末尾 (?) 开始，并假设点 p(x_i, y_i) 是线段的一部分。然后注释说最优解是“是 {p1,... 的最优解”。。。 pi−1} 加（最佳）线穿过 {pi , . 。。 pn}'。这对我来说意味着我们到达点 p(x_i, y_i) 并向后移动并通过其余点找到另一条线段。现在最优解就是这两条线段。

然后它需要一个我无法遵循的逻辑跳转，并说“假设最后一个点 pn 是从 p_i 开始的段的一部分。如果 Opt(j) 表示前 j 点的成本，并且 e(j,k)这通过点 j 到 k 的最佳线的误差，然后 Opt(n) = e(i, n) + C + Opt(i − 1)"

然后是该算法的伪代码，我不明白。我明白我们想要遍历点列表并找到最小化 OPT(n) 公式的点，但我只是不遵循它，这让我感觉很愚蠢，

我知道这个问题很痛苦。回答这个问题并不容易，但我只是在寻找一些理解该算法的指导，我对 PDF 表示歉意，但我没有更好的方法来向读者提供关键信息，

感谢您花时间阅读本文。，我很欣赏。

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

生生漫 2024-10-07 09:15:02

最小二乘问题要求找到一条最适合给定点的直线，并且有一个很好的封闭形式解决方案。该解用作求解分段最小二乘问题的原语。

在分段最小二乘问题中，我们可以有任意数量的线段来拟合给定的点，并且我们必须为使用的每个新线段支付成本。如果使用新线段的成本为 0，我们可以通过在每个点上传递一条单独的线来完美地拟合所有点。

现在假设我们试图找到最适合 n 个给定点的线段集。如果我们知道 n-1 个子问题的最佳解决方案：第一个点的最佳拟合，前 2 个点的最佳拟合，...，前 n-1 个点的最佳拟合，那么我们可以计算原始问题的最佳解决方案n 个点如下：

第 n 个点位于单个线段上，该线段从某个较早的点 i 开始，对于某些 i = 1, 2, ..., n。我们已经解决了所有较小的子问题，即少于 n 个点，因此我们可以找到 n 个点的最佳拟合，从而最小化：

前 i-1 点的最佳拟合成本（已计算）+
最适合点 i 到 n 的单线成本（使用最小二乘问题）+
使用新线段的成本

最小化上述数量的 i 值给我们提供了解决方案：使用子问题 i-1 和通过点 i 到 n 的线段的最佳拟合。

如果您需要更多帮助，我已经编写了该算法的解释并在此处提供了 C++ 实现：

回复收藏 0 原文

夢归不見 2024-10-07 09:15:02

棘手的部分，即动态编程部分，是

for j = 1 to n
    M[j] = min_(1=<i=<j) ( e(i,j) + C + M[i-1])

之前完成的 M[0] = 0 部分，n 是数据点的总数。另外，下划线意味着后面括号中的部分应该加下标。教授很可能使用 OPT 而不是 M，这是在其他一些大学关于此的讲座中完成的，您可以在网上找到（并且看起来几乎相同）。现在，如果您查看上面的代码块和讲座中的代码块，您会发现差异。我使用了 M[i-1]，你的教授使用了 M[j-1]。这是你教授的伪代码中的一个拼写错误。您甚至可以回顾之前的幻灯片，看看他在错误函数中的正确性。

基本上，对于每个 j，您要找到从点 i 绘制一条到 j 的线，这样它的误差加上添加这条额外线的成本 (C)，加上使所有线段达到 i 的成本 (已经被最佳选择的）被最小化。

另外，请记住 e(i,i) = 0 以及 e(i,i+1) 因为将一条线拟合到一个点不会产生错误，并且只剩下两点。

The tricky part, the dynamic programming part, is the section

for j = 1 to n
    M[j] = min_(1=<i=<j) ( e(i,j) + C + M[i-1])

where M[0] = 0 is done earlier and n is the total number of data points. Also, the underscore means that the section in parentheses afterwards should be subscripted. The professor could well have used OPT instead of M, and this is done in other some other universities' lectures about this which you can find on the web (and which look almost identical). Now, if you look at my code block above and that in the lecture, you will notice a difference. I used M[i-1] and your professor used M[j-1]. It's a typo in your professor's pseudocode. You can even look back to the slide before and see that he had it correct in the error function there.

Basically, for each j, you are finding the point i to draw a line to j from such that the error of it, plus the cost of adding this extra line (C), plus the cost of making all line segments up to i (which has already been chosen optimally) is minimized.

Also, remember that e(i,i) = 0 as well as e(i,i+1) because fitting a line to a point gives no error as well as to just two points.

回复收藏 0 原文

早茶月光 2024-10-07 09:15:02

从点 1 开始，直到点 j 的最佳解决方案必须在最后一条线段中包含新的端点 j，因此问题变成我必须在哪里放置最后一个分割以最小化最后一条线的成本 -部分？

幸运的是，成本是根据您试图解决的同一问题的子问题来计算的，幸运的是，当您移动到下一个点 j 时，您已经解决了这些较小的子问题。因此，在新点 j 处，您尝试在点 1 和 j 之间找到最佳点 i，以分割出包含 j 的新线段，并最小化成本：optimal_cost_up_to(i) + cost_of_split + cost_of_lsq_fit(i+1) ,j)。现在令人困惑的部分是，在任何时候，您似乎都只是找到一个分割，但实际上所有先前的分割都是由 optimization_cost_up_to(i) 确定的，并且因为您已经计算了所有这些点的最佳成本直到 j，那么您只需要记住答案，这样算法就不必在每次前进一点时重新计算这些成本。

在Python中，我可能会使用字典来存储结果，尽管对于这种动态编程算法，数组可能会更好...

无论如何...

    def optimalSolution(points,split_cost)
        solutions = {0:{'cost':0,'splits':[]}}
        for j in range(1,len(points)):
            best_split = None
            best_cost = lsqFitCost(points,0,j)
            for i in range(0,j):
                cost = solutions[i]['cost'] + split_cost + lsqFitCost(points,i+1,j)
                if cost < best_cost:
                   best_cost = cost
                   best_split = i
            if best_split != None:
                solution[j] = {'cost':best_cost,'splits':solution[best_split]['splits']+[best_split]}
            else:
                solution[j] = {'cost':best_cost,'splits':[]}
        return solutions

它不漂亮，而且我还没有检查它（可能存在涉及以下错误）不拆分是最佳成本的情况），但希望它能帮助您走上正确的道路？请注意，lsqFitCost 必须在每次迭代中做很多工作，但是对于像这样的最小二乘线拟合，您可以通过维护计算中使用的运行总和来使成本大大降低......您应该 Google 最小二乘线拟合更多信息。这可以使 lsqFitCost 恒定，因此总时间将为 O(N^2)。

Starting from point 1, the best solution up until a point j, must include the new end-point j in the last line segment, so the problem becomes where do I have to place the last split to minimize the cost of this last line-segment?

Fortunately the cost is calculated in terms of subproblems of the same problem you are trying to solve, and fortunately you've already solved these smaller subproblems by the time you've moved to the next point j. So at the new point j you are trying to find an optimal point i, between points 1 and j, to split off a new line segment that includes j, and minimizes the cost: optimal_cost_up_to(i) + cost_of_split + cost_of_lsq_fit(i+1,j). Now the confusing part is that at any point it might seem like you are just finding a single split, but in reality all the previous splits are determined by optimal_cost_up_to(i), and since you've already calculated the optimal cost for all these points leading up to j, then you just need to memoize the answers so that the algorithm doesn't have to recalculate these costs each time it advances a point.

In python I'd probably use a dictionary to store the results, although for this dynamic programming algorithm an array might be better...

anyways...

    def optimalSolution(points,split_cost)
        solutions = {0:{'cost':0,'splits':[]}}
        for j in range(1,len(points)):
            best_split = None
            best_cost = lsqFitCost(points,0,j)
            for i in range(0,j):
                cost = solutions[i]['cost'] + split_cost + lsqFitCost(points,i+1,j)
                if cost < best_cost:
                   best_cost = cost
                   best_split = i
            if best_split != None:
                solution[j] = {'cost':best_cost,'splits':solution[best_split]['splits']+[best_split]}
            else:
                solution[j] = {'cost':best_cost,'splits':[]}
        return solutions

it's not pretty, and I haven't checked it (there might be an error involving the case where no split is the best cost), but hopefully it'll help get you on the right path? Note that lsqFitCost has to do a lot of work on each iteration, but for a least squares line fit like this you can make this cost a lot less by maintaining running sums used in the calculation... you should Google least squares line fitting for more info. This could make lsqFitCost constant so the overall time would be O(N^2).

回复收藏 0 原文

沉溺在你眼里的海 2024-10-07 09:15:02

动态编程的基本前提是递归地优化问题（降低“成本”或在本例中为错误），同时存储子问题的成本，这样它们就不会被重新计算（这称为记忆化）。

有点晚了，所以我不会说得太详细，但似乎你遇到最多问题的部分是核心 DP 本身。由于“分段”质量，这里需要 DP。正如您的 PDF 所示，通过最小二乘法找到最佳拟合线很简单，并且不需要 DP。

Opt(n) - e(i, n) + C + Opt(i-1) --- 我们的成本函数其中
C 是新线段的恒定成本（没有它，问题就微不足道，我们只需为每两个点创建新线段）
e(i, n) 是一个线段中 i 到 n 点的误差（非递归）
opt(i-1)是从第一个点递归到第(i-1)个点的最小成本

算法

确保点列表按 x 值排序
M[0] = 0 --- 不言自明
对于所有 i < 的对 (i, j) j：找到 e(i,j) ---- （这将需要嵌套的 for/foreach 循环，或理解式结构。将这些值存储在二维数组中！）
对于 (j=1..n)：
M[j] = min([Opt(j) for i in range(1,j)]

所以基本上，找到任意两个可能点之间的单行成本，存储在 e
对于 j 在 1 和 n 之间的情况，求 j 之前的最小成本。沿途的值将有助于以后的计算，因此请存储它们！请注意，i 也是 Opt 的参数。
M[n] 是总优化成本。

问您的问题 - 如何确定分段发生的位置？一切结束后，如何使用它和 M 来确定线段集？

希望这有帮助！

回复收藏 0 原文

擦肩而过的背影 2024-10-07 09:15:02

以下是分段最小二乘问题的动态规划的公式：

这里，M[j] 表示在点上拟合的最小误差（回归）线i 到 j，我们可以通过保留后向指针数组和动态编程数组来以最小误差 (MSE) 跟踪起始点 i。另外，c 表示绘制一条线的成本（作为对适合的线数的惩罚）。最优子结构性质由贝尔曼方程确定。

这是我在以下带有点 (xs, ys) 的 2D 数据集上对上述 DP 算法的 python 实现（散点图如下）：

def ls_fit(xs, ys, m):
    a = (m*sum(xs*ys)-sum(xs)*sum(ys)) / (m*sum(xs**2)-sum(xs)**2)
    b = (sum(ys)-a*sum(xs)) / m
    return a, b

def compute_errors(xs, ys):
    n = len(xs)
    e = np.zeros((n,n))
    for j in range(n):
        for i in range(j+1):
            m = j-i+1
            if m > 1:
                a, b = ls_fit(xs[i:i+m], ys[i:i+m], m)
                e[i,j] = sum((ys[i:i+m] - a*xs[i:i+m] - b)**2)
    return e

def build_DP_table(e, n):
    M = np.zeros(n)
    p = np.zeros(n, dtype=int) # backpointers
    for j in range(1, n):
        cost = [e[i,j] + c + M[i-1] for i in range(j)]
        M[j] = np.min(cost)
        p[j] = np.argmin(cost)
    return M, p

现在绘制通过动态规划公式获得的最小二乘线段：

c = 10
tol = 2
starts = np.unique(p)
drawn = set([])
plt.plot(xs, ys, 'g.')
for start in starts:
    indices = np.where(abs(p-start) < tol)[0]
    a, b = ls_fit(xs[indices], ys[indices], len(indices))
    if not (a, b) in drawn:
        plt.plot([xs[min(indices)],xs[max(indices)]], [a*xs[min(indices)]+b, a*xs[max(indices)]+b], linewidth=3, 
                 label='line: ({:.2f}, {:.2f})'.format(a,b))
        drawn.add((a,b))
plt.legend()

正如预期的那样，DP 找到了 3 条最佳最小二乘线 (L=3) 拟合数据。

Here is how the dynamic programming for the segmented least squares problem is formulated:

Here, M[j] represents the minimum error (regression) line fitted on the points i to j, we can track the starting point i with minimum error (MSE) by keeping a back pointer array along with the dynamic programming array. Also, c denotes the cost to draw a line (acts as a penalty on number of lines fit). The optimal substructure property is by Bellman equation.

Here is my python implementation for the above DP algorithm, on the following 2D dataset with points (xs, ys) (scatter plotted below):

def ls_fit(xs, ys, m):
    a = (m*sum(xs*ys)-sum(xs)*sum(ys)) / (m*sum(xs**2)-sum(xs)**2)
    b = (sum(ys)-a*sum(xs)) / m
    return a, b

def compute_errors(xs, ys):
    n = len(xs)
    e = np.zeros((n,n))
    for j in range(n):
        for i in range(j+1):
            m = j-i+1
            if m > 1:
                a, b = ls_fit(xs[i:i+m], ys[i:i+m], m)
                e[i,j] = sum((ys[i:i+m] - a*xs[i:i+m] - b)**2)
    return e

def build_DP_table(e, n):
    M = np.zeros(n)
    p = np.zeros(n, dtype=int) # backpointers
    for j in range(1, n):
        cost = [e[i,j] + c + M[i-1] for i in range(j)]
        M[j] = np.min(cost)
        p[j] = np.argmin(cost)
    return M, p

Now plot the least square line segments obtained with the dynamic programming formulation:

c = 10
tol = 2
starts = np.unique(p)
drawn = set([])
plt.plot(xs, ys, 'g.')
for start in starts:
    indices = np.where(abs(p-start) < tol)[0]
    a, b = ls_fit(xs[indices], ys[indices], len(indices))
    if not (a, b) in drawn:
        plt.plot([xs[min(indices)],xs[max(indices)]], [a*xs[min(indices)]+b, a*xs[max(indices)]+b], linewidth=3, 
                 label='line: ({:.2f}, {:.2f})'.format(a,b))
        drawn.add((a,b))
plt.legend()