哪种多元插值方法最适合实际使用？

发布于 2024-07-14 09:52:14 字数 362 浏览 5 评论 0原文

在 Peter Alfred 关于多元离散数据插值的文章中，他提到，从各种方案来看，只有少数方案真正受到从业者的欢迎。例如，他将 Shepard 法和 Hardy Multiquadrics 命名为“Shepard 法”和“Hardy Multiquadrics”。但那篇文章到现在已经快20年了，真正有趣的是现在广泛使用的方法。

如果您有使用某些空间插值方案的经验，请告诉我们。

UPD：为了使这个问题更具竞争力，我重申了它。问题是“你用过哪些多元插值方法？”

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

明媚殇 2024-07-21 09:52:14

（这会很长，除非我精疲力竭。）

首先，关于非分散数据的一些评论。（请参阅引用littleCMS的答案）

有两种常见的颜色插值类型。几年前，三线性插值（张量积线性插值）是颜色表插值的常用方法。三线性插值确实可以实现为一组连续的一维插值，首先在一个轴上，然后沿着第二个轴，等等。

很多年前，我们都意识到，当三线性插值应用于某些类型的彩色成像时，会在彩色成像中引入伪影。转变。问题出现在中立者身上。解决方案是通过将立方体分割成 6 个四面体，在 3 维中转向单纯插值。在 n 维中，单位立方体将被分解为阶乘（n）个单纯形。立方体还有其他剖析，但这种特殊的样式确保主对角线始终是所有单纯形的共享边。当应用于某些颜色查找表时，这反过来会恢复中性色的良好行为。

现在让我来谈谈真正的分散数据插值的问题。

其他人提到了各种计划。克里金法、多重二次曲面、基于距离的方法都是其中的一些。（当我过去使用这些方案进行一些工作时，我实际上更喜欢逆多重二次方法。）所有这些实际上只是径向基函数方法（一种常见方案）的变体。 RBF 方法有其优点和缺点。它们通常会生成平滑插值，这当然取决于所选的具体基函数，以及您是否选择限制支持。 RBF 方法还允许您进行推断，至少可以扩展到径向基单元的支持范围内。如果允许基本元素的范围是无限的，则不会应用对外推法的明确约束。（外推通常是一件坏事。）RBF 方法的一个问题是它们需要求解大型线性方程组，而这些方程组通常是稠密矩阵。这意味着问题的规模（就您可以处理的数据点的数量而言）往往受到线性代数的限制。相反，如果通过截断基本元素来限制支持，则矩阵可能会变得稀疏。如果您使用稀疏矩阵包来求解，这将改进线性代数。同时，支撑距离成为您必须控制的非线性参数。同样，诸如多重二次函数和逆多重二次函数之类的方法可能具有控制基本元素形状的辅助非线性参数。克里金法也有类似的问题，我会将所有这些方法放在一起。

针对这些问题，我归类为 RBF 变体的所有这些方法通常都受到它们能够轻松处理的点数的限制。根据您处理事物的方式和可用内存量，该限制通常可能约为几千点。

一般 RBF 方法类别的另一个问题是我所说的内推法。这是我多年前创建的一个新词，用于描述数据中相对较大的漏洞的插值。事实上，即使在数据中较小的漏洞上进行插值，也经常会出现问题。这些方法，因为它们在某种程度上是平滑的，可能会将不需要的极值（大峰或谷）引入插值表面。即使是一维插值，这也是一个常见问题，通常被视为三次样条或多项式插值的振铃伪影，并且在傅里叶级数插值中当然也会出现。在更高维度中的问题是要认识到它确实发生了，因为在三个以上的维度上绘制表面往往很困难。

如果您的点数超过该限制，或者这些振铃伪像不可接受，那么其他方法通常是更好的选择。如果您愿意使用线性插值，那么高维中最简单的解决方案就是从数据的曲面细分开始。因此，在三维空间中，将数据（通常是 delaunay 曲面细分）细分为四面体。这是相当有效的，并且可以找到许多用于此目的的工具。那么插入任何单个点就是一个简单的问题。只需识别该点位于哪个单纯形，计算重心坐标作为单纯形内的插值权重，并在找到的单纯形的每个顶点处形成函数值的相应线性组合。这一切都非常快速和高效。

这些基于曲面细分的方法的缺点是它们通常将您限制在数据点的凸包上，而且更糟糕的是，如果您的数据碰巧位于非凸域中，那么插值很可能会在某些区域中做出奇怪的事情。你的域名。我上面提到的方案的另一个问题是插值只能是分段线性的，但是一旦进入更高的维度，事情就会变得非常快。还可以找到其他方法来基于曲面细分进行平滑插值，但它们需要付出更多努力，因此不太常见。

基本的权衡在这里应该是显而易见的。如果需要平滑插值并且只有几个点，那么通常会选择 RBF 方法。它们具有简单、易于使用等优点。实际选择的方法通常只是为了方便，甚至是习惯。如果我以前使用过一种工具并且很高兴，那么我可能会再次对它感到满意。由于问题是哪种方法“最适合实际使用”，我将指出，在脱离上下文应用时，“最佳”是一个非常主观的词。您在插值问题中的目标是什么？您拥有什么技能？您知道如何使用哪组工具？您将在什么环境中工作？所有这些因素都会影响您对最佳方法的选择。

如果您有很多数据点，并且速度至关重要，但最终平滑度并不那么重要，那么您通常会寻找单纯插值法。当然，如果你有足够的点数，那么野兽的分段线性性质就不那么重要了。这里的分段线性插值在某些情况下具有很大的优点，它永远不会在表面中生成数据中不存在的极值。对于某些问题，例如颜色表征，这是至关重要的。

另一个问题是噪音。虽然噪声的存在通常表明需要进行某种平滑处理，但并非所有此类表面都应用了平滑处理。任何平滑运算符有时也会平滑数据的重要特征。发生这种情况是因为我们可以将平滑算子视为低通滤波器。高频行为通常是噪音，但它也可能只是我表面上的一个尖锐的脚趾或肩膀，我不能失去它。如果这是一个问题，那么即使有时存在明显的噪声，您也可能愿意使用插值。在这种情况下，我建议最简单、最低阶的插值法是最好的。平滑、更全局的插值也会倾向于放大数据中的任何噪声，因此如果您在存在噪声的情况下寻找最低方差插值，它通常是线性插值。

当然，薄板样条有很多种，无论是插补的还是非插补的。一旦你超越了一个维度，你的选择也会扩大，至少如果你愿意做这项工作的话。

在它变成一本书之前我会在这里结束。

(This will get long, unless I just run out of steam.)

First, a few comments about non-scattered data. (See the answer that references littleCMS)

There are two types of color interpolation that is common. Some years ago, trilinear interpolation (tensor product linear interpolation) was a common approach for color table interpolation. Trilinear interpolation can indeed be implemented as a sequential set of one dimensional interpolations, first on one axis, then then along a second axis, etc.

Many years ago we all realized that trilinear interpolation introduces artifacts in color imaging, when applied to certain types of transformations. The problems are seen in neutrals. A solution is to move to a simplicial interpolant, in 3-d, by dissecting a cube into 6 tetrahedra. In n dimensions, the unit cube will be dissected into factorial(n) simplexes. There are other dissections of a cube, but this particular style assures that the main diagonal is always a shared edge for all the simplexes. This in turn restores good behavior for neutrals, when applied to certain color lookup tables.

Now let me get into the question of true scattered data interpolation.

Others have mentioned a variety of schemes. Kriging, multiquadrics, distance based methods are a few. (When I did some work in the past with these schemes, I actually preferred the inverse multiquadric methods.) All of these are really just variations of radial basis function methods, a common scheme. RBF methods have their good and bad points. They will usually generate a smooth interpolant, this of course depends on the specific basis function chosen, as well as whether you choose to limit the support. RBF methods also allow you to extrapolate, at least as far out as the support of the radial basis elements will extend. If the basis elements are allowed to be infinite in extent, then no explicit constraint on extrapolation will apply. (Extrapolation in general is a bad thing to do.) One problem with RBF methods is they require the solution of large systems of linear equations, and those equation systems are often dense matrices. This means the size of the problem, in terms of the number of data points you can handle tends to be limited by the linear algebra. If instead, you limit the support by truncating the basis elements, then the matrices can become sparse. This will improve the linear algebra if you use a sparse matrix package for the solution. At the same time, the support distance becomes a nonlinear parameter that you must control. As well, methods like multiquadrics and inverse multiquadric methods may have a secondary nonlinear parameter that controls the shape of the basis elements. Kriging has similar issues, and I'd lump all of these methods together.

Do to these issues, all of these methods that I've classed as RBF variants are often limited in the number of points they will comfortably handle. Depending upon how you deal with things and the amount of memory available, that limit might often be on the order of a few thousand points.

One other problem with the general class of RBF methods is what I'll call intrapolation. This is a neologism I created many years ago to describe interpolation across a relatively large hole in the data. In fact, there may often be problems even when interpolating across smaller holes in the data. These methods, because they are smooth to some extent, may introduce unwanted extrema (large peaks or valleys) into the interpolated surface. This is a common problem with even 1-d interpolants, often seen as ringing artifacts with cubic spline or polynomial interpolants, and certainly seen with Fourier series interpolants. The problem in higher dimensions is to even recognize that it has indeed happened, since plotting surfaces in more than three dimensions tends to be difficult.

If you have more points than that limit, or if these ringing artifacts are unacceptable, then other methods are often a better choice. If you are willing to use a linear interpolant, then the simplest solution in higher dimensions is to start with a tessellation of the data. Thus in 3 dimensions, tessellate the data (typically a delaunay tessellation) into tetrahedra. This is fairly efficient to do, and there are many tools to be found for this purpose. It is then a simple problem to interpolate any individual point. Merely identify which simplex the point lies in, compute barycentric coordinates as interpolation weights within the simplex, and form the corresponding linear combination of the function values at each vertex of the found simplex. This is all extremely fast and efficient.

A downside of these tessellation based methods is they typically restrict you to the convex hull of the data points, and as bad, if your data happens to lie in a non-convex domain, then the interpolant may well do strange things in some regions of your domain. Another problem with the scheme I mentioned above, is the interpolant will only be piecewise linear, but once you move into higher dimensions things get nasty fast. Other methods are to be found for smooth interpolation based on a tessellation, but they will take more effort and are therefore far less common.

The basic tradeoffs should be obvious here. If you need a smooth interpolant and only have a few points, then RBF methods are often chosen. They are simple, easy to use, etc. The actual method chosen is often just a matter of convenience, or even habit. I'f I've used one tool before and was happy, I'll probably be happy with it again. Since the question was which method is "best for practical use", I'll point out that best is a very subjective word when applied out of context. What are your goals in an interpolation problem? What skill set do you own? What set of tools do you know how to use? What environment will you work in? All of these factors will influence your choice of the best method.

If you have many data points, and speed is of the essence, but ultimate smoothness is not that important, then you will generally look for a simplicial interpolant. Of course, if you have sufficient points, then the piecewise linear nature of the beast is of less importance. The piecewise linear interpolant here has the great virtue in some cases that it can never generate extrema in your surface that did not exist in the data. For some problems, color characterization for example, this is of paramount importance.

One other issue is with noise. While the presence of noise is often a signal that smoothing of some sort is necessary, not all such surfaces have smoothing applied. Any smoothing operator will sometimes smooth out important features of the data too. This happens because we can think of a smoothing operator as a low pass filter. High frequency behavior is often noise, but it may also be just a sharp toe or shoulder in my surface that I cannot afford to lose. If this is a problem, then you may be willing to use an interpolant even in the presence of sometimes significant noise. In that event, I'll suggest that the simplest, lowest order interpolant is best. A smooth, more global interpolant will also tend to amplify any noise in the data, so if you look for the lowest variance interpolant in the presence of noise, it will generally be a linear interpolant.

Of course, there are many varieties of thin plate splines, interpolatory or not. Once you go beyond one dimension, your options also expand, at least if you are willing to do the work.

I'll end here before it turns into a book.

回复收藏 0 原文