Aggregating/extrapolating sparse samples in a multidimensional space

Posted 2024-11-18 02:51:14

Imagine there is a function which evaluates the elevation of a surface given a floating point x and y coordinate (and additional components in accordance with dimensionality):

double ComputeElevation(double x, double y, ...., double z) { }

This is not an analytic function, so its derivatives cannot be computed symbolically. What I need to do is find the direction in which the surface is steepest at any given {x, y} pair. A single evaluation can be very expensive (think seconds or even minutes in the worst case).

My typical approach in the 2D case would be to sample the surface in N locations adjacent to {x, y}, then fit a curve through those samples and search the curve for the highest point, as this search does not suffer from expensive evaluation:

[Image: Example of Sampling Algorithm in 2D]

In the above image P0 is the given coordinate. {S0, S1, S2, S3} are 4 random-ishly placed samples around P0 and PM is the highest point on the curve. Thus, the vector PM-P0 is the direction of steepest ascent.

But I have no idea how to scale this up to N dimensions, or whether there are much smarter algorithms for doing this.

The number of dimensions is potentially quite large (dozens to hundreds), so whatever method I end up using must work when there are fewer samples than there are dimensions. I'm not looking for the exact answer, that would be impossible, but a half-way decent approximation would already be most welcome.

P.S. I'm doing this in C#. Not that it matters a great deal, but I don't have access to non-C# language features.


Comments (2)

墟烟 2024-11-25 02:51:14

It looks like you are trying to estimate the gradient from a set of random samples near a given point.

Unfortunately, if you have n dimensions, you need a minimum of n+1 points to properly estimate the gradient. With fewer points, dimensions need to drop out, and you'll be estimating a lower dimensional projection of the gradient. That said, if you capture k dimensions, odds are that your projection will have about sqrt(k/n) of the length of the true gradient.
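The sqrt(k/n) figure can be motivated with a short derivation. This is only a sketch, and it assumes the k sampled directions span a subspace that is oriented uniformly at random relative to the true gradient; the symbols g, q_j and \hat{g} are introduced here for illustration and do not appear in the answer.

```latex
% g: true gradient in R^n;  q_1, ..., q_k: orthonormal basis of the sampled subspace
\hat{g} = \sum_{j=1}^{k} (g \cdot q_j)\, q_j
\qquad\Longrightarrow\qquad
\|\hat{g}\|^2 = \sum_{j=1}^{k} (g \cdot q_j)^2

% For a uniformly random unit vector q_j in R^n:
\mathbb{E}\!\left[(g \cdot q_j)^2\right] = \frac{\|g\|^2}{n}

% Summing over the k basis vectors:
\mathbb{E}\!\left[\|\hat{g}\|^2\right] = \frac{k}{n}\,\|g\|^2
\qquad\Longrightarrow\qquad
\|\hat{g}\| \approx \sqrt{\frac{k}{n}}\;\|g\|
```

So with, say, 25 samples in 100 dimensions, the projected gradient would typically have about half the length of the true one under this assumption.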

Here is one approach. Suppose that you have sampled k+1 random points around your point and further assume that they are linearly independent. Pick one of them as your "origin", and then the offsets from it to the other k points give you k dimensions. Find n-k more points whose offsets from the origin are orthogonal to all of the previous offsets, and assign them your origin's value. (That will cause those dimensions to not be represented in the gradient.)

Now you have n vectors and an estimate of the dot product of the gradient with each. Take each standard unit vector, and write it as a linear combination of your vectors. The same linear combination of your slopes will give you an estimate for that component of the gradient. Do this for all of your unit vectors, add them together, and voila, you have an estimate of your gradient.

Please note that if you are trying to use some near, and some far points, some of which are not linearly independent, then this approach won't work and you'll need to do something much more complicated.
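The approach in this answer might be sketched in C# roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes the function is close to linear near the origin point, and it uses Gram-Schmidt orthogonalization in place of explicitly writing each unit vector as a linear combination (the result is the same projected gradient, with all directions outside the sampled subspace left at zero). The names `SparseGradient`, `Estimate`, `f`, `p0` and `samples` are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class SparseGradient
{
    // Estimates the gradient of f at p0 from a handful of nearby samples.
    // The number of samples may be far smaller than the dimension n;
    // gradient components orthogonal to the sampled offsets stay at zero.
    public static double[] Estimate(
        Func<double[], double> f, double[] p0, IReadOnlyList<double[]> samples)
    {
        int n = p0.Length;
        double f0 = f(p0);                  // one expensive evaluation at the origin
        var basis = new List<double[]>();   // orthonormal directions q_j
        var coeffs = new List<double>();    // estimated gradient component along q_j

        foreach (double[] sample in samples)
        {
            // Offset from the origin and the observed change in elevation.
            var d = new double[n];
            for (int i = 0; i < n; i++) d[i] = sample[i] - p0[i];
            double originalLength = Math.Sqrt(Dot(d, d));
            double slope = f(sample) - f0;  // one expensive evaluation per sample

            // Gram-Schmidt: remove what earlier directions already explain,
            // both from the offset and (by assumed linearity) from the slope.
            for (int j = 0; j < basis.Count; j++)
            {
                double proj = Dot(d, basis[j]);
                for (int i = 0; i < n; i++) d[i] -= proj * basis[j][i];
                slope -= proj * coeffs[j];
            }

            double length = Math.Sqrt(Dot(d, d));
            if (length <= 1e-9 * originalLength) continue; // (nearly) dependent sample: skip

            for (int i = 0; i < n; i++) d[i] /= length;
            basis.Add(d);
            coeffs.Add(slope / length);
        }

        // Recombine: gradient estimate = sum of coeffs[j] * q_j.
        var gradient = new double[n];
        for (int j = 0; j < basis.Count; j++)
            for (int i = 0; i < n; i++) gradient[i] += coeffs[j] * basis[j][i];
        return gradient;
    }

    private static double Dot(double[] a, double[] b)
    {
        double sum = 0.0;
        for (int i = 0; i < a.Length; i++) sum += a[i] * b[i];
        return sum;
    }
}
```

The returned vector points in the estimated direction of steepest ascent within the sampled subspace. Samples that are (nearly) linearly dependent on earlier ones are skipped rather than handled, which matches the caveat above; mixing near and far points would still need something more elaborate.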

无戏配角 2024-11-25 02:51:14

I'm not totally clear why computing the curve is cheaper than sampling points at random, but this reminds me of http://en.wikipedia.org/wiki/Gradient_descent. You can think of your problem as trying to optimize the difference in elevation between the current location and a new point. It may be faster than trying random points, and it's really easy to generalize to multiple dimensions.

Since the function is probably not monotonically increasing, you might want to define the problem relative to a boundary (within x units of the starting point).
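One way to read the boundary idea is as a projected step: move along some ascent direction, then pull the candidate back so that it stays within a fixed radius of the starting point. The sketch below assumes a Euclidean ball as the boundary; `BoundedStep`, `Step`, `stepSize` and `radius` are hypothetical names, and how the direction is obtained is left open.

```csharp
using System;

public static class BoundedStep
{
    // Moves from `current` along `direction` by `stepSize`, then projects the
    // result back onto the ball of the given radius centered at `start`.
    public static double[] Step(
        double[] start, double[] current, double[] direction,
        double stepSize, double radius)
    {
        int n = start.Length;
        var candidate = new double[n];
        double distSq = 0.0;
        for (int i = 0; i < n; i++)
        {
            candidate[i] = current[i] + stepSize * direction[i];
            double offset = candidate[i] - start[i];
            distSq += offset * offset;
        }

        double dist = Math.Sqrt(distSq);
        if (dist > radius)
        {
            // Outside the boundary: scale the offset from `start` down to `radius`.
            double scale = radius / dist;
            for (int i = 0; i < n; i++)
                candidate[i] = start[i] + (candidate[i] - start[i]) * scale;
        }
        return candidate;
    }
}
```

Each candidate returned this way still costs one expensive elevation evaluation to check, so the step size and radius would need to be chosen with that budget in mind.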
