Determining whether a set of data comes from a linear or a logarithmic function?

Posted 2024-12-01 12:15:11

I have a set of data points and am curious if the data represents a linear function or a logarithmic function.

The data set is two-dimensional.

Let's say an ideal set of data points followed the function f(x) = x. If I plotted the data points, I would be able to tell that the data is linear.

Similarly, if the data points followed the function f(x) = log(x), I would be able to tell visually that the data is logarithmic.

On the other hand, having the program determine if a set of data is linear or logarithmic is nontrivial. How would I approach this?


Comments (1)

作业与我同在 2024-12-08 12:15:11

One option would be to run a linear regression on the data set to get a best-fit line. If the data is linear, you'll get a very good fit and the mean squared error should be low. Otherwise, the line will fit only roughly and the error will be noticeably higher.
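The regression-and-error check described above can be sketched in Java roughly as follows. This is a minimal sketch, not the Java code the answer links to: the class `LineFit` and its methods `fit` and `meanSquaredError` are hypothetical names chosen for illustration, the data is made up, and only a single input variable x is assumed.

```java
// Hypothetical helper for illustration; not the Java code the answer links to.
final class LineFit {
    /** Least-squares fit of y = slope * x + intercept; returns {slope, intercept}. */
    static double[] fit(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0, meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0.0, sxx = 0.0;
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        double slope = sxy / sxx; // assumes the x values are not all identical
        return new double[] { slope, meanY - slope * meanX };
    }

    /** Mean squared error of the line {slope, intercept} on the data. */
    static double meanSquaredError(double[] x, double[] y, double[] line) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            double residual = y[i] - (line[0] * x[i] + line[1]);
            sum += residual * residual;
        }
        return sum / x.length;
    }

    public static void main(String[] args) {
        double[] x = { 1, 2, 3, 4, 5 };
        double[] linearY = { 3, 5, 7, 9, 11 }; // made-up ideal data, y = 2x + 1
        double[] logY = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            logY[i] = Math.log(x[i]);          // made-up ideal data, y = ln(x)
        }
        System.out.println("MSE, linear data: " + meanSquaredError(x, linearY, fit(x, linearY)));
        System.out.println("MSE, log data:    " + meanSquaredError(x, logY, fit(x, logY)));
    }
}
```

What counts as a "low" error depends on the scale of the y values, so any cutoff would have to be chosen for the data at hand.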

Alternatively, you could consider transforming the data set by converting each point (x0, x1, ..., xn, y) to (x0, x1, ..., xn, e^y). If the data was linear, it will now be exponential, and if the data was logarithmic, it will now be linear. Running a linear regression on the transformed data and computing the mean squared error will then give a low error for logarithmic data and a staggeringly huge error for linear data, since the exponential function blows up extremely quickly.
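The transformation described above might look like this in Java. It is only a sketch under stated assumptions: `ExpTransformCheck`, `expOf`, and `lineMse` are hypothetical names, the samples are made-up ideal data, and a single input variable x is assumed. Note that `Math.exp` returns `Infinity` once y exceeds roughly 709, so large y values would need rescaling first.

```java
// Hypothetical helper for illustration only.
final class ExpTransformCheck {
    /** Returns a new array holding e^y for every y value. */
    static double[] expOf(double[] y) {
        double[] out = new double[y.length];
        for (int i = 0; i < y.length; i++) {
            out[i] = Math.exp(y[i]);
        }
        return out;
    }

    /** Fits a least-squares line to (x, y) and returns its mean squared error. */
    static double lineMse(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0, meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0.0, sxx = 0.0;
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        double slope = sxy / sxx; // assumes the x values are not all identical
        double intercept = meanY - slope * meanX;

        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            double residual = y[i] - (slope * x[i] + intercept);
            sum += residual * residual;
        }
        return sum / n;
    }

    public static void main(String[] args) {
        double[] x = new double[10];
        double[] linearY = new double[10];
        double[] logY = new double[10];
        for (int i = 0; i < 10; i++) {
            x[i] = i + 1;
            linearY[i] = x[i];        // made-up ideal data, y = x
            logY[i] = Math.log(x[i]); // made-up ideal data, y = ln(x)
        }
        System.out.println("log data after transform:    " + lineMse(x, expOf(logY)));
        System.out.println("linear data after transform: " + lineMse(x, expOf(linearY)));
    }
}
```

One caveat: e^y turns y = ln(x) + c into a straight line, but a scaled logarithm such as y = a ln(x) + b becomes a power of x instead. Regressing y on ln(x) would be a more general check in that case.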

To actually implement the regression, one option would be to use a least-squares regression. This would have the added benefit of giving you a correlation coefficient in addition to the model, which could also be used to distinguish between the two kinds of data.
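As a rough illustration of using the correlation coefficient this way, the sketch below computes Pearson's r for the raw data and for the e^y-transformed data and picks whichever is closer to a straight line. The class `CorrelationCheck`, its methods, and the decision rule in `classify` are assumptions made for this example, not something the answer specifies.

```java
// Hypothetical helper for illustration only.
final class CorrelationCheck {
    /** Pearson correlation coefficient between x and y. */
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0, meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0.0, sxx = 0.0, syy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy); // assumes neither x nor y is constant
    }

    /**
     * Assumed decision rule: compare |r| for (x, y) with |r| for (x, e^y)
     * and report whichever version of the data is closer to a line.
     */
    static String classify(double[] x, double[] y) {
        double[] expY = new double[y.length];
        for (int i = 0; i < y.length; i++) {
            expY[i] = Math.exp(y[i]);
        }
        double rawR = Math.abs(pearson(x, y));
        double transformedR = Math.abs(pearson(x, expY));
        return rawR >= transformedR ? "linear" : "logarithmic";
    }

    public static void main(String[] args) {
        double[] x = new double[10];
        double[] linearY = new double[10];
        double[] logY = new double[10];
        for (int i = 0; i < 10; i++) {
            x[i] = i + 1;
            linearY[i] = 3 * x[i] + 2; // made-up ideal data
            logY[i] = Math.log(x[i]);  // made-up ideal data
        }
        System.out.println(classify(x, linearY));
        System.out.println(classify(x, logY));
    }
}
```

Because r is scale-free, it avoids comparing errors measured on very different scales, which is why it is used here instead of the raw mean squared error.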

Because you've asked how to do this in Java, a quick Google search turned up this Java code to do a linear regression. However, you might be better served by a language like Matlab, which is designed specifically for these sorts of numerical computations. For example, in Matlab, you can do this regression in one line of code by writing

% Least-squares solution of inputs * linearFunction = outputs (one data point per row)
linearFunction = inputs \ outputs

Hope this helps!
