
How can I determine whether a data set comes from a linear or a logarithmic function?

Posted 2024-12-01 12:15:11

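I have a set of data points and am curious whether they represent a linear function or a logarithmic function.

The data set is two-dimensional.

Suppose an ideal set of data points follows the function f(x) = x. If I plot the points, I can see that the relationship is linear.

Likewise, if the data points follow the function f(x) = log(x), I can see at a glance that it is logarithmic.

Having a program decide whether a data set is linear or logarithmic, on the other hand, is not trivial. How should I approach this?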


Answer by 作业与我同在, 2024-12-08 12:15:11

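One option is to run a linear regression on the data set to get a best-fit line. If the data is linear, the fit will be very good and the mean squared error should be low. Otherwise, the fit will be only mediocre and the error noticeably larger.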
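To make the first option concrete, here is a minimal Java sketch of a least-squares line fit that also reports the mean squared error. The class name `LineFit`, its fields, and the sample data are made up for illustration, and the code assumes both arrays have the same length and that x contains at least two distinct values.

```java
/** Minimal sketch: ordinary least-squares line fit plus its mean squared error. */
class LineFit {
    final double slope;
    final double intercept;
    final double mse;

    LineFit(double slope, double intercept, double mse) {
        this.slope = slope;
        this.intercept = intercept;
        this.mse = mse;
    }

    /** Fits y = slope * x + intercept (assumes equal lengths and at least two distinct x values). */
    static LineFit fit(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0, meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        // Sums of squared deviations and cross deviations around the means.
        double sxx = 0.0, sxy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            sxx += dx * dx;
            sxy += dx * (y[i] - meanY);
        }
        double slope = sxy / sxx;
        double intercept = meanY - slope * meanX;

        // Mean of the squared residuals of the fitted line.
        double sse = 0.0;
        for (int i = 0; i < n; i++) {
            double residual = y[i] - (slope * x[i] + intercept);
            sse += residual * residual;
        }
        return new LineFit(slope, intercept, sse / n);
    }

    public static void main(String[] args) {
        // Illustrative data only: one exactly linear set, one exactly logarithmic set.
        double[] x = {1, 2, 3, 4, 5, 6, 7, 8};
        double[] linear = new double[x.length];
        double[] logarithmic = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            linear[i] = 2.0 * x[i] + 1.0;
            logarithmic[i] = Math.log(x[i]);
        }
        System.out.println("MSE on linear data:      " + fit(x, linear).mse);
        System.out.println("MSE on logarithmic data: " + fit(x, logarithmic).mse);
    }
}
```

On noisy real data neither error will be exactly zero, so what counts as "low" depends on the scale of y and has to be judged relative to it.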

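Alternatively, you could transform the data set by converting each point (x₀, x₁, ..., x_n, y) to (x₀, x₁, ..., x_n, e^y). If the data was linear, it will now be exponential, and if the data was logarithmic, it will now be linear. Running a linear regression on the transformed data and computing the mean squared error will then give a low error for the logarithmic data and a staggeringly large error for the linear data, since the exponential function blows up extremely quickly.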
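The transformation test can be sketched in Java roughly as follows. Two things here are assumptions of this sketch rather than part of the answer: the names `LinearOrLog`, `normalizedMse`, and `classify` are hypothetical, and the error is divided by the variance of the response so that the fit on y and the fit on e^y are measured on a comparable scale. The e^y trick also only yields an exactly straight line when y is the natural logarithm of x plus a constant, and `Math.exp` overflows for large y.

```java
/** Minimal sketch of the e^y transformation test; all names here are illustrative. */
class LinearOrLog {

    /**
     * Mean squared error of the least-squares line through (x, y), divided by
     * the variance of y. Assumes equal lengths and that neither array is constant.
     */
    static double normalizedMse(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0, meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxx = 0.0, sxy = 0.0, syy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxx += dx * dx;
            sxy += dx * dy;
            syy += dy * dy;
        }
        double slope = sxy / sxx;
        double intercept = meanY - slope * meanX;

        double sse = 0.0;
        for (int i = 0; i < n; i++) {
            double residual = y[i] - (slope * x[i] + intercept);
            sse += residual * residual;
        }
        // (sse / n) / (syy / n): the fraction of the variance the line leaves unexplained.
        return sse / syy;
    }

    /** Returns "linear" if a line fits (x, y) at least as well as it fits (x, e^y), else "logarithmic". */
    static String classify(double[] x, double[] y) {
        double[] expY = new double[y.length];
        for (int i = 0; i < y.length; i++) {
            expY[i] = Math.exp(y[i]);
        }
        double errorRaw = normalizedMse(x, y);
        double errorTransformed = normalizedMse(x, expY);
        return errorRaw <= errorTransformed ? "linear" : "logarithmic";
    }

    public static void main(String[] args) {
        // Illustrative data only.
        double[] x = {1, 2, 3, 4, 5, 6, 7, 8};
        double[] linear = new double[x.length];
        double[] logarithmic = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            linear[i] = 2.0 * x[i] + 1.0;
            logarithmic[i] = Math.log(x[i]);
        }
        System.out.println(classify(x, linear));
        System.out.println(classify(x, logarithmic));
    }
}
```

Comparing the two errors against each other, rather than against a fixed threshold, avoids having to pick a cutoff, but it forces a choice between the two models even when neither fits well.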

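To actually implement the regression, one option is a least-squares regression. Besides the model, this has the added benefit of giving you a correlation coefficient, which can also be used to distinguish between the two kinds of data set.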
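As a rough illustration of the correlation coefficient mentioned above, the following Java sketch computes Pearson's r, which is the correlation coefficient that goes with a simple least-squares line. The class and method names are hypothetical, and the code assumes equal-length arrays in which neither x nor y is constant.

```java
/** Minimal sketch: Pearson correlation coefficient between two arrays. */
class Correlation {

    /** Returns r in [-1, 1]; assumes equal lengths and that neither array is constant. */
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0, meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxx = 0.0, sxy = 0.0, syy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxx += dx * dx;
            sxy += dx * dy;
            syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    public static void main(String[] args) {
        // Illustrative logarithmic data: compare r before and after the e^y transformation.
        double[] x = {1, 2, 3, 4, 5, 6, 7, 8};
        double[] y = new double[x.length];
        double[] expY = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            y[i] = Math.log(x[i]);
            expY[i] = Math.exp(y[i]);
        }
        System.out.println("r for (x, y):   " + pearson(x, y));
        System.out.println("r for (x, e^y): " + pearson(x, expY));
    }
}
```

A value of |r| close to 1 indicates that a straight line describes the data well, so the version of the data (raw or transformed) with the larger |r| is the more linear one.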

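Because you asked how to do this in Java: a quick Google search turns up Java code for performing a linear regression. However, you might be better served by a language like Matlab, which is specifically optimized for this sort of computation. For example, in Matlab you can do this regression in one line of code, using the backslash operator to obtain the least-squares solution:

linearFunction = inputs \ outputs

Hope this helps!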
