Need a pointer in the right direction for calculating Pearson correlation in Java
I am trying to calculate Pearson's correlation between 13 variables in a tab-delimited text file where each column is a variable. I am using Java and was hoping that somebody could give me some guidance as to which libraries or functions I should be using. I am guessing I will first need to read the contents of the file, but I can't figure out how to make the program treat each column like an array, which would then enable me to do the required calculations. I would have thought the java.io package would be the best place to start, but I can't figure out which classes I could use for my problem. I have also looked at http://commons.apache.org/math/ which has a function for measuring Pearson's correlation, but that would be too easy: this is a Uni assignment, so I have to implement it from scratch. Looking at the Apache Pearson's correlation, they seem to have approached the problem as a matrix where each column of the matrix is a variable.
Sorry for the lengthy description of my problem. If you know of any websites, good keywords to search for, or any other information, I would greatly appreciate it. Thanks, Arlind.
Comments (1)
You should be able to do this using just the standard Java Math, String, and file I/O libraries, and a few arrays and loops!
Read this first to learn how to read in the file.
http://www.roseindia.net/java/beginners/java-read-file-line-by-line.shtml
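Inside the loop, parse each line of the file using the String.split(String regex) method. Since your file is tab-delimited, that would be strLine.split("\t"); for a comma-separated file it would be strLine.split(",").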
Convert the result to an array of doubles by calling Double.parseDouble on each String in the String[].
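As a rough illustration of the reading and parsing steps, here is a minimal sketch. It assumes the file is tab-delimited (as the question describes), has no header row, and contains only numeric fields. The class name ColumnReader, the method readColumns, and the file name mentioned in the comment are made up for this example. It transposes the rows so that each variable ends up in its own array, which is the "each column is like an array" layout the question asks about.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of any library.
class ColumnReader {

    // Reads tab-delimited numeric rows and returns them column-major:
    // result[c][r] is the value of variable c in row r.
    static double[][] readColumns(Reader source) {
        List<double[]> rows = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                String[] fields = line.split("\t");
                double[] row = new double[fields.length];
                for (int i = 0; i < fields.length; i++) {
                    row[i] = Double.parseDouble(fields[i].trim());
                }
                rows.add(row);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (rows.isEmpty()) {
            return new double[0][0];
        }
        // Transpose: one array per column (variable).
        int nCols = rows.get(0).length;
        double[][] columns = new double[nCols][rows.size()];
        for (int r = 0; r < rows.size(); r++) {
            double[] row = rows.get(r);
            if (row.length != nCols) {
                throw new IllegalArgumentException(
                    "Row " + (r + 1) + " has " + row.length
                    + " fields, expected " + nCols);
            }
            for (int c = 0; c < nCols; c++) {
                columns[c][r] = row[c];
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        // For a real file, pass new java.io.FileReader("data.txt")
        // instead (hypothetical file name).
        double[][] cols = readColumns(new StringReader("1.0\t2.0\n3.0\t4.0\n"));
        System.out.println(cols.length + " columns, " + cols[0].length + " rows");
    }
}
```

Taking a Reader rather than a file name keeps the parsing logic easy to try out on a small in-memory string before pointing it at the real data.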
From there you can use the Math.sqrt(double a) and Math.pow(double a, double b) functions along with some simple loops to calculate the correlation for each pair of variables.
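For reference, the sample Pearson correlation of two equal-length columns x and y is r = sum((x_i - mean_x) * (y_i - mean_y)) / sqrt(sum((x_i - mean_x)^2) * sum((y_i - mean_y)^2)). The sketch below is one possible from-scratch version of that formula, not the Apache Commons Math implementation. The class and method names are invented for this example, and it uses plain multiplication where Math.pow(d, 2) would work equally well. It assumes each variable is already available as a double[] of the same length.

```java
// Hypothetical class for illustration; names are not from any library.
class PearsonSketch {

    // Pearson correlation of two equal-length columns.
    // Returns NaN if either column is constant (zero variance).
    static double correlation(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException(
                "Columns must have the same length and at least 2 values");
        }
        int n = x.length;

        // First pass: means of both columns.
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        // Second pass: sums of products of deviations from the means.
        double sxy = 0.0;
        double sxx = 0.0;
        double syy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx; // same as Math.pow(dx, 2)
            syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    // Correlation for every pair of columns; columns[c] is variable c.
    // The result is symmetric, so each pair is computed only once.
    static double[][] correlationMatrix(double[][] columns) {
        int k = columns.length;
        double[][] r = new double[k][k];
        for (int i = 0; i < k; i++) {
            r[i][i] = 1.0; // by convention, a variable with itself
            for (int j = i + 1; j < k; j++) {
                double rij = correlation(columns[i], columns[j]);
                r[i][j] = rij;
                r[j][i] = rij;
            }
        }
        return r;
    }

    public static void main(String[] args) {
        double[][] columns = {
            {1, 2, 3, 4},
            {2, 4, 6, 8},
            {4, 3, 2, 1}
        };
        double[][] r = correlationMatrix(columns);
        for (double[] row : r) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```

With 13 variables this computes 13 * 12 / 2 = 78 distinct pairs. The two-pass approach (means first, then deviations) is slower than a single-pass formula but tends to be less sensitive to rounding error.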
Hopefully that's enough info to get you started. Feel free to post back if you want more help!