gnuplot: create a boxplot from raw data

Posted 2024-11-28 19:19:01

Can gnuplot create a boxplot from a raw data file? I know how to plot a boxplot from an already calculated median, quartiles, and so on (like this), but how do I do it from a raw data file?

Each line of the raw data file contains one test result.

Comments (3)

星軌x 2024-12-05 19:19:01

I just came across this myself: gnuplot 4.5 (currently the CVS development version) has this feature.

For now, that means you have to compile gnuplot yourself from the sources: http://gnuplot.sourceforge.net/development/index.html#DownloadCVS

Once you get that done, here's a demo file: http://gnuplot.sourceforge.net/demo_4.5/boxplot.html
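
As a rough illustration of what the boxplot style looks like in use, here is a minimal sketch. It assumes a gnuplot build that includes the boxplot style (the development version mentioned above, or a later release that ships it) and a hypothetical file results.dat holding one test result per line; the file name and the styling values are illustrative choices, not part of the original answer.

```gnuplot
# Minimal sketch: one box summarising all values in column 1.
# Assumes a gnuplot build with the boxplot style and a hypothetical
# file results.dat with one test result per line.
set style fill solid 0.25 border -1
set style boxplot outliers pointtype 7
set boxwidth 0.5
set xrange [0:2]
unset xtics

# The first using entry is the x position of the box,
# the second is the column holding the raw values.
plot "results.dat" using (1):1 with boxplot notitle
```

gnuplot computes the median, the quartiles, and the whisker limits itself, so no preprocessing of the raw file should be needed.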

雨巷深深 2024-12-05 19:19:01

I think you will end up having to use an external program to calculate the data needed for the box plot. I've used awk here, but any program can be used in its place. Note that I've calculated the opening/closing/minimum/maximum values of each line of raw data, rather than the mean and quantiles.

set xrange [-1:9]
plot "< awk '{sum=0; opening=$1; closing=$NF; min=$1; max=$1; \
              for (i=1; i<=NF; i++) {sum=sum+$i; if ($i<min) min=$i; if ($i>max) max=$i}; \
              print sum/NF, opening, closing, min, max}' \
        junk.dat" us 0:2:4:5:3 w candle notitle

With the following data in the file junk.dat:

   5.532    5.040    4.962   19.314    5.136
  10.004    4.592    5.836    6.999    7.823
   8.887    6.335    5.545    5.056    6.216
   4.341    4.552    4.512    4.009    5.811
   4.724    4.869    5.016    2.593    5.662
   4.555    5.472    4.866    5.559   -0.608
   6.974    3.838    2.953    6.630    2.753
   5.571    8.112    3.261    7.029    4.375
   3.497    5.200    6.555    5.311    8.204

Here's the plot you will get:

[Image: the resulting candlestick plot]
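
For the layout described in the question (one result per line), the same candlestick approach can be sketched without an external program, assuming a gnuplot version that provides the stats command (4.6 or later, as far as I know). The file names results.dat and summary.dat are hypothetical, and the whiskers here are simply the minimum and maximum, which is a simplification rather than a standard outlier rule.

```gnuplot
# Sketch: compute the five-number summary with stats, then draw it
# with candlesticks. Assumes gnuplot 4.6+ and a hypothetical file
# results.dat with one test result per line.
stats "results.dat" using 1 nooutput

# Write the summary to a small helper file (the name is arbitrary).
set print "summary.dat"
print 1, STATS_lo_quartile, STATS_min, STATS_max, STATS_up_quartile, STATS_median
set print

set xrange [0:2]
set boxwidth 0.3
# candlesticks expects x : box_min : whisker_min : whisker_max : box_max
plot "summary.dat" using 1:2:3:4:5 with candlesticks whiskerbars notitle, \
     ""            using 1:6:6:6:6 with candlesticks lt -1 notitle
```

The second plot entry draws the median as a zero-height candlestick, which shows up as a horizontal line inside the box.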

流年已逝 2024-12-05 19:19:01

If I understand your question correctly, and you are looking for a way to calculate the mean value, you could do something like this:

calc_mean(x1,x2,x3) = (x1+x2+x3)/3
calc_sum(x1,x2,x3)  = x1+x2+x3
get_min(x1,x2,x3)   = x1 < x2 ? (x1 < x3 ? x1 : (x2 < x3 ? x2 : x3)) : (x2 < x3 ? x2 : x3)
get_max(x1,x2,x3)   = x1 > x2 ? (x1 > x3 ? x1 : (x2 > x3 ? x2 : x3)) : (x2 > x3 ? x2 : x3)

plot "Data.csv" u 0:(calc_mean($1, $2, $3)) t "Mean" w l, \
         "" u 0:(calc_sum($1, $2, $3)) t "Sum" w l, \
         "" u 0:(get_min($1, $2, $3)) t "Min" w l, \
         "" u 0:(get_max($1, $2, $3)) t "Max" w l

The script above calculates the mean, the sum, the minimum and the maximum value of a data line. The 0 in the using directive simply takes the index of the data line as the x-coordinate value.

With the following Data.csv:

0.62614   0.50293   0.62078
0.63789   0.58924   0.71288
0.16297   0.77453   0.82417
0.20703   0.22424   0.33596
0.57829   0.96545   0.60737

You will get the following plot:

[Image: plot of the script above]

I hope this is what you were looking for.
