绘制直方图:如何使用数据库中存储的数据从头开始绘制直方图?

发布于 2024-08-11 15:52:21 字数 349 浏览 1 评论 0原文

我有一些数据存储在数据库中,如下所示:

TableName:faults 表:

+------------+--------------+
| fault_type | total        |
+------------+--------------+
|    1       |            1 | 
|    2       |            3 | 
|    3       |            8 | 
|    4       |            2 | 
.............................

我该如何从该表开始获得直方图?

I have some data stored in a database like this:

TableName: faults
Table:

+------------+--------------+
| fault_type | total        |
+------------+--------------+
|    1       |            1 | 
|    2       |            3 | 
|    3       |            8 | 
|    4       |            2 | 
.............................

How am I supposed to get a histogram plot starting from this table?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

时光与爱终年不遇 2024-08-18 15:52:21

下面的解决方案假设您有 MySQL、Python 和 GNUPlot。如果需要的话,具体细节可以进行微调。发布它以便它可以成为其他同行的基准。

第 1 步:决定图表的类型。

如果它是某种频率图,那么一个简单的 SQL 查询就可以解决问题:

select total, count(total) from faults GROUP BY total;

如果您需要指定 bin 大小,则继续下一步步。

第 2 步:确保您能够使用 Python 连接到 MySQL。您可以使用 MySQLdb 导入来执行此操作。

之后,生成直方图数据的 python 代码如下(这是在 5 分钟内写完的,所以非常粗糙):

import MySQLdb

def DumpHistogramData(databaseHost, databaseName, databaseUsername, databasePassword, dataTableName, binsTableName, binSize, histogramDataFilename):
    #Open a file for writing into
    output = open("./" + histogramDataFilename, "w")

    #Connect to the database
    db = MySQLdb.connect(databaseHost, databaseUsername, databasePassword, databaseName)
    cursor = db.cursor()

    #Form the query
    sql = """select b.*, count(*) as total 
            FROM """ + binsTableName + """ b 
            LEFT OUTER JOIN """ + dataTableName + """ a 
            ON a.total between b.min AND b.max 
            group by b.min;"""
    cursor.execute(sql)

    #Get the result and print it into a file for further processing
    count = 0;
    while True:
        results = cursor.fetchmany(10000)
        if not results:
            break
        for result in results:
            #print >> output, str(result[0]) + "-" + str(result[1]) + "\t" + str(result[2])
    db.close()

def PrepareHistogramBins(databaseHost, databaseName, databaseUsername, databasePassword, binsTableName, maxValue, totalBins):

    #Connect to the database    
    db = MySQLdb.connect(databaseHost, databaseUsername, databasePassword, databaseName)
    cursor = db.cursor()

    #Check if the table was already created
    sql = """DROP TABLE IF EXISTS """ + binsTableName
    cursor.execute(sql)

    #Create the table
    sql = """CREATE TABLE """ + binsTableName + """(min int(11), max int(11));"""
    cursor.execute(sql)

    #Calculate the bin size
    binSize = maxValue/totalBins

    #Generate the bin sizes
    for i in range(0, maxValue, binSize):
        if i is 0:
            min = i
            max = i+binSize
        else:
            min = i+1
            max = i+binSize
        sql = """INSERT INTO """ + binsTableName + """(min, max) VALUES(""" + str(min) + """, """ + str(max) + """);"""
        cursor.execute(sql)
    db.close()
    return binSize

binSize = PrepareHistogramBins("localhost", "testing", "root", "", "bins", 5000, 100)
DumpHistogramData("localhost", "testing", "root", "", "faults", "bins", binSize, "histogram")

Step # 3:使用GNUPlot生成直方图。您可以使用以下脚本作为起点(生成 eps 图像文件):

set terminal postscript eps color lw 2 "Helvetica" 20
set output "output.eps"
set xlabel "XLABEL"
set ylabel "YLABEL"
set title "TITLE"
set style data histogram
set style histogram cluster gap 1
set style fill solid border -1
set boxwidth 0.9
set key autotitle columnheader
set xtics rotate by -45
plot "input" using 1:2 with linespoints ls 1

将上述脚本保存到任意文件中,例如 example.script。继续下一步。

步骤 #4:使用 gnuplot 和上述输入脚本生成 eps 文件

gnuplot sample.script

没什么复杂的,但我认为这段代码中的一些位可以重复使用。再说一次,就像我说的,它并不完美,但你可以完成工作:)

鸣谢:

  • Ofri Raviv(帮助我解决
    这篇文章中的 MySQL 查询:
    获取直方图数据

  • 我自己(用于编写 python 和
    gnuplot 脚本 :D)

The solution below assumes that you have MySQL, Python and GNUPlot. The specific details can be fine tuned if necessary. Posting it so that it could be a baseline for other peers.

Step #1: Decide the type of graph.

If it is a frequency plot of some kind, then a simple SQL query should do the trick:

select total, count(total) from faults GROUP BY total;

If you need to specify bin sizes, then proceed to the next step.

Step #2: Make sure you are able to connect to MySQL using Python. You can use the MySQLdb import to do this.

After that, the python code to generate data for a histogram plot is the following (this was written precisely in 5 minutes so it is very crude):

import MySQLdb

def DumpHistogramData(databaseHost, databaseName, databaseUsername, databasePassword, dataTableName, binsTableName, binSize, histogramDataFilename):
    #Open a file for writing into
    output = open("./" + histogramDataFilename, "w")

    #Connect to the database
    db = MySQLdb.connect(databaseHost, databaseUsername, databasePassword, databaseName)
    cursor = db.cursor()

    #Form the query
    sql = """select b.*, count(*) as total 
            FROM """ + binsTableName + """ b 
            LEFT OUTER JOIN """ + dataTableName + """ a 
            ON a.total between b.min AND b.max 
            group by b.min;"""
    cursor.execute(sql)

    #Get the result and print it into a file for further processing
    count = 0;
    while True:
        results = cursor.fetchmany(10000)
        if not results:
            break
        for result in results:
            #print >> output, str(result[0]) + "-" + str(result[1]) + "\t" + str(result[2])
    db.close()

def PrepareHistogramBins(databaseHost, databaseName, databaseUsername, databasePassword, binsTableName, maxValue, totalBins):

    #Connect to the database    
    db = MySQLdb.connect(databaseHost, databaseUsername, databasePassword, databaseName)
    cursor = db.cursor()

    #Check if the table was already created
    sql = """DROP TABLE IF EXISTS """ + binsTableName
    cursor.execute(sql)

    #Create the table
    sql = """CREATE TABLE """ + binsTableName + """(min int(11), max int(11));"""
    cursor.execute(sql)

    #Calculate the bin size
    binSize = maxValue/totalBins

    #Generate the bin sizes
    for i in range(0, maxValue, binSize):
        if i is 0:
            min = i
            max = i+binSize
        else:
            min = i+1
            max = i+binSize
        sql = """INSERT INTO """ + binsTableName + """(min, max) VALUES(""" + str(min) + """, """ + str(max) + """);"""
        cursor.execute(sql)
    db.close()
    return binSize

binSize = PrepareHistogramBins("localhost", "testing", "root", "", "bins", 5000, 100)
DumpHistogramData("localhost", "testing", "root", "", "faults", "bins", binSize, "histogram")

Step #3: Use GNUPlot to generate the histogram. You can use the following script as a starting point (generates an eps image file):

set terminal postscript eps color lw 2 "Helvetica" 20
set output "output.eps"
set xlabel "XLABEL"
set ylabel "YLABEL"
set title "TITLE"
set style data histogram
set style histogram cluster gap 1
set style fill solid border -1
set boxwidth 0.9
set key autotitle columnheader
set xtics rotate by -45
plot "input" using 1:2 with linespoints ls 1

Save the above script into some arbitrary file say, sample.script. Proceed to the next step.

Step #4: Use gnuplot with the above input script to generate an eps file

gnuplot sample.script

Nothing complicated but I figured a couple of bits from this code can be reused. Again, like I said, it is not perfect but you can get the job done :)

Credits:

风吹过旳痕迹 2024-08-18 15:52:21

这篇博客文章可能会对您有所帮助!它讨论了使用 gnuplot 进行统计并将结果绘制成直方图。

This blog article may help you! It talks about statistic using gnuplot and plot the result into histogram.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文