如何提高性能和速度

发布于 2024-08-14 06:46:59 字数 1181 浏览 3 评论 0原文

我编写了这个程序用于连接数据并将数据提取到文件中，但是该程序提取数据的速度非常慢。有什么方法可以提高性能以及更快地将数据加载到文件中。我的目标是大约 100,000 到数百万条记录，这就是我担心性能的原因，而且我是否可以像在 java 中那样使用数组获取大小和批处理大小。

import java.sql as sql
import java.lang as lang
def main():
    driver, url, user, passwd = ('oracle.jdbc.driver.OracleDriver','jdbc:oracle:thin:@localhost:1521:xe','odi_temp','odi_temp')
    ##### Register Driver
    lang.Class.forName(driver)
    ##### Create a Connection Object
    myCon = sql.DriverManager.getConnection(url, user, passwd)
    f = open('c:/test_porgram.txt', 'w')
    try:
        ##### Create a Statement
        myStmt = myCon.createStatement()
        ##### Run a Select Query and get a Result Set
        myRs = myStmt.executeQuery("select emp_id ,first_name,last_name,date_of_join from src_sales_12")
        ##### Loop over the Result Set and print the result in a file
        while (myRs.next()):
            print >> f , "%s,%s,%s,%s" %(myRs.getString("EMP_ID"),myRs.getString("FIRST_NAME"),myRs.getString("LAST_NAME"),myRs.getString("DATE_OF_JOIN") )
    finally:
        myCon.close()
        f.close()

### Entry Point of the program
if __name__ == '__main__':
    main()

原文

I have written this program for connecting and fetching the data into file, but this program is so slow in fetching . is there is any way to improve the performance and faster way to load the data into the file . iam targeting around 100,000 to million of records so thats why iam worried about performance and also can i use array fetch size and batch size as we can do in java.

import java.sql as sql
import java.lang as lang
def main():
    driver, url, user, passwd = ('oracle.jdbc.driver.OracleDriver','jdbc:oracle:thin:@localhost:1521:xe','odi_temp','odi_temp')
    ##### Register Driver
    lang.Class.forName(driver)
    ##### Create a Connection Object
    myCon = sql.DriverManager.getConnection(url, user, passwd)
    f = open('c:/test_porgram.txt', 'w')
    try:
        ##### Create a Statement
        myStmt = myCon.createStatement()
        ##### Run a Select Query and get a Result Set
        myRs = myStmt.executeQuery("select emp_id ,first_name,last_name,date_of_join from src_sales_12")
        ##### Loop over the Result Set and print the result in a file
        while (myRs.next()):
            print >> f , "%s,%s,%s,%s" %(myRs.getString("EMP_ID"),myRs.getString("FIRST_NAME"),myRs.getString("LAST_NAME"),myRs.getString("DATE_OF_JOIN") )
    finally:
        myCon.close()
        f.close()

### Entry Point of the program
if __name__ == '__main__':
    main()

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

妖妓 2024-08-21 06:46:59

除非您使用最好、最好的数据库和文件服务器设备，或者运行脚本的最差设备，否则此应用程序是 I/O 绑定的。选择从数据库返回后，数据的实际移动将比 Jython、Java 或此代码中的任何低效率问题更占主导地位。

你的CPU在这个过程中基本上是无意识的，你只是没有进行足够的数据转换。您可以编写一个比 I/O 慢的进程，但这不是其中之一。

您可以用 C 语言编写此代码，但我怀疑您不会看到实质性差异。

回复收藏 0 原文

墟烟 2024-08-21 06:46:59

难道不能使用 Oracle 命令行 SQL 客户端直接将该查询结果导出到 CSV 文件中吗？

回复收藏 0 原文

妥活 2024-08-21 06:46:59

您可以将 getString 与硬编码索引一起使用，而不是使用列名称（在 print 语句中），这样程序就不必一遍又一遍地查找名称。另外，我对 Jython/Python 文件输出了解不够，无法判断默认情况下是否启用此功能，但您应该尝试确保输出已缓冲。

编辑：

请求的代码（我不声明该代码的正确性）：

print >> f , "%s,%s,%s,%s" %(myRs.getString(0),myRs.getString(1),myRs.getString(2),myRs.getString(3) )

或

myRs = myStmt.executeQuery("select emp_id ,first_name,last_name,date_of_join from src_sales_12")
hasFirst = myRs.next()
if (hasFirst):
    empIdIdx = myRs.findColumn("EMP_ID")
    fNameIdx = myRs.findColumn("FIRST_NAME")
    lNameIdx = myRs.findColumn("LAST_NAME")
    dojIdx = myRs.findColumn("DATE_OF_JOIN")
    print >> f , "%s,%s,%s,%s" %(myRs.getString(empIdIdx),myRs.getString(fNameIdx),myRs.getString(lNameIdx),myRs.getString(dojIdx) )
    ##### Loop over the Result Set and print the result in a file
    while (myRs.next()):
        print >> f , "%s,%s,%s,%s" %(myRs.getString(empIdIdx),myRs.getString(fNameIdx),myRs.getString(lNameIdx),myRs.getString(dojIdx) )

You might use getString with hardcoded indices instead of the column name (in your print statement) so the program doesn't have to look up the names over and over. Also, I don't know enough about Jython/Python file output to say whether this is enabled by default or not, but you should try to make sure your output is buffered.

EDIT:

Code requested (I make no claims about the correctness of this code):

print >> f , "%s,%s,%s,%s" %(myRs.getString(0),myRs.getString(1),myRs.getString(2),myRs.getString(3) )

myRs = myStmt.executeQuery("select emp_id ,first_name,last_name,date_of_join from src_sales_12")
hasFirst = myRs.next()
if (hasFirst):
    empIdIdx = myRs.findColumn("EMP_ID")
    fNameIdx = myRs.findColumn("FIRST_NAME")
    lNameIdx = myRs.findColumn("LAST_NAME")
    dojIdx = myRs.findColumn("DATE_OF_JOIN")
    print >> f , "%s,%s,%s,%s" %(myRs.getString(empIdIdx),myRs.getString(fNameIdx),myRs.getString(lNameIdx),myRs.getString(dojIdx) )
    ##### Loop over the Result Set and print the result in a file
    while (myRs.next()):
        print >> f , "%s,%s,%s,%s" %(myRs.getString(empIdIdx),myRs.getString(fNameIdx),myRs.getString(lNameIdx),myRs.getString(dojIdx) )

回复收藏 0 原文