使用 Tornado 和 pyodbc 进行异步调用

发布于 2024-10-20 07:11:21 字数 1310 浏览 2 评论 0原文

我想实现一个基于tornado的Web服务，可以为用户提供数据库查询服务。我使用 pyodbc 模块连接到数据库并进行查询。在实践中，我发现打印查询结果需要很长时间。也就是说，如果我使用下面的代码来打印查询结果

while 1:
    data = cursor.fetchone()
    if not data: break
    self.write(data + '\n')
    self.flush()

，而sql命令是类似

select * from <a large dummy table>

tornado的东西，直到循环结束后才会打印查询结果。而且需要很长时间。

我想利用tornado的非阻塞异步功能，让其他用户也能够使用该Web服务，即使打印当前用户的查询请求的循环尚未完成。

所以我写了类似的内容：

@tornado.web.asynchronous
def get(self):
    try:
        cnxn = pyodbc.connect(self.server, self.driver, self.table, self.uid, self.pwd)
    except Exception, e:
        print e
        return

    try:
        self.cur = cnxn.execute(self.sql)
    except Exception, e:
        print e
        return

    self.wait_for_query(callback=self.async_callback(self.on_finish))

def wait_for_query(self, callback):
    while 1:
       data = self.cur.fetchone()
       if not data: break
       self.write(data)
       self.flush()
    callback()

def on_finish(self):
    self.finish()

我读了这篇文章：使用 Tornado 和 Prototype 进行异步 COMET 查询并且知道我的解决方案行不通。但我当然不能使用 add_timeout，因为我无法计算迭代将持续多长时间。那么我该如何解决这个问题来实现我的目标呢？

原文

I want to implement a web service based on tornado which can provide the database query service for users. I used the pyodbc module to connect to database and do the query. In practice, I found that printing the query result would take a long time. That is to say, if I used the following code to print the query result

while 1:
    data = cursor.fetchone()
    if not data: break
    self.write(data + '\n')
    self.flush()

and the sql command is something like

select * from <a large dummy table>

tornado would not print the query result until the loop is over. And it takes a long time.

I wanna make use of tornado's non-blocking asynchronous feature to make other users also be able to use the web service even though the loop for printing the current user's query request is not finished yet.

So I write something like:

@tornado.web.asynchronous
def get(self):
    try:
        cnxn = pyodbc.connect(self.server, self.driver, self.table, self.uid, self.pwd)
    except Exception, e:
        print e
        return

    try:
        self.cur = cnxn.execute(self.sql)
    except Exception, e:
        print e
        return

    self.wait_for_query(callback=self.async_callback(self.on_finish))

def wait_for_query(self, callback):
    while 1:
       data = self.cur.fetchone()
       if not data: break
       self.write(data)
       self.flush()
    callback()

def on_finish(self):
    self.finish()

I read this post:
Asynchronous COMET query with Tornado and Prototype
and knew my solution wouldn't work. But I certainly cannot use add_timeout, 'cause there is no way for me to figure out how long an iteration would last. So how can I work through this to achieve my goal?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

痴梦一场 2024-10-27 07:11:21

为了允许单线程 Tornado 服务器在这样的请求中异步，您必须将控制权交还给 I/O 循环。尝试一下：

class LongRequestHandler(tornado.web.RequestHandler):
    def database_callback(self):
        data = self.cur.fetchone()
        if not data:
            self.finish()
            self.cnxn.close()
        else:
            self.write(data)
            self.flush()
            tornado.ioloop.IOLoop.instance().add_callback(self.database_callback)

    @tornado.web.asynchronous
    def get(self):
        try:
            self.cnxn = pyodbc.connect(self.server, self.driver, self.table, self.uid, self.pwd)
        except Exception, e:
            print e
            return

        try:
            self.cur = self.cnxn.execute(self.sql)
        except Exception, e:
            print e
            return

        tornado.ioloop.IOLoop.instance().add_callback(self.database_callback)

但是值得注意的是，每个数据库提供商都是不同的。据我了解，对于MySQL，大部分时间/处理实际上将花费在execute()调用上，而不是循环数据，因为MySQL处理整个查询并返回完整的结果集。如果您使用的数据库提供程序执行相同的操作，则可能需要在 Tornado 后面的工作进程中处理类似的请求。

编辑我的示例只是一个示例。实际上，您需要测试您的回调，并可能在返回之前循环遍历相当多的行，否则您会浪费大量的 CPU 时间在 IO 循环上的函数之间切换，而不是实际处理请求。在做了一些测试之后，我对 MySQL 的担心是真的——执行/查询语句本身就是导致锁定的原因，所以这个解决方案在这种情况下确实没有帮助。

In order to allow the single-threaded Tornado server to be asynchronous in a request like this you have to give control back to the I/O loop. Try this:

class LongRequestHandler(tornado.web.RequestHandler):
    def database_callback(self):
        data = self.cur.fetchone()
        if not data:
            self.finish()
            self.cnxn.close()
        else:
            self.write(data)
            self.flush()
            tornado.ioloop.IOLoop.instance().add_callback(self.database_callback)

    @tornado.web.asynchronous
    def get(self):
        try:
            self.cnxn = pyodbc.connect(self.server, self.driver, self.table, self.uid, self.pwd)
        except Exception, e:
            print e
            return

        try:
            self.cur = self.cnxn.execute(self.sql)
        except Exception, e:
            print e
            return

        tornado.ioloop.IOLoop.instance().add_callback(self.database_callback)

It's worth noting, however, that each database provider is different. It's my understanding that with MySQL, most of the time/processing will actually be spent on the execute() call rather than looping through the data, because MySQL processes the whole query and returns a complete result set. If you're using a database provider that does the same, you may need to process requests like this in a worker process that sits behind Tornado.

Edit My example is an example only. In reality, you'd want to test your callback and possibly loop through quite a few rows before returning, otherwise you waste an awful lot of CPU time switching between functions on the IO loop rather than actually processing the request. After doing some testing what I feared regarding MySQL is true - the execute/query statement itself is the one causing the lock, so this solution really won't help in that circumstance.

回复收藏 0 原文

~没有更多了~