SQLAlchemy 尝试访问错误数据库中的表

发布于 2025-01-01 22:51:56 字数 2123 浏览 3 评论 0原文

我们使用 CherryPy 和 SQLAlchemy 构建我们的 Web 应用程序，一切都很好，直到我们用 2 个并发用户进行测试 - 然后事情开始出错！对于网络应用程序来说不是很好，所以如果有人能对此有所启发，我将非常感激。

TL;DR

当两个用户同时使用我们的网站（但访问不同的数据库）时，大约 10% 的情况下，我们会收到以下错误：

ProgrammingError: (ProgrammingError) (1146, "Table 'test_one.other_child_entity' doesn't exist")

该表不存在于该数据库中，因此该错误是有道理的，但问题是 SQLAlchemy 不应该在该数据库中查找该表。

我在这里的示例中重现了该错误 https://gist.github.com/1729817

说明

我们正在开发一个非常动态的应用程序，并且基于 entity_name 模式。 href="http://www.sqlalchemy.org/trac/wiki/UsageRecipes/EntityName" rel="nofollow">http://www.sqlalchemy.org/trac/wiki/UsageRecipes/EntityName

我们'从那时起，我们就发展了这个想法，以便根据您登录的用户身份将实体存储在不同的数据库中。这是因为系统中的每个用户都有自己的数据库，并且可以创建自己的实体（表）。为此，我们为每个数据库扩展一个基本实体，然后为他们在数据库中创建的每个附加实体扩展该新实体。

当应用程序启动时，我们创建一个包含所有这些数据库的引擎、元数据、类和表的字典，并反映所有元数据。当用户登录时，他们可以访问一个。

当两个用户同时访问该站点时，就会出现问题，SQLAlchemy 最终会在错误的数据库中查找表。我想这与线程有关，但据我所知，我们在会话（CP 和 SQLA）、引擎、元数据、表和映射器方面遵循所有规则。

如果有人可以快速浏览一下我的示例（https://gist.github.com/1729817）并指出任何明显的问题，那就太好了。

更新

我可以通过更改代码以使用我自己的自定义会话路由器来解决问题，如下所示：

# Thank you zzzeek (http://techspot.zzzeek.org/2012/01/11/django-style-database-routers-in-sqlalchemy/)
class RoutingSession(Session):
    def get_bind(self, mapper = None, clause = None):
        return databases[cherrypy.session.get('database')]['engine']

然后：

Session = scoped_session(sessionmaker(autoflush = True, autocommit = False, class_ = RoutingSession))

因此只需对其进行硬编码以返回链接到会话中设置的数据库的引擎。这是个好消息，但现在我想知道为什么我的原始代码不起作用。要么我做错了，要么下面的代码不完全安全：

# Before each request (but after the session tool)
def before_request_body():
    if cherrypy.session.get('logged_in', None) is True:
        # Configure the DB session for this thread to point to the correct DB
        Session.configure(bind = databases[cherrypy.session.get('database')]['engine'])

我猜这里发生的绑定被另一个线程中的用户覆盖，这很奇怪，因为我认为 scoped_session 就是全部关于线程安全？

原文

We use CherryPy and SQLAlchemy to build our web app and everything was fine until we tested with 2 concurrent users - then things started to go wrong! Not very good for a web app so I'd be very appreciative if anyone could shine some light on this.

TL;DR

We're getting the following error about 10% of the time when two users are using our site (but accessing different databases) at the same time:

ProgrammingError: (ProgrammingError) (1146, "Table 'test_one.other_child_entity' doesn't exist")

This table is not present in that database so the error makes sense but the problem is that SQLAlchemy shouldn't be looking for the table in that database.

I have reproduced the error in an example here https://gist.github.com/1729817

Explanation

We're developing an application that is very dynamic and is based on the entity_name pattern found at http://www.sqlalchemy.org/trac/wiki/UsageRecipes/EntityName

We've since grown that idea so that it stores entities in different databases depending on what user you're logged in as. This is because each user in the system has their own database and can create their own entities (tables). To do this we extend a base entity for each database and then extend that new entity for each additional entity they create in their database.

When the app starts we create a dictionary containing the engine, metadata, classes and tables of all these databases and reflect all of the metadata. When a user logs in they get access to one.

When two users are accessing the site at the same time something is going wrong and SQLAlchemy ends up looking for tables in the wrong database. I guess this is to do with threading but as far as I can see we are following all the rules when it comes to sessions (CP and SQLA), engines, metadata, tables and mappers.

If anyone could give my example (https://gist.github.com/1729817) a quick glance over and point out any glaring problems that would be great.

Update

I can fix the problem by changing the code to use my own custom session router like so:

# Thank you zzzeek (http://techspot.zzzeek.org/2012/01/11/django-style-database-routers-in-sqlalchemy/)
class RoutingSession(Session):
    def get_bind(self, mapper = None, clause = None):
        return databases[cherrypy.session.get('database')]['engine']

And then:

Session = scoped_session(sessionmaker(autoflush = True, autocommit = False, class_ = RoutingSession))

So just hard-coding it to return the engine that's linked to the database that's set in the session. This is great news but now I want to know why my original code didn't work. Either I'm doing it wrong or the following code is not completely safe:

# Before each request (but after the session tool)
def before_request_body():
    if cherrypy.session.get('logged_in', None) is True:
        # Configure the DB session for this thread to point to the correct DB
        Session.configure(bind = databases[cherrypy.session.get('database')]['engine'])

I guess that the binding that's happening here was being overwritten by the user in the other thread which is strange because I thought scoped_session was all about thread-safety?

分享到QQ

分享到微博