参照完整性 - 如何配置 SQLAlchemy?
有人可以帮助我了解配置 SQLAlchemy 模型时的总体情况,以便在各个级别确保引用完整性吗?
按照引用完整性应该由数据库表达和强制执行的想法,我创建了一个模式(当前在 Postgresql 中),其中包含我认为需要的所有约束,从而为我提供了一个我相信将强制执行引用完整性的数据库。
然后,我开始在该数据库之上构建一个应用程序,在声明模式下使用 SQLAlchemy (0.7)。
经过一番搜索和阅读后,我了解到我可以
- 在 Column() 定义上配置: onupdate/ondelete 规则。
- 我的关系()定义上的级联选项,
并且这些似乎在 SQLAlchemy 中的会话级别上运行。 - 我的关系()定义的passive_deletes和passive_updates选项。
所有这些选项都有默认值。
但我很困惑,我实际上需要对 SQLAlchemy 模型做多少事情,以确保 SQLAlchemy 在会话期间不会与数据库及其约束不同步。
如果我在 SQLAlchemy 中的 Columns() 定义上配置“onupdate”等,我到底能实现什么?
对于级联和passive_delete/passive_update规则,我可以在relationship()上配置。我在这里需要什么,为什么?
或者重新表述我的问题:SQLAlchemy 在多大程度上会了解数据库模式中配置的约束,以及我必须在多大程度上(以及如何)在我的模型中重复它们?
还有什么我应该注意的吗? :)
Can someone help me get the big picture when it comes to configuring SQLAlchemy models, so referential integrity is ensured at all levels?
Going by the idea that referential integrity should be expressed and enforced by the DB, I have created a schema (currently in Postgresql) with all the constraints I think I need, thus giving me a database I trust will enforce referential integrity.
I then begin to build an app on top of this DB, using SQLAlchemy (0.7) in declarative mode.
Having searched and read a bit, I have learned that I can configure:
- onupdate/ondelete rules on my Column() definitions.
- cascade options on my relationship() definitions,
and that these seem to operate on the session level in SQLAlchemy. - passive_deletes and passive_updates options for my relationship() definitions.
And that all these options have defaults.
But I am left confused as to how much I actually need to do with my SQLAlchemy models, to make sure SQLAlchemy doesn't get out of sync with the DB and its constraints during a session.
What exactly am I achieving, if I configure 'onupdate' etc. on my Columns() definitions in SQLAlchemy?
And for the cascade and passive_delete/passive_update rules I can configure on a relationship(). What do I need here, and why?
Or to rephrase my question: To what extend will SQLAlchemy be aware of the constraints configured in the DB schema, and to what extend (and how) do I have to repeat them in my models?
And are the anything else I should be aware of? :)
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
SQLAlchemy 从根本上不需要了解有关数据库所具有的约束的任何信息。如果您的数据库具有您想要配置的约束,那么您基本上就完成了 - 数据库不允许您的应用程序执行任何不应该执行的操作。
SQLAlchemy 的一个关键主题是它实际上只做您告诉它的事情。因此,如果您尝试持久保存一个对象 SubWidget(),该对象在数据库中需要引用父 Widget(),在 SQLAlchemy 刷新数据(即发出 INSERT 语句)时,该操作将失败数据库发出约束违规,事务将回滚。
因此,假设“subwidget”上的 FK 引用“widget”,您的应用程序需要确保数据的结构正确。有两种方法可以做到这一点;一是您手动维护那些包含外键引用的列,并确保它们在 INSERT 或 UPDATE 时具有适当的值。另一个是您将使用
relationship()
来管理外键属性,并且您将确保 SubWidget() 对象的创建伴随着将其与您单独创建和/或获取的父 Widget() 对象。关于级联,在适用的外键上使用 ON DELETE CASCADE 是一个好主意,尽管不是必需的。在 SQLAlchemy 方面,当使用
relationship()
时,您通常希望向 ORM 提供一个提示,即数据库将通过 Passive_deletes 标志进行级联删除 ( http://www.sqlalchemy.org/docs/orm/collections.html?highlight=passive_deletes#using-passive-deletes ),但这通常是一种性能增强;否则,SQLAlchemy 确保将relationship()
的依赖端表示的所有对象加载到内存中并进行适当处理,这意味着将外键属性设置为 NULL(默认值),或者将依赖对象标记为删除(通过将“cascade”设置为“all, delete-orphan”来实现,请参阅 http://www.sqlalchemy.org/docs/orm/session.html#cascades)。ON UPDATE 级联不太常见,因为自然主键如今已不常见,因为它们实际上的性能不如普通整数主键,并且在其他方面也可能很麻烦。 SQLAlchemy 也支持这些,但是它们通常会自行处理,因为 SQLA 默认情况下假设发生 PK 突变时会进行级联更新,请参阅 http://www.sqlalchemy.org/docs/orm/relationships.html#mutable-primary-keys-update-cascades 了解详细说明。
也许通过一些实验就可以更轻松地理解,基本思想是 SQLAlchemy 只发出您告诉它的 SQL,尽管它的许多 SQL 行为一旦预先配置就会自动执行。
relationship()
应该配置详细信息,说明您希望它根据数据库中存在的约束保留、修改或删除数据时的行为方式。SQLAlchemy doesn't fundamentally need to know anything about the constraints your database has. If your database has the constraints you want configured, you're essentially done - your application is disallowed by the database from doing anything it's not supposed to.
A key theme of SQLAlchemy is that it is really only doing what you tell it to. So if you were to try persisting an object, SubWidget(), which in the database needs to have a reference to a parent Widget(), at the point at which SQLAlchemy flushes the data (i.e. emits INSERT statements), the operation would fail with a constraint violation, emitted by the database, and the transaction is rolled back.
So assuming an FK on "subwidget" referencing "widget", your application needs to ensure the data is in the correct structure. There are two ways to do this; one is that you hand-maintain those columns which contain foreign key references, and ensure they have the appropriate value at the point of INSERT or UPDATE. The other is that you'd use
relationship()
to manage the foreign key attribute, and you'd instead ensure that the creation of a SubWidget() object is accompanied by the operation of associating it with a parent Widget() object which you've created and/or acquired separately.Regarding cascades, it's a great idea, though not required, to have ON DELETE CASCADE on those foreign keys where it applies. On the SQLAlchemy side, when using
relationship()
you usually want to give the ORM a hint that the database will cascade deletes via the passive_deletes flag ( http://www.sqlalchemy.org/docs/orm/collections.html?highlight=passive_deletes#using-passive-deletes ), however this is usually a performance enhancement; SQLAlchemy otherwise ensures that all objects represented on the dependent side of therelationship()
are loaded into memory, and handled appropriately, which means one of either setting the foreign key attribute to NULL (the default), or marking the dependent object for deletion (which occurs by setting "cascade" to "all, delete-orphan", see http://www.sqlalchemy.org/docs/orm/session.html#cascades).ON UPDATE cascade is less common as natural primary keys are not a common practice these days, since they don't actually perform as well as plain integer primary keys and can also be cumbersome in other ways. SQLAlchemy does support these as well however and they will generally take care of themselves, as SQLA assumes by default that update cascades are in place when a PK mutation takes place, see http://www.sqlalchemy.org/docs/orm/relationships.html#mutable-primary-keys-update-cascades for a detailed description of that.
Perhaps this is all easier to grok with a little experimentation, the basic idea is that SQLAlchemy is only emitting the SQL you're telling it to, even though many of its SQL behaviors are automated once configured up front.
relationship()
should be configured with details on how you'd like it to behave when data is persisted, modified, or deleted with respect to the constraints present in the database.因此,建立在 zzzeeks 答案以及我自己在最初的问题之后的研究/修补的基础上......
为了使 SQLAlchemy 主动防止数据库状态的会话中视图可能偏离数据库允许的内容当您刷新/提交时,您必须镜像 SQLAlchemy 模型中数据库模式中找到的所有约束。
这是通过以下形式的列定义完成的:
等等,可能包含 __table_args__,例如:
对于约束跨越多个列的情况。
而:
及其相关参数,如:
等等,是一个(重要的)便利功能,允许您以尽可能少的努力使用模型,但最终并不意味着阻止 违反引用完整性。
关系()功能不能表达数据库中的所有典型约束,而Column()(和__table_args__)功能可以。
另一方面,使用上面列出的一些参数(或使用最有意义的默认值)配置关系()将使 SQLAlchemy 自动执行任务,最终可以说是与引用完整性相关。否则通常必须用周围代码中的逻辑来表达。
在某些情况下,relationship() 的优化配置还会避免 SQLAlchemy 发出不必要的 SQL 语句。
希望这个总结有点准确......
So, building on zzzeeks answer, and my own studying/tinkering after my initial question...
To make SQLAlchemy actively prevent that an in-session view of the DB state can divert from what the DB will allow when you flush/commit, you must mirror all the constraints found in the DB schema in your SQLAlchemy models.
And this is done via the Column definitions in the form of:
and so on, with the possible inclusion of __table_args__, like:
For cases where the constraint spans multiple columns.
Whereas:
and its related arguments, like:
and so on, are a (important) convenience feature that allows you to work with your models with as little effort as possible, but ultimate aren't meant to be what prevents breaching referential integrity.
The relationship() feature can not express all the typical constraints in a database, while the Column() (and __table_args__) feature can.
On the other hand, configuring the relationship() with some of the arguments listed above (or going by default values where that makes most sense), will let SQLAlchemy perform tasks automatically, that ultimately can be said to be referential integerity related. And would otherwise normally have to be expressed by logic in the surrounding code.
Optimal configuration of relationship() will also in some cases avoid unnecessary SQL statements to be issued by SQLAlchemy.
Hope this summation is somewhat accurate...