How to speed up session.add with multiple relationships
The data in my source code has the format below: a list of dicts, each containing a list of dicts, which in turn contain a list of dicts, and so on.
# data structure
user_list = [{'user_name': 'A',
              'email': '[email protected]',
              'items': [{'name': 'a_item1', 'properties': [{1....}, {2....}...]}]
              }] * 100
I'm trying to insert the above data into a PostgreSQL database with SQLAlchemy.
There is a user table, an item table, and a properties table.
There are also link tables that connect users to items and items to properties.
for u in user_list:
    new_user = User(name=u.get('user_name'),....)
    session.add(new_user)
    session.flush()  # flush so that new_user.id is populated
    for item in u.get('items'):
        new_item = Item(name=item.get('name'),.....)
        session.add(new_item)
        session.flush()  # flush so that new_item.id is populated
        new_item_link = UserItemLink(user_id=new_user.id, item_id=new_item.id,...)
        session.add(new_item_link)
        session.flush()
        for prop in item.get('properties'):
            new_properties = Properties(name=prop.get('name'),...)
            session.add(new_properties)
            session.flush()  # flush so that new_properties.id is populated
            new_prop_link = ItemPropLink(item_id=new_item.id, prop_id=new_properties.id,...)
            session.add(new_prop_link)
            session.flush()
session.commit()
My models look like this:
from sqlalchemy import Column, ForeignKey, Identity, Integer, String
from sqlalchemy.orm import relationship

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, Identity(always=True, start=1, increment=1, minvalue=1, maxvalue=2147483647, cycle=False, cache=1), primary_key=True)
    name = Column(String(20))
    email = Column(String(50))
    user_item_link = relationship('UserItemLink', back_populates='user')

class Item(Base):
    __tablename__ = 'item'
    id = Column(Integer, Identity(always=True, start=1, increment=1, minvalue=1, maxvalue=2147483647, cycle=False, cache=1), primary_key=True)
    name = Column(String(50))
    note = Column(String(50))
    user_item_link = relationship('UserItemLink', back_populates='item')

class Properties(Base):
    __tablename__ = 'properties'
    id = Column(Integer, Identity(always=True, start=1, increment=1, minvalue=1, maxvalue=2147483647, cycle=False, cache=1), primary_key=True)
    name = Column(String(50))
    value = Column(String(50))
    item_prop_link = relationship('ItemPropLink', back_populates='properties')

class UserItemLink(Base):
    __tablename__ = 'user_item_link'
    id = Column(Integer, Identity(always=True, start=1, increment=1, minvalue=1, maxvalue=2147483647, cycle=False, cache=1), primary_key=True)
    user_id = Column(ForeignKey('db.user.id'), nullable=False)
    item_id = Column(ForeignKey('db.item.id'), nullable=False)
The code above has been simplified to make it easier to follow.
Calling session.add() and session.flush() sequentially like this takes a long time.
Inserting the data for 100 users takes 8 seconds or more.
Please advise how to speed this up, on both the Python side and the SQLAlchemy side.
Comments (1)
As you have relationships configured on the models, you can compose complex objects using these relationships instead of relying on ids.
SQLAlchemy will then set the foreign keys automatically when the session is flushed at commit time, removing the need to flush manually after every add.
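The idea can be sketched roughly as follows. This is not the original answer's code, only a minimal illustration under several assumptions: SQLAlchemy 1.4 or later, an in-memory SQLite engine so the sketch runs without a PostgreSQL server, plain integer primary keys in place of the Identity(...) columns, no 'db' schema prefix on the foreign keys, and a hypothetical ItemPropLink model plus the user, item and properties relationships on the link tables, which the question references but does not show. The sample email and property values are made up.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String(20))
    email = Column(String(50))
    user_item_link = relationship('UserItemLink', back_populates='user')


class Item(Base):
    __tablename__ = 'item'
    id = Column(Integer, primary_key=True)
    name = Column(String(50))
    user_item_link = relationship('UserItemLink', back_populates='item')
    item_prop_link = relationship('ItemPropLink', back_populates='item')


class Properties(Base):
    __tablename__ = 'properties'
    id = Column(Integer, primary_key=True)
    name = Column(String(50))
    value = Column(String(50))
    item_prop_link = relationship('ItemPropLink', back_populates='properties')


class UserItemLink(Base):
    __tablename__ = 'user_item_link'
    id = Column(Integer, primary_key=True)
    user_id = Column(ForeignKey('user.id'), nullable=False)
    item_id = Column(ForeignKey('item.id'), nullable=False)
    # Assumed relationships; the question only shows the FK columns.
    user = relationship('User', back_populates='user_item_link')
    item = relationship('Item', back_populates='user_item_link')


class ItemPropLink(Base):  # hypothetical: not shown in the question
    __tablename__ = 'item_prop_link'
    id = Column(Integer, primary_key=True)
    item_id = Column(ForeignKey('item.id'), nullable=False)
    prop_id = Column(ForeignKey('properties.id'), nullable=False)
    item = relationship('Item', back_populates='item_prop_link')
    properties = relationship('Properties', back_populates='item_prop_link')


# Made-up sample data in the shape described in the question.
user_list = [
    {'user_name': 'A', 'email': 'a@example.com',
     'items': [{'name': 'a_item1',
                'properties': [{'name': 'p1', 'value': 'v1'},
                               {'name': 'p2', 'value': 'v2'}]}]}
] * 100

engine = create_engine('sqlite://')  # in-memory database, for the sketch only
Base.metadata.create_all(engine)

with Session(engine) as session:
    for u in user_list:
        new_user = User(name=u.get('user_name'), email=u.get('email'))
        for item in u.get('items', []):
            new_item = Item(name=item.get('name'))
            # Connect objects through relationships; no ids are needed yet.
            new_user.user_item_link.append(UserItemLink(item=new_item))
            for prop in item.get('properties', []):
                new_prop = Properties(name=prop.get('name'),
                                      value=prop.get('value'))
                new_item.item_prop_link.append(ItemPropLink(properties=new_prop))
        # Adding the user cascades to every object reachable from it.
        session.add(new_user)
    # A single flush happens here; foreign keys are filled in automatically.
    session.commit()

    user_count = session.query(User).count()
    item_count = session.query(Item).count()
    prop_count = session.query(Properties).count()
    user_link_count = session.query(UserItemLink).count()
    prop_link_count = session.query(ItemPropLink).count()
    orphan_links = session.query(UserItemLink).filter(
        UserItemLink.user_id.is_(None)).count()
```

Because nothing is flushed until the commit, the session sees all pending objects at once, which at least gives it the opportunity to group the INSERT statements per table rather than issuing one round trip per object. How much time this saves depends on the SQLAlchemy version and the database driver, so it is worth measuring against the original loop.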