Can't pickle int objects error when object is from SQLAlchemy?
I am using YAML and SQLAlchemy. I defined my object, and I am able to use YAML to print it just fine. However, when I try to use YAML on an object returned from a SQLAlchemy query, it fails with the error "can't pickle int objects". I printed out the instance returned from SQLAlchemy, and it shows the correct type. I'll let the code do the talking:
# Imports and Base are assumed; the question does not show them.
import yaml
from sqlalchemy import Column, Integer, Text, VARBINARY, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()

class HashPointer(Base):
    __tablename__ = 'hash_pointers'
    id = Column(Integer, primary_key=True)
    hash_code = Column(VARBINARY(64), unique=True)
    file_pointer = Column(Text)

    def __init__(self, hash_code, file_pointer):
        self.hash_code = hash_code
        self.file_pointer = file_pointer

    def __repr__(self):
        return "<HashPointer('%s', '%s')>" % (self.hash_code, self.file_pointer)

Engine = create_engine("mysql://user:pass@localhost/db", echo=True)
Session = sessionmaker(bind=Engine)
session = Session()

fhash = HashPointer(0x661623708235, "c:\\test\\001.txt")

# PRINTS FINE
print(yaml.dump(fhash))

for instance in session.query(HashPointer).all():
    # PRINTS FINE AS __repr__
    print(instance)
    # THROWS ERROR, 'CAN'T PICKLE INT OBJECTS'
    print(yaml.dump(instance))
Comments (2)
Try adding the following to your class:
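One possibility, sketched here under the assumption that the goal is to tell the serializer exactly which values to keep: define __reduce__ so that it returns the class plus its constructor arguments. The class below is a plain stand-in without SQLAlchemy's Base, the _sa_instance_state value is a hypothetical placeholder, and copy.copy is used only because it drives the same reduce protocol without needing PyYAML or a database.

```python
import copy


class HashPointer:
    """Plain stand-in for the mapped class (no SQLAlchemy Base here)."""

    def __init__(self, hash_code, file_pointer):
        self.hash_code = hash_code
        self.file_pointer = file_pointer

    def __reduce__(self):
        # Rebuild the object from its two declared values only; anything
        # else stored on the instance is left out of the serialized form.
        return (self.__class__, (self.hash_code, self.file_pointer))


original = HashPointer(0x661623708235, "c:\\test\\001.txt")
original._sa_instance_state = object()  # hypothetical extra bookkeeping

# copy.copy calls __reduce_ex__, which defers to the __reduce__ above.
clone = copy.copy(original)
print(clone.hash_code == original.hash_code)
print(hasattr(clone, "_sa_instance_state"))
```

Because the object is rebuilt by calling the constructor, only values the constructor accepts survive the trip.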
It turns out that the default __reduce_ex__ method that comes down the line when you have SQLAlchemy active (I'm pretty sure this is the one in object(), but it doesn't have to be) adds a _sa_instance_state member to the 'state' returned in the __reduce_ex__ API, which PyYAML uses to perform serialization. When serializing an object coming from a SQLAlchemy query, this is essentially a hidden part of the object's metadata, which is accessible to further operations.
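To see why an extra attribute ends up in the serialized state, it can help to call the reduce API by hand. The sketch below uses a plain class and a hypothetical FakeInstanceState placeholder instead of a real SQLAlchemy query result; it only shows that everything stored on the instance is reported as state.

```python
class FakeInstanceState:
    """Hypothetical stand-in for SQLAlchemy's per-instance bookkeeping."""


class HashPointer:
    """Plain stand-in for the mapped class (no SQLAlchemy Base here)."""

    def __init__(self, hash_code, file_pointer):
        self.hash_code = hash_code
        self.file_pointer = file_pointer


loaded = HashPointer(0x661623708235, "c:\\test\\001.txt")
# Simulate what a query-loaded instance carries in addition to its columns.
loaded._sa_instance_state = FakeInstanceState()

# Protocol 2 is the one PyYAML's object representer asks for.
func, args, state = loaded.__reduce_ex__(2)[:3]
print(sorted(state))
```

The printed keys include _sa_instance_state alongside the two declared members, so a serializer walking that state has to represent the bookkeeping object as well.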
It is this object on which the PyYAML serializer fails. You can verify this by running your serialization in PDB and seeing two calls to represent_object in your call stack, even for relatively simple SQLAlchemy query results.
This query instance link is used, as I understand it, to power methods which let you poke back at the query that generated a given object within the same Python interpreter's lifetime.
If you care about that functionality (stuff like session.new and session.dirty), you will need to implement support for it in PyYAML's serializer.
If you don't care, and just want your declared members, you can use a base class which 'hides' that linkage from calls to __reduce*. Note that this will also break the SQLAlchemy serializer extension, so consider your plans carefully.
An example of a base class to implement that change is:
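A minimal sketch of such a base class, written here as a mixin and exercised with a hypothetical stand-in attribute rather than a live SQLAlchemy session. With SQLAlchemy, the same method would presumably live on an abstract declarative base; that placement is an assumption and is not shown.

```python
import copy


class HideInstanceState:
    """Mixin that drops _sa_instance_state from the reduced state."""

    def __reduce_ex__(self, protocol):
        reduced = super().__reduce_ex__(protocol)
        # Only touch the common (callable, args, state-dict, ...) shape.
        if len(reduced) > 2 and isinstance(reduced[2], dict):
            state = dict(reduced[2])  # shallow copy; leave the live __dict__ alone
            state.pop("_sa_instance_state", None)
            reduced = reduced[:2] + (state,) + reduced[3:]
        return reduced


class HashPointer(HideInstanceState):
    """Plain stand-in for the mapped class (no SQLAlchemy Base here)."""

    def __init__(self, hash_code, file_pointer):
        self.hash_code = hash_code
        self.file_pointer = file_pointer


loaded = HashPointer(0x661623708235, "c:\\test\\001.txt")
loaded._sa_instance_state = object()  # hypothetical bookkeeping object

state = loaded.__reduce_ex__(2)[2]  # what a serializer would see
clone = copy.copy(loaded)           # round trip through the same protocol
print(sorted(state))
print(hasattr(clone, "_sa_instance_state"))
```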
This will then allow you to round-trip your objects in and out of YAML, though a round trip will disassociate them from any pending transactions or queries. This could also have interactions if you are using lazy-loaded members, for example. Make sure you are serializing everything you expect.
NOTE/EDIT:
I chose to use __reduce_ex__ here, to be compatible with possible other base classes or mixins. According to https://docs.python.org/2/library/pickle.html#object.reduce_ex, this will produce the correct behavior for any base classes, also detecting if only __reduce__() was declared.
Redux: __reduce__ will return the actual dict of the instance object. We don't want to delete from it, so for __reduce*, we must shallow copy that dict.