Problems copying dictionaries and using deepcopy on SQLAlchemy ORM objects

Posted 2024-09-03 23:59:48

I'm doing a Simulated Annealing algorithm to optimise a given allocation of students and projects.

This is language-agnostic pseudocode from Wikipedia:

s ← s0; e ← E(s)                                // Initial state, energy.
sbest ← s; ebest ← e                            // Initial "best" solution
k ← 0                                           // Energy evaluation count.
while k < kmax and e > emax                     // While time left & not good enough:
  snew ← neighbour(s)                           // Pick some neighbour.
  enew ← E(snew)                                // Compute its energy.
  if enew < ebest then                          // Is this a new best?
    sbest ← snew; ebest ← enew                  // Save 'new neighbour' to 'best found'.
  if P(e, enew, temp(k/kmax)) > random() then   // Should we move to it?
    s ← snew; e ← enew                          // Yes, change state.
  k ← k + 1                                     // One more evaluation done
return sbest                                    // Return the best solution found.
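The pseudocode above could be sketched in Python roughly as follows. This is only an illustration: the linear cooling schedule, the exponential acceptance rule, and the names anneal, acceptance, t0 and seed are assumptions that the pseudocode does not fix. Note that neighbour() is expected to return a new state rather than modify s in place, which is exactly where the copying problem described below comes from.

```python
import math
import random


def acceptance(e, enew, temp):
    """Assumed acceptance probability P(e, enew, T): always take improvements."""
    if enew < e:
        return 1.0
    if temp <= 0:
        return 0.0
    return math.exp(-(enew - e) / temp)


def anneal(s0, energy, neighbour, kmax=1000, emax=0.0, t0=1.0, seed=0):
    """Return (sbest, ebest), following the pseudocode line by line."""
    rng = random.Random(seed)
    s, e = s0, energy(s0)              # initial state, energy
    sbest, ebest = s, e                # initial "best" solution
    k = 0
    while k < kmax and e > emax:
        snew = neighbour(s, rng)       # must be a NEW state, not s mutated
        enew = energy(snew)
        if enew < ebest:               # is this a new best?
            sbest, ebest = snew, enew
        temp = t0 * (1.0 - k / kmax)   # assumed linear cooling schedule
        if acceptance(e, enew, temp) > rng.random():
            s, e = snew, enew          # move to the neighbour
        k += 1
    return sbest, ebest


# Toy usage: minimise (x - 3)^2 over the integers, stepping by +/-1.
best, best_energy = anneal(
    10,
    lambda x: (x - 3) ** 2,
    lambda x, rng: x + rng.choice((-1, 1)),
    kmax=500,
)
```

Because sbest is only ever rebound to a better state, the returned energy can never exceed the starting energy, whatever the random choices are.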

The following is an adaptation of the technique. My supervisor said the idea is fine in theory.

First I pick an allocation (i.e. an entire dictionary of students and their allocated projects, including the ranks for the projects) from the entire set of randomised allocations, copy it and pass it to my function. Let's call this allocation aOld (it is a dictionary). aOld has a weight associated with it called wOld. The weighting is described below.

The function does the following:

  • Let this allocation, aOld, be the best_node
  • From all the students, pick a random number of students and put them in a list
  • Strip (DEALLOCATE) them of their projects, and reflect the changes in the projects (the allocated parameter is now False) and the lecturers (free up slots if one or more of their projects are no longer allocated)
  • Randomise that list
  • Try assigning (REALLOCATE) projects to everyone in that list again
  • Calculate the weight (add up the ranks: rank 1 = 1, rank 2 = 2... and no project = 101)
  • For this new allocation aNew, if the weight wNew is smaller than the weight wOld of the allocation I picked at the beginning, then this is the best_node (as defined by the Simulated Annealing algorithm above). Apply the algorithm to aNew and continue.
  • If wOld < wNew, then apply the algorithm to aOld again and continue.

The allocations/data-points are expressed as "nodes" such that a node = (weight, allocation_dict, projects_dict, lecturers_dict)
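The deallocate/reallocate step and the weight could be sketched on plain dictionaries as below. This is a simplified illustration under several assumptions: every project has a single slot, lecturers are left out, every allocated project appears in the student's preference list, and the names weight, neighbour, prefs and NO_PROJECT_RANK are made up for the example.

```python
import random

NO_PROJECT_RANK = 101  # penalty for a student left without a project


def weight(allocation, prefs):
    """Sum of ranks: rank 1 = 1, rank 2 = 2, ..., no project = 101."""
    total = 0
    for student, project in allocation.items():
        if project is None:
            total += NO_PROJECT_RANK
        else:
            total += prefs[student].index(project) + 1
    return total


def neighbour(allocation, prefs, rng):
    """Return a NEW allocation; the input dictionary is left untouched."""
    new = dict(allocation)  # values are strings/None, so this copy is independent
    students = list(new)
    stripped = rng.sample(students, rng.randint(1, len(students)))
    for student in stripped:            # DEALLOCATE
        new[student] = None
    rng.shuffle(stripped)               # randomise the list
    taken = {p for p in new.values() if p is not None}
    for student in stripped:            # REALLOCATE by preference order
        for project in prefs[student]:
            if project not in taken:
                new[student] = project
                taken.add(project)
                break
    return new


prefs = {"ann": ["p1", "p2"], "bob": ["p1", "p3"], "cat": ["p2", "p1"]}
a_old = {"ann": "p2", "bob": "p3", "cat": "p1"}
rng = random.Random(1)

a_new = neighbour(a_old, prefs, rng)
w_old, w_new = weight(a_old, prefs), weight(a_new, prefs)
best_node = (w_new, a_new) if w_new < w_old else (w_old, a_old)
```

Because the dictionary values here are plain strings rather than mapped objects, dict(allocation) is already a complete, independent copy, so neither copy.deepcopy() nor any ORM state is involved.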

Right now, I can only perform this algorithm once, but I'll need to run it N times (N is denoted by kmax in the Wikipedia snippet) and make sure I always keep both the previous node and the best_node.

So that I don't modify my original dictionaries (which I might want to reset to), I've made a shallow copy of the dictionaries. From what I've read in the docs, a shallow copy only copies the references, and since my dictionaries contain objects, changing an object through the copied dictionary changes the original object as well. So I tried to use copy.deepcopy(). These dictionaries refer to objects that have been mapped with SQLA.
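The difference between the two kinds of copy can be seen in a small sketch. Project here is an ordinary, unmapped class invented for the illustration; the behaviour of deepcopy on SQLA-mapped objects is the separate issue raised in the questions below.

```python
import copy


class Project:
    """Plain stand-in class (not mapped with SQLAlchemy)."""

    def __init__(self, name, allocated=False):
        self.name = name
        self.allocated = allocated


original = {"p1": Project("p1", allocated=True)}
shallow = copy.copy(original)      # new dict, same Project objects
deep = copy.deepcopy(original)     # new dict AND new Project objects

# Changing the object through the shallow copy changes the shared object.
shallow["p1"].allocated = False

shared_object = shallow["p1"] is original["p1"]      # True
original_changed = original["p1"].allocated is False  # True
deep_unaffected = deep["p1"].allocated is True        # True
```

Only the dictionary itself is duplicated by copy.copy(); adding or removing keys in the shallow copy would not affect the original, but mutating a value does.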


Questions:

I've been given some solutions to the problems faced but due to my über green-ness with using Python, they all sound rather cryptic to me.

  1. Deepcopy isn't playing nicely with SQLA. I've been told that deepcopy on ORM objects probably has issues that prevent it from working as you'd expect. Apparently I'd be better off "building copy constructors, i.e. def copy(self): return FooBar(....)." Can someone please explain what that means?

  2. I checked and found out that deepcopy has issues because SQLAlchemy places extra information on your objects, i.e. an _sa_instance_state attribute, that I wouldn't want in the copy but is necessary for the object to have. I've been told: "There are ways to manually blow away the old _sa_instance_state and put a new one on the object, but the most straightforward is to make a new object with __init__() and set up the attributes that are significant, instead of doing a full deep copy." What exactly does that mean? Do I create a new, unmapped class similar to the old, mapped one?

  3. An alternate solution is that I'd have to "implement __deepcopy__() on your objects and ensure that a new _sa_instance_state is set up, there are functions in sqlalchemy.orm.attributes which can help with that." Once again this is beyond me so could someone kindly explain what it means?

  4. A more general question: given the above information are there any suggestions on how I can maintain the information/state for the best_node (which must always persist through my while loop) and the previous_node, if my actual objects (referenced by the dictionaries, therefore the nodes) are changing due to the deallocation/reallocation taking place? That is, without using copy?
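As a rough illustration of what the quoted advice in questions 1 to 3 seems to describe: a "copy constructor" is just a method that builds a new instance of the same class through __init__ and hands over the attributes that matter, and __deepcopy__() is the hook that copy.deepcopy() calls if a class defines it. The sketch below uses a plain class with made-up attributes (name, supervisor, allocated) so that it runs without SQLAlchemy; on a mapped class, going through __init__ is what the quoted advice relies on to give the new object its own _sa_instance_state. Which attributes count as significant (for example, whether to carry over a primary key) is a decision the sketch does not make.

```python
import copy


class Project:
    """Stand-in for a mapped class; a real model would declare columns."""

    def __init__(self, name, supervisor, allocated=False):
        self.name = name
        self.supervisor = supervisor
        self.allocated = allocated

    def copy(self):
        # Copy constructor: a fresh instance built via __init__, carrying
        # over only the attributes that are significant for the algorithm.
        return Project(self.name, self.supervisor, self.allocated)

    def __deepcopy__(self, memo):
        # Called by copy.deepcopy(); delegate to the copy constructor and
        # record the result so shared references are copied only once.
        new = self.copy()
        memo[id(self)] = new
        return new


p = Project("p1", "Dr X", allocated=True)   # "Dr X" is a made-up value
q = p.copy()
q.allocated = False                          # does not affect p

projects_copy = copy.deepcopy({"p1": p})     # uses Project.__deepcopy__
```

So the answer to "do I create a new, unmapped class?" appears to be no: the copy is a new object of the same class, not a new class.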


Comments (2)

浮生未歇 2024-09-10 23:59:48

I have another possible solution: use transactions. This probably still isn't the best solution but implementing it should be faster.

Firstly create your session like this:

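from sqlalchemy.orm import sessionmaker

# Transactional session. sessionmaker(transactional=True) is the SQLAlchemy 0.4
# spelling; later releases removed the flag and sessions are transactional by default.
Session = sessionmaker()
sess = Session()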

That way it will be transactional. The way transactions work is that sess.commit() will make your changes permanent while sess.rollback() will revert them.

In the case of simulated annealing you want to commit when you find a new best solution. At any later point, you can invoke rollback() to restore the state as of that last commit.
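The commit-on-best, rollback-otherwise pattern can be tried out with a small sketch. To keep it runnable without any setup, it swaps in Python's built-in sqlite3 module for an SQLAlchemy session; the table and column names are invented, and the sketch only demonstrates the commit()/rollback() semantics described above, not the ORM itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE allocation (student TEXT PRIMARY KEY, project TEXT)")

# Initial allocation; committing makes it the restore point ("best so far").
conn.execute("INSERT INTO allocation VALUES ('alice', 'p1')")
conn.commit()

# Try a candidate change that turns out to be worse...
conn.execute("UPDATE allocation SET project = 'p9' WHERE student = 'alice'")
conn.rollback()   # ...so discard it and return to the last commit

after_rollback = conn.execute(
    "SELECT project FROM allocation WHERE student = 'alice'"
).fetchone()[0]

# Try a candidate that is an improvement and keep it.
conn.execute("UPDATE allocation SET project = 'p2' WHERE student = 'alice'")
conn.commit()     # new best: this is now the restore point

after_commit = conn.execute(
    "SELECT project FROM allocation WHERE student = 'alice'"
).fetchone()[0]
conn.close()
```

One limitation worth noting: rollback() only goes back to the most recent commit, so this keeps track of the best state but not of a separate "previous" state that differs from it.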

怀里藏娇 2024-09-10 23:59:48

You don't want to copy SQLAlchemy objects like that. You could implement your own methods which make the copies easily enough, but that is probably not what you want. You don't want copies of students and projects in your database, do you? So don't copy that data.

So you have a dictionary which holds your allocations. During the process you should never modify the SQLAlchemy objects. All information that can be modified should be stored in those dictionaries. If you need to modify the objects to take that into account, copy the data back at the end.
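A minimal sketch of that approach, assuming the working state can be reduced to identifiers: the dictionaries map student ids to project ids, the objects are read once at the start and written once at the end. Student is a plain stand-in class and all names and values are invented for the example.

```python
class Student:
    """Stand-in for a mapped object; never modified during the search."""

    def __init__(self, student_id, project_id=None):
        self.id = student_id
        self.project_id = project_id


students = {1: Student(1, "p1"), 2: Student(2, "p2")}

# Working state holds ids only, so dict(...) gives a fully independent copy.
current = {sid: s.project_id for sid, s in students.items()}
best = dict(current)

# One candidate move: swap the two students' projects.
candidate = dict(current)
candidate[1], candidate[2] = candidate[2], candidate[1]
previous, current = current, candidate   # keep the previous node around

# Suppose the candidate weighed less: it becomes the best node.
best = dict(current)

# The objects are still untouched at this point.
project_before_write_back = students[1].project_id

# At the very end, copy the chosen allocation back onto the objects.
for sid, project_id in best.items():
    students[sid].project_id = project_id
```

With this layout, best_node and previous_node are just small dictionaries, so keeping them through the while loop needs neither deepcopy nor any copying of mapped objects.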
