Python multithreading
I have this scenario:
A web page built with Zope/Plone and some of my own Python API. One page, call it "a", uses a Python method to query a database (Postgres) and return some information. On page "a" you can modify the database data "offline" (by which I mean the changes aren't written to the database immediately, but later, when you press "save" and a Python API method is called). So imagine this: a user called "Sam" loads the page and starts modifying data. Meanwhile, a user called "Sara" modifies the database from page "a" by clicking "save". Now Sam no longer has the current database data: he will press "save" and overwrite Sara's changes.
I would like to show an alert on my page in real time. I thought I could do something like this:
Make a non-blocking AJAX call and carry on rendering the page. The AJAX call invokes a Python method that creates a thread, which loops until some condition "X" changes. When I write data to the database, I call a function that changes condition "X", which stops the thread and returns to the AJAX call.
Moreover, I can't lock the database, because every user who wants to modify it must have free access.
My problem is: how can I identify a Python thread? I've just seen that every method on a class that inherits from Thread takes "self" as a parameter. Also, I have to start the thread when page "a" is accessed, and that will happen somewhere in the code (say, in the "threads module"), but the inserts are in another module. So how can I implement my idea?
And if anyone has an alternative idea, feel free to tell me :)
Comments (1)
The problem area you're discussing is generally called concurrency control. Your approach would warn or block the user from updating when any field in the target item changes. One way to do this is to keep track of what the item looked like when it was selected, and only update if the database version looks exactly like the version you selected, or has not been updated since a certain time (a timestamp field may be helpful). Checking at save time like this is usually called optimistic concurrency control; pessimistic concurrency control would instead lock the item while someone edits it, which you have ruled out. A finer-grained variant checks only that the fields one user has updated and is saving back to the datastore were not updated by another user. Both of these methods are easiest if you choose an ORM library that supports concurrency checks.
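As a rough illustration of the "only update if the row is unchanged" idea, here is a minimal sketch. It assumes a hypothetical `items` table with an integer `version` column (a timestamp column would serve the same purpose) and uses an in-memory SQLite database only to stay self-contained; with Postgres the same conditional `UPDATE` applies.

```python
import sqlite3


def save_if_unchanged(conn, item_id, new_value, loaded_version):
    """Write new_value only if the row still has the version we loaded."""
    cur = conn.execute(
        "UPDATE items SET value = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_value, item_id, loaded_version),
    )
    conn.commit()
    # No row matches if someone else saved (and bumped the version) first.
    return cur.rowcount == 1


# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, value TEXT, version INTEGER)"
)
conn.execute("INSERT INTO items VALUES (1, 'original', 1)")
conn.commit()

# Sam and Sara both loaded the row while it was at version 1.
sara_saved = save_if_unchanged(conn, 1, "Sara's edit", loaded_version=1)
sam_saved = save_if_unchanged(conn, 1, "Sam's edit", loaded_version=1)

print(sara_saved, sam_saved)
```

When the function returns False, the page can tell the user that the item changed and offer to reload it instead of silently overwriting.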
My favorite Python web library is Django, and here is a question on SO about the same situation you are trying to solve: "Django: How can I protect against concurrent modification of database entries". I hope it helps.
Handling concurrency in the manner you suggest is doable but should be avoided in most situations. I've done it before when adding concurrency to a large system with complex objects that had wide-ranging side effects and no unified data access (there were about 5 methods of data access over the lifetime of the system; it was a colorful system). It's a bug-prone and complex way to handle concurrency. As far as I recall, I had a client app that marked items as "checked out" in a data table and then kicked off a watcher thread. The table recorded the type and identifier of the object, the user who checked it out, when they checked it out, and how long the checkout was valid for, in case the client who checked the object out failed to check it back in when finished.
If you are set on not using an ORM and want to display a message to the user when the item has changed, base the check on a last-updated timestamp column and have your AJAX call check whether the last update time is later than it was when you first loaded the item. So, if you were coding a generic way to do this, you would only need the table name, the primary key, and the timestamp.
A web service method might look like:
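A minimal sketch of such a check, under stated assumptions: the `items` table, its `last_updated` column, the `ALLOWED_TABLES` whitelist and the `changed_since` function name are all hypothetical, timestamps are stored as ISO-8601 strings in one fixed format (so string comparison matches chronological order), and SQLite stands in for Postgres to keep the example self-contained.

```python
import sqlite3

# Table names cannot be passed as SQL parameters, so only known tables are
# accepted. Maps table name -> (primary key column, timestamp column).
ALLOWED_TABLES = {"items": ("id", "last_updated")}


def changed_since(conn, table, pk_value, loaded_at):
    """Return True if the row was updated after loaded_at (or has been deleted)."""
    if table not in ALLOWED_TABLES:
        raise ValueError("unknown table: %r" % (table,))
    pk_col, ts_col = ALLOWED_TABLES[table]
    row = conn.execute(
        "SELECT %s FROM %s WHERE %s = ?" % (ts_col, table, pk_col),
        (pk_value,),
    ).fetchone()
    if row is None:
        return True  # treat a missing row as a change
    return row[0] > loaded_at


# Hypothetical data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, last_updated TEXT)")
conn.execute("INSERT INTO items VALUES (1, '2024-01-01T10:00:00')")

loaded_at = "2024-01-01T10:00:00"  # what Sam's page saw when it loaded
before = changed_since(conn, "items", 1, loaded_at)

# Sara saves five minutes later.
conn.execute("UPDATE items SET last_updated = '2024-01-01T10:05:00' WHERE id = 1")
after = changed_since(conn, "items", 1, loaded_at)

print(before, after)
```

The page would call this periodically through AJAX with the timestamp it received on load, and show the alert as soon as the result is True.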
As for the Python multithreading libraries: Python threads can be confusing, and because of Python's global interpreter lock (GIL) they give poor performance for CPU-bound work, so in many cases you may actually want to spawn a new process (the multiprocessing library has a fairly equivalent API and performs better in parallel-processing scenarios). As for "self", that's the Python convention for the reference to the class instance you're dealing with, much like "this" in C-like languages. You can easily identify a thread by giving it a unique name when you construct it. See the multiprocessing or threading docs for more info. If you can avoid threading for this problem, I recommend that you do so.
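If you do go the threading route, the sketch below shows one way to name a thread, find it again by that name from other code, and stop its loop. The thread name `watcher-page-a` and the `watch` function are made up for illustration, and a `threading.Event` stands in for the "X" condition from the question.

```python
import threading


def watch(stop, finished):
    """Loop until the stop event is set, then record which thread ran."""
    # Event.wait returns True once the event is set, False on timeout.
    while not stop.wait(timeout=0.05):
        pass  # a real watcher would do its periodic work here
    finished.append(threading.current_thread().name)


stop_event = threading.Event()  # plays the role of the "X" condition
finished = []

watcher = threading.Thread(
    target=watch,
    args=(stop_event, finished),
    name="watcher-page-a",  # hypothetical unique name
)
watcher.start()

# Any module can look the thread up by name among the live threads.
found = [t for t in threading.enumerate() if t.name == "watcher-page-a"]

# The code that writes to the database would set the event to end the loop.
stop_event.set()
watcher.join(timeout=2)

print(found[0].name, finished)
```

Sharing the `Event` object (for example through a module-level registry keyed by thread name) is what lets the module doing the inserts stop a thread that was started elsewhere.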