Python shelve module question

Published 2024-07-13 06:18:48

Does the Python shelve module have any protection built in to make sure two processes aren't writing to a file at the same time?

Comments (4)

挽心 2024-07-20 06:18:48

The shelve module uses an underlying database package (such as dbm, gdbm or bsddb).

The restrictions paragraph says (my emphasis):

The shelve module does not support concurrent read/write access to shelved objects. (Multiple simultaneous read accesses are safe.) When a program has a shelf open for writing, no other program should have it open for reading or writing. Unix file locking can be used to solve this, but this differs across Unix versions and requires knowledge about the database implementation used.

Conclusion: it depends on the OS and the underlying DB. To keep things portable, do not rely on concurrent access.
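As a quick illustration of that dependence, you can ask Python which dbm backend a given shelf actually uses via `dbm.whichdb`; the result (and hence the locking story) varies by platform. The temporary path below is purely illustrative:

```python
import dbm
import os
import shelve
import tempfile

# Create a shelf with whatever backend this platform defaults to.
path = os.path.join(tempfile.mkdtemp(), "demo_shelf")
with shelve.open(path) as db:
    db["key"] = "value"

# Sniff which dbm implementation shelve picked; locking semantics
# differ between e.g. 'dbm.gnu', 'dbm.ndbm' and 'dbm.dumb'.
backend = dbm.whichdb(path)
print(backend)  # e.g. 'dbm.gnu' or 'dbm.dumb', depending on the platform
```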

剪不断理还乱 2024-07-20 06:18:48

I've implemented Ivo's approach as a context manager, for anyone interested:

from contextlib import contextmanager
from fcntl import flock, LOCK_SH, LOCK_EX, LOCK_UN
import shelve

@contextmanager
def locking(lock_path, lock_mode):
    with open(lock_path, 'w') as lock:
        flock(lock.fileno(), lock_mode) # block until lock is acquired
        try:
            yield
        finally:
            flock(lock.fileno(), LOCK_UN) # release

class DBManager(object):
    def __init__(self, db_path):
        self.db_path = db_path

    def read(self):
        with locking("%s.lock" % self.db_path, LOCK_SH):
            with shelve.open(self.db_path, "r", 2) as db:
                return dict(db)

    def cas(self, old_db, new_db):
        with locking("%s.lock" % self.db_path, LOCK_EX):
            with shelve.open(self.db_path, "c", 2) as db:
                if old_db != dict(db):
                    return False
                db.clear()
                db.update(new_db)
                return True
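A typical optimistic-update loop on top of a compare-and-swap primitive like `cas()` above might look like this. `FakeStore` is a hypothetical in-memory stand-in (not part of the answer) so the sketch is self-contained; in practice you would pass a `DBManager` instance:

```python
class FakeStore:
    """In-memory stand-in mimicking DBManager's read()/cas() interface."""
    def __init__(self):
        self._data = {}

    def read(self):
        return dict(self._data)

    def cas(self, old, new):
        # Commit only if the store still matches the caller's snapshot.
        if old != self._data:
            return False
        self._data = dict(new)
        return True

def increment(store, key):
    # Optimistic concurrency: retry until our snapshot is still
    # current at commit time.
    while True:
        snapshot = store.read()
        updated = dict(snapshot)
        updated[key] = updated.get(key, 0) + 1
        if store.cas(snapshot, updated):
            return updated[key]

store = FakeStore()
print(increment(store, "hits"))  # 1
print(increment(store, "hits"))  # 2
```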

甜点 2024-07-20 06:18:48

As per the top answer, it's not safe to have multiple writers to the shelve. My approach to making shelves safer is to write a wrapper that takes care of opening and accessing shelve elements. The wrapper code looks something like this:

# Assumptions: fcntl, pickle and shelve are imported, READONLY/READWRITE
# are module-level sentinels, and the shelf path is stored on the
# instance as self.shelvefile.
def open(self, mode=READONLY):
    if mode is READWRITE:
        lockfilemode = "a"
        lockmode = fcntl.LOCK_EX
        shelve_mode = 'c'
    else:
        lockfilemode = "r"
        lockmode = fcntl.LOCK_SH
        shelve_mode = 'r'
    self.lockfd = open(self.shelvefile + ".lck", lockfilemode)
    # LOCK_NB makes this raise BlockingIOError instead of blocking.
    fcntl.flock(self.lockfd.fileno(), lockmode | fcntl.LOCK_NB)
    self.shelve = shelve.open(self.shelvefile, flag=shelve_mode,
                              protocol=pickle.HIGHEST_PROTOCOL)

def close(self):
    self.shelve.close()
    fcntl.flock(self.lockfd.fileno(), fcntl.LOCK_UN)
    self.lockfd.close()

稀香 2024-07-20 06:18:48

Building on Ivo's and Samus_'s approaches, I've implemented an even simpler wrapper for shelve.open:

import fcntl
import shelve
import contextlib
import typing


@contextlib.contextmanager
def open_safe_shelve(db_path: str, flag: typing.Literal["r", "w", "c", "n"] = "c", protocol=None, writeback=False):
    if flag in ("w", "c", "n"):
        lockfile_lock_mode = fcntl.LOCK_EX
    elif flag == "r":
        lockfile_lock_mode = fcntl.LOCK_SH
    else:
        raise ValueError(f"Invalid mode: {flag}, only 'r', 'w', 'c', 'n' are allowed.")

    with open(f"{db_path}.lock", "w") as lock:  # According to https://docs.python.org/3/library/fcntl.html#fcntl.flock, the file must be opened in write mode on some systems.
        fcntl.flock(lock.fileno(), lockfile_lock_mode)  # Block until lock is acquired.
        db = shelve.open(db_path, flag=flag, protocol=protocol, writeback=writeback)
        try:
            yield db
        finally:
            db.close()  # Flush and close the shelf before releasing the lock.
            fcntl.flock(lock.fileno(), fcntl.LOCK_UN)  # Release lock.

This avoids having to check if the dict has changed since the last time, like in Samus_'s cas() method.

Note that this will block until the lock can be obtained. If you instead want to throw an exception if the lock is already taken, use lockfile_lock_mode | fcntl.LOCK_NB as the lock flag.
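A minimal sketch of that non-blocking variant. It is simulated here within a single process by holding the lock on one file descriptor and contending on another, which works because flock treats separately opened descriptors independently; the lock file path is illustrative:

```python
import fcntl
import os
import tempfile

# Illustrative lock path; in real use this would be f"{db_path}.lock".
lock_path = os.path.join(tempfile.mkdtemp(), "test_database.lock")

holder = open(lock_path, "w")
fcntl.flock(holder.fileno(), fcntl.LOCK_EX)  # simulate a writer holding the lock

contender = open(lock_path, "w")
try:
    # LOCK_NB: fail immediately with BlockingIOError instead of blocking.
    fcntl.flock(contender.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
    acquired = True
except BlockingIOError:
    acquired = False

print(acquired)  # False: the lock is already held
contender.close()
holder.close()
```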

It can be used in the same way shelve would normally be used. For example:

import time
import multiprocessing

def read(db_path: str):
    print("Reading wants lock")
    with open_safe_shelve(db_path, "r") as db:
        print("Reading has lock")
        print(f"foo: {db.get('foo', None)}")
        time.sleep(10)
        print(f"foo: {db.get('foo', None)}")
        print("Reading giving up lock")


def write(db_path: str):
    print("Writing wants lock")
    with open_safe_shelve(db_path) as db:
        print("Writing has lock")
        db["foo"] = "bar"
        print("Writing giving up lock")


if __name__ == "__main__":
    db_path = "test_database"
    read_process = multiprocessing.Process(target=read, args=(db_path,))
    write_process = multiprocessing.Process(target=write, args=(db_path,))
    read_process.start()
    time.sleep(1)
    write_process.start()
    read_process.join()
    write_process.join()

will output (assuming test_database.db already exists):

Reading wants lock
Reading has lock
foo: None
Writing wants lock
# (sleeps for around 9 seconds)
foo: None
Reading giving up lock
Writing has lock
Writing giving up lock