Using @functools.lru_cache with dict arguments

Published 2024-11-15 16:38:14


I have a method that takes (among others) a dictionary as an argument. The method parses strings, and the dictionary provides replacements for some substrings, so it doesn't have to be mutable.

This function is called quite often, and on redundant elements, so I figured that caching it would improve its efficiency.

But, as you may have guessed, since dict is mutable and thus not hashable, @functools.lru_cache can't decorate my function. So how can I overcome this?

Bonus points if it needs only standard-library classes and methods. Ideally, if some kind of frozendict that I haven't seen exists in the standard library, it would make my day.

PS: namedtuple only as a last resort, since it would require a big syntax shift.
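
For illustration, a minimal sketch of what fails (the function and the substitution dict below are made up):

import functools

@functools.lru_cache(maxsize=None)
def replace_substrings(text, substitutions):
    for old, new in substitutions.items():
        text = text.replace(old, new)
    return text

# Raises TypeError: unhashable type: 'dict', because lru_cache hashes every argument
replace_substrings("foo bar", {"foo": "baz"})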

Answers (9)

梦在深巷 2024-11-22 16:38:14


Instead of using a custom hashable dictionary, use this and avoid reinventing the wheel! It's a frozen dictionary, and it's fully hashable.

https://pypi.org/project/frozendict/

Code:

import functools

from frozendict import frozendict


def freezeargs(func):
    """Convert mutable dict arguments into immutable ones.
    Useful to be compatible with cache.
    """
    @functools.wraps(func)
    def wrapped(*args, **kwargs):
        args = tuple(frozendict(arg) if isinstance(arg, dict) else arg for arg in args)
        kwargs = {k: frozendict(v) if isinstance(v, dict) else v for k, v in kwargs.items()}
        return func(*args, **kwargs)
    return wrapped

and then

@freezeargs
@functools.lru_cache
def func(arg_dict):  # placeholder signature
    ...

Code taken from @fast_cen's answer.

Note: this does not work on nested data structures; for example, an argument might contain a list, which is unhashable. You are invited to make the wrapping recursive, so that it goes deep into the data structure and turns every dict into a frozendict and every list into a tuple.

(I know that OP no longer wants a solution, but I came here looking for the same solution, so I'm leaving this for future generations.)

淡水深流 2024-11-22 16:38:14


Here is a decorator that uses @mhyfritz's trick.

import functools


def hash_dict(func):
    """Transform mutable dictionary arguments
    into immutable ones.
    Useful to be compatible with cache.
    """
    class HDict(dict):
        def __hash__(self):
            return hash(frozenset(self.items()))

    @functools.wraps(func)
    def wrapped(*args, **kwargs):
        args = tuple(HDict(arg) if isinstance(arg, dict) else arg for arg in args)
        kwargs = {k: HDict(v) if isinstance(v, dict) else v for k, v in kwargs.items()}
        return func(*args, **kwargs)
    return wrapped

Simply add it before your lru_cache.

@hash_dict
@functools.lru_cache()
def your_function():
    ...
北陌 2024-11-22 16:38:14


What about creating a hashable dict class like so:

class HDict(dict):
    def __hash__(self):
        return hash(frozenset(self.items()))

substs = HDict({'foo': 'bar', 'baz': 'quz'})
cache = {substs: True}
﹎☆浅夏丿初晴 2024-11-22 16:38:14


Based on @Cedar's answer, adding recursion for a deep freeze as suggested:

def deep_freeze(thing):
    from collections.abc import Collection, Mapping, Hashable
    from frozendict import frozendict
    if thing is None or isinstance(thing, str):
        return thing
    elif isinstance(thing, Mapping):
        return frozendict({k: deep_freeze(v) for k, v in thing.items()})
    elif isinstance(thing, Collection):
        return tuple(deep_freeze(i) for i in thing)
    elif not isinstance(thing, Hashable):
        raise TypeError(f"unfreezable type: '{type(thing)}'")
    else:
        return thing


def deep_freeze_args(func):
    import functools

    @functools.wraps(func)
    def wrapped(*args, **kwargs):
        return func(*deep_freeze(args), **deep_freeze(kwargs))
    return wrapped
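
A hedged usage sketch (the function name and the example argument are made up): apply deep_freeze_args on top of lru_cache so that the arguments are already frozen, and therefore hashable, by the time the cache sees them.

import functools

@deep_freeze_args
@functools.lru_cache(maxsize=None)
def lookup(config):
    # config arrives as a frozendict, which lru_cache can hash
    return config["key"]

lookup({"key": "value", "nested": {"a": [1, 2]}})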
此岸叶落 2024-11-22 16:38:14


How about subclassing namedtuple and adding access by x["key"]?

from collections import namedtuple


class X(namedtuple("Y", "a b c")):
    def __getitem__(self, item):
        if isinstance(item, int):
            return super(X, self).__getitem__(item)
        return getattr(self, item)
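
A hedged usage sketch (the field values are made up): instances hash like plain tuples, so they can be passed to an lru_cache-decorated function, while still supporting dict-style lookup.

x = X(a="foo", b="bar", c="baz")
assert x["a"] == x.a == "foo"  # dict-style and attribute access
assert x[0] == "foo"           # positional access still works
hash(x)                        # hashable, so usable as an lru_cache argument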
北方。的韩爷 2024-11-22 16:38:14


Here's a decorator that can be used like functools.lru_cache. But it is targeted at functions that take only one argument, a flat mapping with hashable values, and it has a fixed maxsize of 64. For your use case, you would have to adapt either this example or your client code. Also, to set the maxsize individually, one would have to implement another decorator, but I haven't worked that out since I didn't need it.

from functools import (_CacheInfo, _lru_cache_wrapper, lru_cache,
                       partial, update_wrapper)
from typing import Any, Callable, Dict, Hashable

def lru_dict_arg_cache(func: Callable) -> Callable:
    def unpacking_func(func: Callable, arg: frozenset) -> Any:
        return func(dict(arg))

    _unpacking_func = partial(unpacking_func, func)
    _cached_unpacking_func = \
        _lru_cache_wrapper(_unpacking_func, 64, False, _CacheInfo)

    def packing_func(arg: Dict[Hashable, Hashable]) -> Any:
        return _cached_unpacking_func(frozenset(arg.items()))

    update_wrapper(packing_func, func)
    packing_func.cache_info = _cached_unpacking_func.cache_info
    return packing_func


@lru_dict_arg_cache
def uppercase_keys(arg: dict) -> dict:
    """ Yelling keys. """
    return {k.upper(): v for k, v in arg.items()}


assert uppercase_keys.__name__ == 'uppercase_keys'
assert uppercase_keys.__doc__ == ' Yelling keys. '
assert uppercase_keys({'ham': 'spam'}) == {'HAM': 'spam'}
assert uppercase_keys({'ham': 'spam'}) == {'HAM': 'spam'}
cache_info = uppercase_keys.cache_info()
assert cache_info.hits == 1
assert cache_info.misses == 1
assert cache_info.maxsize == 64
assert cache_info.currsize == 1
assert uppercase_keys({'foo': 'bar'}) == {'FOO': 'bar'}
assert uppercase_keys({'foo': 'baz'}) == {'FOO': 'baz'}
cache_info = uppercase_keys.cache_info()
assert cache_info.hits == 1
assert cache_info.misses == 3
assert cache_info.currsize == 3

For a more generic approach, one could use the @cachetools.cached decorator from a third-party library with an appropriate function set as the key.
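
A hedged sketch of that approach (it assumes the third-party cachetools package is installed; the JSON-based key function is just one possible choice and requires JSON-serialisable arguments):

import json

from cachetools import LRUCache, cached

@cached(cache=LRUCache(maxsize=64),
        key=lambda arg: json.dumps(arg, sort_keys=True))
def uppercase_keys(arg: dict) -> dict:
    return {k.upper(): v for k, v in arg.items()}

assert uppercase_keys({'ham': 'spam'}) == {'HAM': 'spam'}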

念三年u 2024-11-22 16:38:14


After deciding to drop the lru cache for our use case for now, we still came up with a solution. This decorator uses json to serialise and deserialise the args/kwargs sent to the cache. It works with any number of args. Use it as a decorator on a function instead of @lru_cache. The max size is set to 1024.

import json
from functools import lru_cache, wraps


def hashable_lru(func):
    cache = lru_cache(maxsize=1024)

    def deserialise(value):
        try:
            return json.loads(value)
        except Exception:
            return value

    def func_with_serialized_params(*args, **kwargs):
        _args = tuple(deserialise(arg) for arg in args)
        _kwargs = {k: deserialise(v) for k, v in kwargs.items()}
        return func(*_args, **_kwargs)

    cached_function = cache(func_with_serialized_params)

    @wraps(func)
    def lru_decorator(*args, **kwargs):
        _args = tuple(json.dumps(arg, sort_keys=True) if type(arg) in (list, dict) else arg for arg in args)
        _kwargs = {k: json.dumps(v, sort_keys=True) if type(v) in (list, dict) else v for k, v in kwargs.items()}
        return cached_function(*_args, **_kwargs)
    lru_decorator.cache_info = cached_function.cache_info
    lru_decorator.cache_clear = cached_function.cache_clear
    return lru_decorator
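
A hedged usage sketch (the function and the example dictionary are made up):

@hashable_lru
def count_keys(mapping):
    return len(mapping)

count_keys({"a": 1, "b": [2, 3]})  # first call: computed and cached
count_keys({"a": 1, "b": [2, 3]})  # second call: served from the cache
print(count_keys.cache_info())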
守护在此方 2024-11-22 16:38:14


The solution might be much simpler. lru_cache uses the parameters as the cache key, so in the case of a dictionary, lru_cache doesn't know how to hash it. You can serialize the dictionary parameter to a string and deserialize it back into a dictionary inside the function. Works like a charm.

The function:

import json
from functools import lru_cache

@lru_cache(1024)
def data_check(serialized_dictionary):
    my_dictionary = json.loads(serialized_dictionary)
    print(my_dictionary)

The call:

data_check(json.dumps(initial_dictionary))
财迷小姐 2024-11-22 16:38:14


An extension of @Cedar's answer, adding a recursive freeze:

The recursive freeze:

from frozendict import frozendict


def recursive_freeze(value):
    if isinstance(value, dict):
        for k, v in value.items():
            value[k] = recursive_freeze(v)
        return frozendict(value)
    else:
        return value


# To unfreeze
def recursive_unfreeze(value):
    if isinstance(value, frozendict):
        value = dict(value)
        for k, v in value.items():
            value[k] = recursive_unfreeze(v)

    return value

The decorator:

import functools


def freezeargs(func):
    """
    Transform mutable dictionary arguments into immutable ones.
    Useful to be compatible with cache.
    """
    @functools.wraps(func)
    def wrapped(*args, **kwargs):
        args = tuple(recursive_freeze(arg) if isinstance(arg, dict) else arg for arg in args)
        kwargs = {k: recursive_freeze(v) if isinstance(v, dict) else v for k, v in kwargs.items()}
        return func(*args, **kwargs)
    return wrapped