Parse YAML with dot delimiters in keys

Posted on 2025-01-22 09:26:59

We use YAML configuration for service scaling. It usually looks like this:

service:
  scalingPolicy:
    capacity:
      min: 1
      max: 1 

So it is easy to load it with plain PyYAML and parse it as a dict, so that config['service']['scalingPolicy']['capacity']['min'] gives 1. The problem is that some configs are written with dot-delimited keys, e.g.:

service.scalingPolicy.capacity:
  min: 1
  max: 1

The main consumer of these configs is Java's Spring, which somehow treats this form the same as the example above. But because we also need to parse these configs with Python, I get the whole dot-separated line as a single key, config['service.scalingPolicy.capacity'].

The question is: how can I make Python parse any combination of keys (both dot-separated and separated by indentation and :)? I didn't find relevant parameters in the Python YAML libraries (I've checked standard PyYAML and ruamel.yaml), and handling every possible combination manually seems like a crazy idea. The only idea I have is to write my own parser, but maybe I'm missing something and don't have to reinvent the bicycle.

Comments (2)

眸中客 2025-01-29 09:26:59

This is not trivial. It is much easier to go the other way and split a
dotted lookup key so that it recurses into a nested data structure. Here you
have a nested data structure, and different [key] lookups mean different things
at different levels.
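
For comparison, the easier direction mentioned above can be sketched in a few lines. This is only an illustration: the helper name get_by_path and its sep parameter are made up here and are not part of PyYAML or ruamel.yaml.

```python
from functools import reduce


def get_by_path(data, dotted_key, sep="."):
    """Follow each component of dotted_key one level deeper into data."""
    # hypothetical helper: every part of the dotted key is one plain lookup
    return reduce(lambda node, part: node[part], dotted_key.split(sep), data)


# the nested form of the config from the question, as an already loaded dict
nested = {"service": {"scalingPolicy": {"capacity": {"min": 1, "max": 1}}}}

print(get_by_path(nested, "service.scalingPolicy.capacity.min"))  # prints 1
```

A missing component raises the usual KeyError, exactly as a chain of [key] lookups would.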

If you use ruamel.yaml in the default round-trip mode, you can add a class variable
to the type that represents a mapping, defining the separator on which keys are split,
and an instance variable that keeps track of the prefix already matched:

import sys
import ruamel.yaml
from ruamel.yaml.compat import ordereddict
from ruamel.yaml.comments import merge_attrib

yaml_str = """\
service.scalingPolicy.capacity:
  min: 1
  max: 1
"""


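def mapgetitem(self, key):
    # class-level separator; None disables the prefix matching
    sep = getattr(ruamel.yaml.comments.CommentedMap, 'sep', None)
    if sep is not None:
        if not hasattr(self, 'splitprefix'):
            self.splitprefix = ''
        # extend the prefix collected by the previous [key] lookups
        if self.splitprefix:
            self.splitprefix += sep + key
        else:
            self.splitprefix = key
        if self.splitprefix not in self:
            for k in self.keys():
                if isinstance(k, str) and k.startswith(self.splitprefix):
                    break
            else:
                # no key starts with the prefix: reset it and fail
                prefix = self.splitprefix
                delattr(self, 'splitprefix')
                raise KeyError(prefix)
            return self  # partial match, wait for the next [key]
        key = self.splitprefix
        delattr(self, 'splitprefix') # to make the next lookup work from start
    try:
        return ordereddict.__getitem__(self, key)
    except KeyError:
        # fall back to mappings merged in via YAML merge keys (<<)
        for merged in getattr(self, merge_attrib, []):
            if key in merged[1]:
                return merged[1][key]
        raise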

old_mapgetitem = ruamel.yaml.comments.CommentedMap.__getitem__ # save the original __getitem__
ruamel.yaml.comments.CommentedMap.__getitem__ = mapgetitem
ruamel.yaml.comments.CommentedMap.sep = '.'

yaml = ruamel.yaml.YAML()
# yaml.indent(mapping=4, sequence=4, offset=2)
# yaml.preserve_quotes = True
config = yaml.load(yaml_str)
print('min:', config['service']['scalingPolicy']['capacity']['min'])
print('max:', config['service']['scalingPolicy']['capacity']['max'])
print('---------')
config['service']['scalingPolicy']['capacity']['max'] = 42
# and dump with the original routine, as it uses __getitem__
ruamel.yaml.comments.CommentedMap.__getitem__ = old_mapgetitem
yaml.dump(config, sys.stdout)

which gives:

min: 1
max: 1
---------
service.scalingPolicy.capacity:
  min: 1
  max: 42
梦年海沫深 2025-01-29 09:26:59

I have found an alternative solution with PyYAML. The strategy is to convert the dict into text lines of the form "k1.k2.k3: value" and then convert that text back into a nested dict. Probably not the most efficient, but it works.

import yaml
from copy import deepcopy

my_yaml = """
service.scalingPolicy.capacity:
    min: 1
    max: 50
"""


class YamlUtils:

    def yaml_dict_to_text(self, yaml_dict, parent_key, yaml_text):
        return_value = yaml_text.split('\n')
        for key, value in yaml_dict.items():
            key_string = key
            if parent_key != "":
                key_string = parent_key + "." + key_string
            if isinstance(value, dict):
                return_value.append(self.yaml_dict_to_text(value, key_string, yaml_text))
            else:
                return_value.append("{0}: {1}".format(
                    key_string,
                    value
                )
                )
        return '\n'.join(return_value)

    @staticmethod
    def convert_to_dict(source_string, split_symbol='.', value=None):
        return_value = value
        elements = source_string.split(split_symbol)
        for element in reversed(elements):
            if element:
                return_value = {element: return_value}
        return return_value

    def dict_of_dicts_merge(self, x, y):
        z = {}
        try:
            overlapping_keys = x.keys() & y.keys()
            for key in overlapping_keys:
                z[key] = self.dict_of_dicts_merge(x[key], y[key])
            for key in x.keys() - overlapping_keys:
                z[key] = deepcopy(x[key])
            for key in y.keys() - overlapping_keys:
                z[key] = deepcopy(y[key])
        except Exception as e:
            print("Error merging dicts:", x, y, str(e))
        return z

    def text_to_yaml_dict(self, yaml_text):
        return_value = {}
        yaml_list = yaml_text.split('\n')
        for line in yaml_list:
            # split on the first ':' only, so values containing ':' stay intact
            line_items = line.split(':', 1)
            if len(line_items) == 2:
                line_key = line_items[0]
                # note: the value is kept as text, so 1 comes back as '1'
                line_value = line_items[1].lstrip()
                line_dict = self.convert_to_dict(line_key, '.', line_value)
                return_value = self.dict_of_dicts_merge(return_value, line_dict)
        return return_value


def main():
    try:
        yaml_dict = yaml.safe_load(my_yaml)
        yaml_utils = YamlUtils()
        processed_yaml_text = yaml_utils.yaml_dict_to_text(
            yaml_dict,
            "",
            "",
        )
        processed_yaml_dict = yaml_utils.text_to_yaml_dict(processed_yaml_text)
        print("Min:", processed_yaml_dict['service']['scalingPolicy']['capacity']['min'])
        print("Max:", processed_yaml_dict['service']['scalingPolicy']['capacity']['max'])
        data = yaml.dump(processed_yaml_dict, indent=2)
        print("New yaml:", data)
    except yaml.YAMLError as exc:
        print(exc)


if __name__ == "__main__":
    main()
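
The same idea can probably be done without the intermediate text step, by expanding the dotted keys directly on the loaded dict so that values keep their types. The following is a minimal sketch under a few assumptions: expand_dotted_keys and _merge are hypothetical helper names, the input is a plain dict of the shape that yaml.safe_load returns for the config above, and when a scalar and a mapping collide on the same path, the later entry simply wins.

```python
def _merge(target, source):
    """Recursively merge source into target (hypothetical helper)."""
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            _merge(target[key], value)
        else:
            # assumption: on a conflict the later entry overwrites the earlier
            target[key] = value


def expand_dotted_keys(node, sep="."):
    """Return a copy of node in which every dotted mapping key is nested."""
    if isinstance(node, list):
        return [expand_dotted_keys(item, sep) for item in node]
    if not isinstance(node, dict):
        return node
    result = {}
    for key, value in node.items():
        value = expand_dotted_keys(value, sep)
        parts = key.split(sep) if isinstance(key, str) else [key]
        # wrap the value in one single-key dict per component, innermost first
        for part in reversed(parts):
            value = {part: value}
        _merge(result, value)
    return result


# shape of what yaml.safe_load returns for the dotted config above
loaded = {"service.scalingPolicy.capacity": {"min": 1, "max": 50}}
config = expand_dotted_keys(loaded)
print(config["service"]["scalingPolicy"]["capacity"]["min"])  # prints 1
```

Because dotted and nested keys are merged into the same tree, a file that mixes both styles (for example service.scalingPolicy: next to service:) ends up as one nested dict.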