How to model a simple, static nested data structure without nested dicts (namedtuple, dataclass, ...)

Posted on 2025-01-11 05:41:27

I hope there is a best practice or design pattern for the use case of simple nested data/mappings where you need to pre-store data, select the proper data group, and conveniently access specific values. Example: install script parameters, where the install script is given the type of install (a.k.a. label) and, based on that, the proper install folder and the other parameter values should be evaluated.

Sample as a nested dict:

install_params = {
    'prod': {'folder_name': 'production', 'param_1': value_a},   # ...
    'dev':  {'folder_name': 'development', 'param_1': value_b},  # ...
    'test': {'folder_name': 'test', 'param_1': value_c},         # ...
}

Nested dicts group all data under the parent dict install_params, so all labels can be presented to the user with install_params.keys(), and (in this case) the sole required functionality, retrieving the proper values for a given installation type/label, is simply install_params[user_selected_label].
However, with numerous items in the parent dict, more nesting levels, or more complex structures as parameter values, this becomes cumbersome to use and error-prone.
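To make the trade-off concrete, here is a minimal sketch of the nested-dict usage described above. The string values and the user_selected_label input are placeholders assumed for illustration, since the question leaves value_a, value_b and value_c unspecified.

```python
# Placeholder values: the question does not say what value_a etc. really are.
install_params = {
    "prod": {"folder_name": "production", "param_1": "value_a"},
    "dev": {"folder_name": "development", "param_1": "value_b"},
    "test": {"folder_name": "test", "param_1": "value_c"},
}

# Labels that can be shown to the user.
available_labels = list(install_params)

# Hypothetical user input.
user_selected_label = "dev"
selected = install_params[user_selected_label]
folder = selected["folder_name"]

# The error-prone part: a mistyped inner key is only detected when the
# lookup actually runs, not when the code is written or type-checked.
try:
    selected["folder_nmae"]
except KeyError as exc:
    typo_error = repr(exc)
```

Nothing here documents which inner keys exist, which is the part that gets worse as the mapping grows.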

I thought of rewriting it as a collections.namedtuple, a typing.NamedTuple, a @dataclass, or just a generic class, but none of them seems to offer a convenient solution to the generic use case above.

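E.g. a named tuple offers dot notation to access fields, but I still have to group all the instances somehow and implement the functionality of selecting the proper instance based on the label:

from collections import namedtuple

# field names assumed here; the definition of InstallType was not shown
InstallType = namedtuple('InstallType', ['label', 'folder_name', 'param_1'])

prod = InstallType('prod', 'prod', 'value_a')
dev = InstallType('dev', 'development', 'value_b')
test = InstallType('test', 'test', 'value_c')

all_types = [prod, dev, test]
if user_selected_label == 'prod':  # select the prod named tuple by hand
    selected = prod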

In the end I started to think about creating either a dictionary of named tuples or a single data class that keeps the data mapping private and initializes itself based on the label input parameter. The latter would nicely encapsulate all functionality and allow dot notation, but it still uses ugly nested dicts inside.
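For reference, the first option (a dictionary of named tuples) might look roughly like the sketch below. The InstallType field names, the INSTALL_TYPES name and the string values are assumptions made for illustration, not part of the original code.

```python
from typing import NamedTuple


class InstallType(NamedTuple):
    # Field names are assumed; every field is required because none has a default.
    label: str
    folder_name: str
    param_1: str  # placeholder type, the real parameter type is not given


_ALL_TYPES = (
    InstallType("prod", "production", "value_a"),
    InstallType("dev", "development", "value_b"),
    InstallType("test", "test", "value_c"),
)

# Build the label -> instance mapping once, so each label is written only once.
INSTALL_TYPES = {t.label: t for t in _ALL_TYPES}

available_labels = list(INSTALL_TYPES)  # can be shown to the user up front
selected = INSTALL_TYPES["test"]        # selection by label, no if chain
folder = selected.folder_name           # dot notation on the selected record
```

This keeps one flat dict plus typed records instead of dicts inside dicts, at the cost of a module-level mapping rather than a self-initializing class.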

from dataclasses import dataclass

@dataclass(frozen=True)
class SomeDataClass:
    label: str
    folder_name: str = None
    param_1: float = None

    def __post_init__(self):
        # dreaded nested dict to populate self.folder_name and others based on the submitted label
        # (with frozen=True, assigning here requires object.__setattr__)
        (...)

install_params = SomeDataClass(label=user_selection)
print(install_params.folder_name)  # would work as expected since coded in SomeDataClass

This seemed promising, but SomeDataClass raised further problems: how to expose all available install labels (prod, dev, test...) to the user before instantiating the class, and how to present fields as mandatory without requiring the user to submit them, since they should be 'computed' from the submitted label (plain folder_name: str still requires the user to submit it, while folder_name: str = None is confusing and suggests the field is optional and can be set to None), etc. It also feels like a rather over-engineered solution for simple nested dict mappings.
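One possible shape that addresses the two problems just listed is sketched below, purely as an illustration. The names InstallParams, _PRESETS, labels and from_label are hypothetical, and the string values are placeholders. It assumes the presets can live in a flat class-level mapping and that instances are built through a classmethod instead of __post_init__.

```python
from dataclasses import dataclass
from typing import ClassVar, Dict, Tuple


@dataclass(frozen=True)
class InstallParams:
    # No defaults: every field is mandatory on the instance.
    label: str
    folder_name: str
    param_1: str  # placeholder type

    # ClassVar keeps the mapping out of the generated __init__ and fields.
    _PRESETS: ClassVar[Dict[str, Tuple[str, str]]] = {
        "prod": ("production", "value_a"),
        "dev": ("development", "value_b"),
        "test": ("test", "value_c"),
    }

    @classmethod
    def labels(cls):
        """Labels available to the user, without creating an instance."""
        return tuple(cls._PRESETS)

    @classmethod
    def from_label(cls, label):
        """Build the instance for a label; the user supplies only the label."""
        try:
            folder_name, param_1 = cls._PRESETS[label]
        except KeyError:
            raise ValueError(
                f"unknown label {label!r}; expected one of {cls.labels()}"
            ) from None
        return cls(label, folder_name, param_1)


install_params = InstallParams.from_label("dev")
folder = install_params.folder_name
```

Because the fields have no defaults, None never appears as a value, and InstallParams.labels() can be called before any instance exists. Whether this is less over-engineered than the plain nested dict is a judgment call.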
