Need advice on a parser and data structure for hierarchical text data

Posted on 2024-11-04 03:40:09

Given the following hierarchical text data input (JunOS-like, in fact), I need to parse it into some suitable data structure that I can query to obtain a user-specified branch of the tree. I then want to linearize that branch (?) into some sort of mapping that lets the user change/insert/delete entries, and finally write it back to an output file as a tree again (storing the original data in a "version" file to allow later "history" or "rollback" operations on everything just described).

version 1.0;
description "Example data";

weights {
    weight low {
        value 1;
        description Forgetable;
    }
    weight medium {
        value 2;
        description Important;
    }
    weight high {
        value 3;
        description Critical;
    }
}

tags {
    tag foo {
        description "Some foo";
    }
    tag bar {
        description "Some bar";
    }
    tag baz {
        description "Some baz";
    }
}

tag-sets {
    tag-set foo\ bar {
        tag [ foo bar ];
        description Foo\ and\ bar;
    }
    tag-set "foo bar baz" {
        tag-set "foo bar";
        tag baz;
        description "Foo, bar and baz";
    }
}

Questions:

1) What data structure suits the input best? What C structure do you suggest using?

2) I do not want to use yacc/lex to parse it (unnecessary extra steps and complicated collaborative work, since not everybody - not even me - likes or knows how to use these tools). Which parsing method is the easiest to implement for this sort of parsing problem?

3) What method do you suggest for maintaining the "types" of nodes in source code? It seems quite tricky to me at the moment (in fact I have no idea how to do it yet). For instance, there is a node of type "version" that takes a "word" as its argument. It is also known that the "version" node exists only as part of the root branch of the hierarchy. Another example is that there are several "description" nodes taking a "word" or a "string" as their argument. A "description" node can belong to every node of the hierarchy. And so on. How do I cope with this sort of problem?

A note to explain the purpose: the resulting utility will "version" data stored in text data files quite similar to the example above, and the user will query/change/insert/delete the data to maintain some specific kind of information (say, a todo list). Consider it a sort of simple database rather than a configuration file or the like (sorry for my English). The idea is to provide a) a CLI, b) a command-line tool, and c) a way for users to edit the data in their own editor if they do not want to use a) or b).

Even some "general" suggestions would be highly appreciated.

Comments (2)

许一世地老天荒 2024-11-11 03:40:09

I would use a recursive descent parser combined with some sort of hashtable or map for data storage. From the looks of it, the format closely resembles JSON, but not exactly. Strings, numbers, lists, and dictionaries all seem to be supported, though. A simple "Object"-type class would do the trick for storing that (similar to JavaScript).
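To make the recursive descent idea concrete, here is a minimal sketch in C, assuming the grammar is roughly "block := statement*" and "statement := word+ ( ';' | '{' block '}' )". The names Node, next_token, parse_block, parse_config and find_node are made up for illustration, and a sibling-linked tree stands in for the hashtable or map to keep the example short. Brackets such as [ and ] are simply stored as ordinary words, and freeing the tree is left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* One statement: its words, an optional nested block, and the next sibling. */
typedef struct Node {
    char **words;        /* e.g. {"weight", "low"} */
    size_t nwords;
    struct Node *child;  /* first statement inside { ... }, or NULL */
    struct Node *next;   /* next statement in the same block */
} Node;

static char *dup_str(const char *s)
{
    char *d = malloc(strlen(s) + 1);
    assert(d);
    return strcpy(d, s);
}

/* Returns '{', '}', ';', 'w' (word copied to buf, escapes resolved) or 0 at
 * end of input. Words longer than cap - 1 bytes are silently truncated. */
static int next_token(const char **p, char *buf, size_t cap)
{
    const char *s = *p;
    size_t n = 0;
    while (isspace((unsigned char)*s)) s++;
    if (*s == '\0') { *p = s; return 0; }
    if (*s == '{' || *s == '}' || *s == ';') { *p = s + 1; return *s; }
    if (*s == '"') {                       /* quoted string */
        for (s++; *s && *s != '"'; s++) {
            if (*s == '\\' && s[1]) s++;   /* keep the escaped character */
            if (n + 1 < cap) buf[n++] = *s;
        }
        if (*s == '"') s++;
    } else {                               /* bare word, e.g. foo\ bar */
        for (; *s && !isspace((unsigned char)*s) && !strchr("{};", *s); s++) {
            if (*s == '\\' && s[1]) s++;
            if (n + 1 < cap) buf[n++] = *s;
        }
    }
    buf[n] = '\0';
    *p = s;
    return 'w';
}

/* block := statement* ; ends at '}' when nested, at end of input otherwise. */
static Node *parse_block(const char **p, int nested, int *ok)
{
    Node *head = NULL, **tail = &head;
    char buf[256];
    for (;;) {
        int t = next_token(p, buf, sizeof buf);
        if (t == '}' || t == 0) {
            if ((t == '}') != nested) *ok = 0;   /* unbalanced braces */
            return head;
        }
        Node *n = calloc(1, sizeof *n);
        assert(n);
        *tail = n;
        tail = &n->next;
        while (t == 'w') {                       /* statement := word+ ... */
            n->words = realloc(n->words, (n->nwords + 1) * sizeof *n->words);
            assert(n->words);
            n->words[n->nwords++] = dup_str(buf);
            t = next_token(p, buf, sizeof buf);
        }
        if (n->nwords == 0) *ok = 0;             /* stray ';' or '{' */
        if (t == '{') n->child = parse_block(p, 1, ok);
        else if (t != ';') { *ok = 0; return head; }
    }
}

/* Parse a whole document; *ok is set to 0 on a syntax error. */
Node *parse_config(const char *text, int *ok)
{
    *ok = 1;
    return parse_block(&text, 0, ok);
}

/* First statement in a block whose first word is kind and, if name is not
 * NULL, whose second word is name. */
Node *find_node(Node *list, const char *kind, const char *name)
{
    for (Node *n = list; n; n = n->next)
        if (n->nwords > 0 && strcmp(n->words[0], kind) == 0 &&
            (!name || (n->nwords > 1 && strcmp(n->words[1], name) == 0)))
            return n;
    return NULL;
}
```

With this, a query for a branch is a chain of lookups: find_node(root, "weights", NULL)->child gives the weights block, and find_node on that with "weight" and "high" gives one entry. Checking node "types" (for example, that "version" only appears at the root) could be done as a separate pass over the tree against a table of allowed statements per parent; that part is not shown here.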

To manage the history of the data structure, you could implement something similar to OMeta worlds (see: http://www.vpri.org/pdf/rn2008001_worlds.pdf). It leverages a prototypical object model to manage scope and history.
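As a loose illustration of that idea (my own simplified reading, not the implementation from the linked paper), each "world" below keeps only its own changes plus a pointer to its parent. Lookups fall through to the parent, committing pushes the changes up one level, and a rollback is just discarding the child. The World type, the world_* functions, the fixed-size limits and the path-style keys are all assumptions made for this sketch, and deletions (which would need tombstone entries) are not handled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ENTRIES 32
#define MAX_TEXT 64

/* A world stores only the entries changed in it; the rest come from parent. */
typedef struct World {
    struct World *parent;
    size_t count;
    char keys[MAX_ENTRIES][MAX_TEXT];
    char vals[MAX_ENTRIES][MAX_TEXT];
} World;

/* Create an empty world on top of parent (NULL for the base world). */
World *world_sprout(World *parent)
{
    World *w = calloc(1, sizeof *w);
    assert(w);
    w->parent = parent;
    return w;
}

/* Set key in this world only. Returns 0 on success, -1 if the world is
 * full or the key/value does not fit. */
int world_set(World *w, const char *key, const char *val)
{
    size_t i;
    if (strlen(key) >= MAX_TEXT || strlen(val) >= MAX_TEXT) return -1;
    for (i = 0; i < w->count; i++)
        if (strcmp(w->keys[i], key) == 0) break;
    if (i == w->count) {
        if (w->count == MAX_ENTRIES) return -1;
        strcpy(w->keys[i], key);
        w->count++;
    }
    strcpy(w->vals[i], val);
    return 0;
}

/* Look key up in this world, then in its ancestors. NULL if not found. */
const char *world_get(const World *w, const char *key)
{
    for (; w; w = w->parent)
        for (size_t i = 0; i < w->count; i++)
            if (strcmp(w->keys[i], key) == 0) return w->vals[i];
    return NULL;
}

/* Push this world's changes into its parent and clear them here. */
int world_commit(World *w)
{
    if (!w->parent) return -1;
    for (size_t i = 0; i < w->count; i++)
        if (world_set(w->parent, w->keys[i], w->vals[i]) != 0) return -1;
    w->count = 0;
    return 0;
}
```

An editing session would sprout a world from the current state, apply the user's changes with world_set, and then either call world_commit to keep them or free the child to roll back. Keeping committed worlds around instead of merging them would give a history chain.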

终止放荡 2024-11-11 03:40:09

You could start with an existing JSON parser and modify it accordingly.
