XML element names
I need to redefine an XML document and schema for my company. The document in question is split into a number of sections that each contain information about a medication, for example:
<dosage>overview of dose info
<elderly>doses for elderly patients</elderly>
<children>doses for children</children>
</dosage>
<administration>info about administering the med...</administration>
I strongly believe that the element names should be changed to reflect what the element is, e.g. <section>, with an attribute describing the content: <section displayName='dosage'>. Not all of my colleagues agree.
Is my thinking correct, and can anyone provide guiding principles for element nomenclature that they have found useful in practice?
Comments (3)
Consider the case of elderly and children. The tag should define what it is; in this case they are both dosage instructions specific to a certain type of person. But using children and elderly doesn't communicate this information: there's no relationship there. If instead it were <instructions target="elderly">...</instructions>, that relationship is maintained. Both are instructions for different targets.
For the dosage and administration sections, both of those could be considered to be properties of the medication. What you do here depends on the structure of the whole document and how it will be parsed. It seems to me that dosage is very distinct from administration. If you were defining this as an object in an object-oriented language, you would have a dosage property and an administration property. Both of these are different properties, and there's no real parallel between them (well, other than that they're both properties of the medication). I don't think it would be useful to abstract that any more than it already is, but it's something that could be argued either way based on the structure of the entire document and how it's going to be used.
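To make the "object" view concrete, here is a minimal sketch in Python. The class and field names (Medication, Dosage, instructions) are my own assumptions for illustration, not anything from the original schema; the string values are the placeholders from the question.

```python
from dataclasses import dataclass, field


@dataclass
class Dosage:
    """Dose overview plus instructions keyed by target group (hypothetical shape)."""
    overview: str
    # Maps a target such as "elderly" or "children" to its instructions.
    instructions: dict[str, str] = field(default_factory=dict)


@dataclass
class Medication:
    """Two distinct properties: one structured, one plain text."""
    dosage: Dosage
    administration: str


med = Medication(
    dosage=Dosage(
        overview="overview of dose info",
        instructions={
            "elderly": "doses for elderly patients",
            "children": "doses for children",
        },
    ),
    administration="info about administering the med...",
)
```

Seen this way, dosage and administration have different types, which is one argument for giving them different element names.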
For example, if you are going to print out a list of key-value pairs for a bunch of different properties (for example, one key is administration and its value is the info), then the generic section approach is the way to go. But dosage has a distinct structure from administration, so I don't think that particular abstraction would be useful. If every medication has a fixed set of possible properties (dosage, administration info, etc.) that will all be treated differently, then in my opinion it would be logical to use distinct tags for all of them.
As far as general guiding principles go, I generally think "how would I define this document as an object?", then consider what the XML serialization of that object would be. This works for me because I'm far more used to working with objects, but your mileage may vary. And there are certainly cases where that's not the best approach: for example, if you're truly representing a document, like HTML, then that's not the way to go. But if you're using XML to define a regular data structure, it should generally work.
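As a rough illustration of "define the object, then serialize it", the sketch below builds the XML with Python's standard xml.etree.ElementTree. The function name medication_to_xml and the medication wrapper element are assumptions of mine; the instructions/target naming follows the suggestion earlier in this answer.

```python
import xml.etree.ElementTree as ET


def medication_to_xml(overview, instructions, administration):
    """Serialize medication fields to XML using distinct element names."""
    root = ET.Element("medication")  # hypothetical wrapper element
    dosage = ET.SubElement(root, "dosage")
    dosage.text = overview
    # One <instructions> element per target group, related by a shared tag.
    for target, text in instructions.items():
        item = ET.SubElement(dosage, "instructions", target=target)
        item.text = text
    admin = ET.SubElement(root, "administration")
    admin.text = administration
    return root


root = medication_to_xml(
    overview="overview of dose info",
    instructions={
        "elderly": "doses for elderly patients",
        "children": "doses for children",
    },
    administration="info about administering the med...",
)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Each property of the object becomes its own element, and the repeated, parallel items (the per-target instructions) share one tag with an attribute that distinguishes them.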
I have found that it is generally a bit clearer to have the XML defined as in the example you provided.
As an extreme example of your proposed nomenclature you could end up with this:
Of course, in the end it all depends on the specific application, but generally I would try to abstract entities and properties from the real world to XML as much as is needed, but not more.
So in this example the "section" element is an overabstraction.
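One way to see the cost of the generic name is to compare how a consumer finds the dosage in each style. This is only a sketch: the medication wrapper element and both sample documents are made up for illustration, using the placeholder text from the question.

```python
import xml.etree.ElementTree as ET

# Style 1: element names say what the content is.
SPECIFIC = """<medication>
  <dosage>overview of dose info</dosage>
  <administration>info about administering the med...</administration>
</medication>"""

# Style 2: one generic element, meaning pushed into an attribute.
GENERIC = """<medication>
  <section displayName="dosage">overview of dose info</section>
  <section displayName="administration">info about administering the med...</section>
</medication>"""

specific_root = ET.fromstring(SPECIFIC)
generic_root = ET.fromstring(GENERIC)

# Lookup by name versus lookup by attribute predicate.
specific_dosage = specific_root.find("dosage")
generic_dosage = generic_root.find("section[@displayName='dosage']")

# What a tool that only looks at element names can distinguish.
specific_tags = {child.tag for child in specific_root}
generic_tags = {child.tag for child in generic_root}
print(specific_tags, generic_tags)
```

Both lookups reach the same text, but in the generic style every child has the same tag, so anything that works on element names alone can no longer tell dosage from administration.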
I think that's going a bit far. I follow a simple rule: does it make semantic sense out of context? Section might make sense out of context, but you know you're losing semantic information that is relevant. So what do we need to know about it? That it contains dosage information. So perhaps dosageinfo would be better?
Following the same approach for elderly and children, we would assume these elements represent elderly people and children. Um... not really. If their names reflect what they do, they'd be something more like:
That said, this is certainly not a formal method - I've never actually seen a formal method proposed.
Whilst I'm here, and having significant experience with handling clinical data in various ways, I'd also suggest you try to get some of your free text into formalised XML data, even if you have to use Natural Language Parsing to glean some of it. Any formalised data, even AI-gleaned data so long as it's properly represented as such, can make querying the information much easier in future. It might not be relevant to your scenario, but I feel it's worth considering.
Data in free text is only useful as information. Data in relationships is data and information.
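A small sketch of what that difference looks like to a program. Everything here is hypothetical: the dose element, its target/amount/unit/frequency attributes, and the numbers are invented purely for illustration and are not real dosing data.

```python
import xml.etree.ElementTree as ET

# Free text: readable by a person, opaque to a query.
FREE_TEXT = "<dosage>Elderly: 5 mg daily. Children: 2 mg daily.</dosage>"

# Formalised: the same (made-up) facts as elements and attributes.
STRUCTURED = """<dosage>
  <dose target="elderly" amount="5" unit="mg" frequency="daily"/>
  <dose target="children" amount="2" unit="mg" frequency="daily"/>
</dosage>"""


def amount_for(root, target):
    """Return the dose amount for a target group, or None if not queryable."""
    dose = root.find(f"dose[@target='{target}']")
    return None if dose is None else float(dose.get("amount"))


structured_root = ET.fromstring(STRUCTURED)
free_text_root = ET.fromstring(FREE_TEXT)

elderly_amount = amount_for(structured_root, "elderly")
free_text_amount = amount_for(free_text_root, "elderly")
print(elderly_amount, free_text_amount)
```

The structured version answers the question directly, while the free-text version would need text parsing before any query could work.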