How do you feel about Microsoft Oslo MGraph?
MGraph is a great textual data format introduced with Microsoft "Oslo".
Do you think it has a chance of becoming as widespread as XML is today?
Example (Google Geocode):
{
  name = "waltrop, lehmstr 1d",
  Status {
    code = 200,
    request = "geocode"
  },
  Placemark [
    {
      id = "p1",
      address = "Lehmstraße, 45731 Waltrop, Deutschland",
      AddressDetails {
        Country {
          CountryNameCode = "DE",
          CountryName = "Deutschland",
          AdministrativeArea {
            AdministrativeAreaName = "Nordrhein-Westfalen",
            SubAdministrativeArea = {
              SubAdministrativeAreaName = "Recklinghausen",
              Locality {
                LocalityName = "Waltrop",
                Thoroughfare { ThoroughfareName = "Lehmstraße" },
                PostalCode = { PostalCodeNumber = "45731" }
              }
            }
          }
        },
        Accuracy = 6
      },
      ExtendedData {
        LatLonBox {
          north = 51.6244226,
          south = 51.6181274,
          east = 7.4046111,
          west = 7.3983159
        }
      },
      Point {
        coordinates [ 7.4013350, 51.6212620, 0 ]
      }
    }
  ]
}
More information here: Microsoft "Oslo" MGraph - the next XML?
I wonder why MGraph is always compared to XML instead of YAML, which looks much more similar. Is it ignorance or blindness that makes us reinvent the wheel so regularly?
P.S.: This is what the YAML can look like (without the custom data types and the references to node 'p1' that YAML provides in addition to JSON):
Here is part of James Clark's thoughts on M:
"I see several major things missing in M, whose absence might be acceptable for a database application of M, but which would be a significant barrier for other applications of M. Most fundamental is order. M has two types of compound value, collections and entities, and they are both unordered. In XML, unordered is the poor relation of ordered. Attributes are unordered, but attributes cannot have structured values. Elements have structure but there's no way in the instance to say that the order of child elements is not significant. The lack of support for unordered data is clearly a weakness of XML for many applications. On the other hand, order is equally crucial for other applications. Obviously, you can fake order in M by having index fields in entities and such like. But it's still faking it. A good modeling language needs to support both ordered and unordered data in a first class way. This issue is perhaps the most fundamental because it affects the data model.
Another area where M seems weak is identity. In the abstract data model, entities have identity independently of the values of their fields. But the type system forces me to talk about identity in an SQL-like way by creating artificial fields that duplicate the inherent identity of the entity. Worse, scopes for identity are extents, which are flat tables. Related to this is support for hierarchy. A graph is a more general data model than a tree, so I am happy to have graphs rather than trees. But when I am dealing with trees, I want to be able to say that the graph is a tree (which amounts to specifying constraints on the identity of nodes in the graph), and I want to be able to operate on it as a tree, in particular I want hierarchical paths.
One of the strengths of XML is that it handles both documents and data. This is important because the world doesn't neatly divide into documents and data. You have data that contains documents and documents that contain data. The key thing you need to model documents cleanly is mixed text. How are you going to support documents in M? The lack of support for order is a major problem here, because ordered is the norm for documents.
A related issue is how M and XML fit together. I believe there's a canonical way to represent an M value as an XML document. But if you have data that's in XML how do you express it in M? In many cases, you will want to translate your XML structure into an M structure that cleanly models your data. But you might not always want to take the time to do that, and if your XML has document-like content, it is going to get ugly. You might be better off representing chunks of XML as simple values in M (just as in the JSON world, you often get strings containing chunks of HTML). M should make this easy. You could solve this elegantly with RELAX NG (I know this isn't going to happen given Microsoft's commitment to XSD, but it's an interesting thought experiment): provide a function that allows you to constrain a simple value to match a RELAX NG pattern expressed in the compact syntax (with the compact syntax perhaps tweaked to harmonize with the rest of M's syntax) and use M's repertoire of simple types as a RELAX NG datatype library.
Finally, there's the issue of standardization. The achievement of XML in my mind isn't primarily a technical one. It's a social one: getting a huge range of communities to agree to use a common format. Standardization was the critical factor in getting that agreement. XML would not have gone anywhere as a single vendor format. It was striking that the talks about Oslo at the PDC made several mentions of open source, and how Microsoft was putting the spec under its Open Specification Promise so as to enable open source implementations, but no mentions of standardization. I can understand this: if I was Microsoft, I certainly wouldn't be keen to repeat the XSD or OOXML experience. But open source is not a substitute for standardization.
"
Read James Clark's blog article on the Oslo modeling language here.
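The "faking order" workaround that Clark mentions can be sketched in a few lines of Python. This is only an illustration, not M: the `Item` type and its `index` field are hypothetical, and a `frozenset` stands in for an unordered collection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    index: int  # artificial field that exists only to carry the order
    text: str


def to_unordered(paragraphs):
    """Store an ordered list in an unordered collection by adding an index field."""
    return frozenset(Item(i, text) for i, text in enumerate(paragraphs))


def to_ordered(items):
    """Recover the original order by sorting on the artificial index field."""
    return [item.text for item in sorted(items, key=lambda item: item.index)]


paragraphs = ["Intro", "Body", "Conclusion"]
stored = to_unordered(paragraphs)   # the set itself carries no ordering
restored = to_ordered(stored)       # order comes back only via the index field
```

The order survives, but only because every consumer knows to sort on `index`; the collection itself has no notion of order, which seems to be the point of the complaint.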
I can't help it, but I kinda feel Oslo is a solution looking for a really excellent concrete problem to solve. I truly hope they find it.
I also got the feeling that they needed something fun to pad out this year's PDC with.
In response to James Clark's thoughts on M:
I also see some things missing from M and Oslo, but not quite the same things.
It would be nice to have some guarantee that M would preserve the order of entities within collections. However, how you want to order elements is an implementation detail. If you have an ordered collection in M and you persist that to a database, how do you maintain their order there? The only way would be to make some assumptions about the shape of the data, to add some column to a table that you didn't specify, and in that case it makes more sense to be in full control of your data structure's shape.
The same goes for identity. The reason we have object identity in memory is that each object occupies a different place in memory, and has that memory address to uniquely identify it. When saved to a database, however, this information is no longer relevant, and you need some column or combination of columns to uniquely identify that record, to serve as its primary key. If you don't specify it, then M has to invent a column for you and you won't have a reference to it, except perhaps through some kind of trick that may be difficult to discover. In other words, there is no "inherent identity"; there's always some data that explicitly identifies it.
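To make the identity argument concrete, here is a rough Python sketch. The `Person` class and the `table` dict are hypothetical stand-ins for an entity type and a database table; nothing here is M or the Oslo repository.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    city: str


a = Person("Anna", "Waltrop")
b = Person("Anna", "Waltrop")

same_values = a == b   # dataclass equality compares field values
same_object = a is b   # identity check: two separate objects in memory

# Once persisted, memory identity is gone, so each record gets an explicit key.
table = {}
for key, person in enumerate([a, b], start=1):
    table[key] = {"id": key, "name": person.name, "city": person.city}
```

In memory the two objects are equal but distinct; in the table they are distinguishable only because an `id` column was added for them.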
Documents and data aren't two different things. XML doesn't handle documents per se; it just represents hierarchical data, and documents are composed from this. As long as the data is structured, it can be represented in M, in the same way that you can write classes for the various parts of the hierarchy and reference one type from another to compose them into arbitrarily-complex trees. Admittedly, this is easier to throw together in XML because it's free-form text and there's no real validation unless you write an XSD schema, but in those cases, you're doing the same kind of work as defining types and relations in code classes.
So ultimately, M handles documents that you define the structure for, and that structure doesn't really have any limitations. The question is how easy it is to do so. The idea you have for a tool to pull apart an XML document and generate M schema is a pretty good one. I imagine it wouldn't be too difficult to write one, or for Microsoft to include one in their tool chain once it matures a bit more. As far as the structure "getting ugly" goes, if your data structure is really that complex, it is what it is. Schematizing it has great advantages, whether in XSD, M, or C# classes, and if your goal is to store it in a SQL Server database (or the Oslo Repository specifically), then it's necessary and worthwhile.
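As a rough idea of what the first half of such a tool might look like, here is a Python sketch that walks an XML document and records its shape. It does not emit M; the `infer_shape` helper, its output format, and the sample document are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def infer_shape(element):
    """Describe an element: its attribute names, child shapes, and whether it has text."""
    shape = {
        "attributes": sorted(element.attrib),
        "children": {},
        "has_text": bool(element.text and element.text.strip()),
    }
    for child in element:
        # Children with the same tag are assumed to share one shape; the first one wins.
        shape["children"].setdefault(child.tag, infer_shape(child))
    return shape


SAMPLE = """
<Placemark id="p1">
  <address>Lehmstraße, 45731 Waltrop, Deutschland</address>
  <Point><coordinates>7.4013350,51.6212620,0</coordinates></Point>
</Placemark>
"""

shape = infer_shape(ET.fromstring(SAMPLE.strip()))
```

A real generator would still have to decide on types, cardinality, and what to do with mixed text, which is where the "getting ugly" part would show up.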
I'm pretty confident that M and the supporting tool chain will evolve into something pretty amazing and useful. There's obviously a lot missing right now. Personally, I'm more concerned with the fact that M is currently targeted at modeling at the relational, physical database level instead of the conceptual level (like Entity Framework), where it feels most natural for a developer to begin modeling. After all, when writing classes to instantiate objects from MGraphs (the purpose and output for a DSL), your classes may be defined quite differently from how they are persisted. Especially if you use inheritance in your models.
I agree with you on standardization. That would be nice. However, I think it's less important due to the fact that the goal is to store this data in the Oslo repository. Especially once SQL Data Services is mature enough to host the repository, we're going to have all different protocols and formats for querying and manipulating this data. Clients will be able to query and update via ADO.NET Data Services, formatting messages with JSON, POX, SOAP, MGraph, and so on. All MGraph data needs is an MGraph connector to get it in the database, from which it can be accessed in any way imaginable.
You can find more information about Oslo in my article here:
http://dvanderboom.wordpress.com/2009/01/17/why-oslo-is-important/
...and what does it do that JSON doesn't do?
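For comparison, a trimmed version of the geocode example from the question maps directly onto JSON. The sketch below uses Python's standard `json` module; the `record` dict is a hand-made subset of the example, and it says nothing about M's type system, only about the instance data.

```python
import json

# Hand-written subset of the geocode example, expressed as plain Python data.
record = {
    "name": "waltrop, lehmstr 1d",
    "Status": {"code": 200, "request": "geocode"},
    "Placemark": [
        {
            "id": "p1",
            "address": "Lehmstraße, 45731 Waltrop, Deutschland",
            "Point": {"coordinates": [7.4013350, 51.6212620, 0]},
        }
    ],
}

# Serialize to JSON text and parse it back to check nothing is lost.
text = json.dumps(record, ensure_ascii=False, indent=2)
round_tripped = json.loads(text)
```

At least for this sample, the nested records and the list survive the round trip unchanged, so any advantage of MGraph would have to come from something other than the basic shape of the data.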