Which structured text format is best supported in Python?
This question may be seen as subjective, but I'd like to ask SO users which common structured textual data format is best supported in Python.
My initial choices are:
- XML
- JSON
- and YAML
Which of these three is easiest to work with in Python (i.e., has the best library support and performance)? Or is there another format that I haven't mentioned that is better supported in Python?
I cannot use a Python-only format (e.g. pickling) since interop is quite important, but the majority of the code that handles these files will be written in Python, so I'm keen to use a format that has the strongest support in Python.
CSV or fixed-column text may also be viable for most use cases; however, I'd prefer the flexibility of a more scalable format.
Thank you
Note
Regarding interop: I will initially be generating these files from Ruby, using Builder; however, Ruby will not be consuming these files again.
Comments (4)
I would go with JSON. YAML is awesome, but interop with it is not that great.
XML is just an ugly mess to look at and has too much fat.
Python has had a built-in json module since version 2.6.
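As a rough illustration of the built-in module mentioned above, here is a minimal round-trip sketch. The record and its field names are made up for the example; only `json.dumps` and `json.loads` from the standard library are used.

```python
import json

# Hypothetical record; the keys and values are invented for illustration.
record = {
    "name": "example",
    "tags": ["a", "b"],
    "count": 3,
    "ratio": 0.5,
    "active": True,
    "parent": None,
}

# Serialize to text. indent and sort_keys only affect readability.
text = json.dumps(record, indent=2, sort_keys=True)

# Parse it back. Strings, numbers, booleans, None, lists and dicts
# with string keys survive the round trip unchanged.
restored = json.loads(text)

print(text)
print(restored == record)
```

Types outside that basic set (dates, sets, custom classes) need explicit conversion on the way in and out.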
JSON has great Python support and it is much more compact than XML (and the API is generally more convenient if you're just trying to dump and load objects). There's no out-of-the-box support for YAML that I know of, although I haven't really checked. In the abstract I would suggest using JSON due to the low overhead of the format and the wide range of language support, but it does depend a bit on your application: if you're working in a space that already has established applications, the formats they use might be preferable, even if they're technically deficient.
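To make the compactness and convenience point concrete, here is a small sketch that writes the same made-up record as JSON and as XML using only the standard library. The element names (`record`, `tags`, `tag`) are an assumed layout, not any standard mapping; a real XML schema could look quite different.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record; field names are invented for illustration.
record = {"name": "example", "count": 3, "tags": ["a", "b"]}

# JSON: one call, and the integer stays an integer.
json_text = json.dumps(record, separators=(",", ":"))

# XML: the tree is built by hand and every value becomes text.
root = ET.Element("record")
ET.SubElement(root, "name").text = record["name"]
ET.SubElement(root, "count").text = str(record["count"])
tags = ET.SubElement(root, "tags")
for tag in record["tags"]:
    ET.SubElement(tags, "tag").text = tag
xml_text = ET.tostring(root, encoding="unicode")

print(len(json_text), json_text)
print(len(xml_text), xml_text)
```

For this layout the XML output is longer, and reading it back means converting `count` from text to an integer by hand, whereas `json.loads` returns the original types directly.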
I think it depends a lot on what you need to do with the data. If you're going to be building a complex database and doing processing and transformations on it, I suspect you'd be better off with XML. I've found the lxml module pretty useful in this regard. It has full support for standards like XPath and XSLT, and this support is implemented in native code, so you'll get good performance. But if you're doing something simpler, then you'd likely be better off using a simpler format like YAML or JSON. I've heard tell of "JSON transforms" but don't know how mature the technology is or how developed Python's access to it is.
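As a sketch of the kind of query this refers to, the snippet below selects elements by attribute with a path expression. To stay dependency-free it uses the standard library's `xml.etree.ElementTree`, which supports only a limited subset of XPath; lxml's `etree` module offers a largely compatible API and adds full XPath through an `xpath()` method. The catalog document is made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented for illustration.
doc = """
<catalog>
  <item kind="book" id="1"><title>First</title></item>
  <item kind="disc" id="2"><title>Second</title></item>
  <item kind="book" id="3"><title>Third</title></item>
</catalog>
""".strip()

root = ET.fromstring(doc)

# Select every <item> whose kind attribute is "book", at any depth,
# then read the text of its <title> child.
book_titles = [
    item.findtext("title")
    for item in root.findall(".//item[@kind='book']")
]

print(book_titles)  # ['First', 'Third']
```

Queries like this are where XML tooling tends to pay off; with JSON or YAML the equivalent is usually a plain loop or comprehension over the loaded dicts and lists.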
It's pretty much all the same, out of those three. Use whichever is easier to inter-operate with.