XML or CSV for "Tabular Data"
I have "tabular data" to send from a server to a client, and I am trying to decide whether to use a CSV-style format or XML.
The data I send can be megabytes in size. The server will stream it, and the client will read it line by line and start parsing the output as it arrives (the client can't wait for all of the data to come in).
My current thinking is that CSV would be good: it will reduce the data size and can be parsed faster.
XML is a standard, but I am concerned about parsing the data as it reaches the system (live parsing) and about the data size.
What would be the best solution?
Thanks for all valuable suggestions.
Comments (4)
If it is "tabular data" and the table is relatively fixed and regular, I would go for a CSV format, especially if there is one server and one client.
XML has some advantages if you have multiple clients and want to validate the file format before using the data. On the other hand, XML has cornered the market on "code bloat", so the amount transferred will be much larger.
I would use CSV, with a header that indicates the ID of each field.
As long as you do not forget the header, you should be fine if the data format ever changes. Of course, the client needs to be updated before the server starts sending new streams.
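A minimal sketch of the header idea, assuming Python on the client and a text stream that yields one CSV line at a time. The column names (`id`, `price`, `region`) and the helper `iter_rows` are made up for illustration. Because the client looks fields up by header name rather than by position, a reordered or extended column set does not break it:

```python
import csv
import io


def iter_rows(stream):
    """Yield each data row as a dict keyed by the header names.

    `stream` can be any text-mode file-like object; rows are produced
    one at a time, so the client never waits for the whole payload.
    """
    reader = csv.DictReader(stream)  # the first line is consumed as the header
    for row in reader:
        yield row


# Simulated streams (hypothetical data). The "new" server version added a
# "region" column and reordered the fields.
old_stream = io.StringIO("id,price\n1,9.50\n2,3.25\n")
new_stream = io.StringIO("region,price,id\nEU,9.50,1\nUS,3.25,2\n")

old_prices = [(r["id"], r["price"]) for r in iter_rows(old_stream)]
new_prices = [(r["id"], r["price"]) for r in iter_rows(new_stream)]

print(old_prices == new_prices)
```

Removing or renaming a column the client depends on would still fail (with a `KeyError` here), which is the case where the client has to be updated first.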
If not all clients can be updated easily, then you need a more lenient messaging system.
Google Protocol Buffers was designed for this kind of backward/forward compatibility issue, and combines this with excellent (fast and compact) binary encoding to reduce message sizes.
If you go with this, then the idea is simple: each message represents a line. If you want to stream them, you need a simple "message size | message blob" structure.
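The "message size | message blob" framing could look roughly like the sketch below. It is not tied to Protocol Buffers: the payloads here are placeholder byte strings standing in for serialized messages, and the 4-byte big-endian length prefix is an assumed choice, as are the helper names `write_message`, `read_exact`, and `read_messages`.

```python
import io
import struct

# Assumed framing: a 4-byte unsigned big-endian length, then the payload.
HEADER = struct.Struct(">I")


def write_message(stream, payload: bytes) -> None:
    """Write one length-prefixed message to a binary stream."""
    stream.write(HEADER.pack(len(payload)))
    stream.write(payload)


def read_exact(stream, n: int) -> bytes:
    """Read up to n bytes, looping because a stream may return short reads."""
    chunks = []
    remaining = n
    while remaining:
        chunk = stream.read(remaining)
        if not chunk:  # end of stream
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_messages(stream):
    """Yield payloads one by one as they become available."""
    while True:
        header = read_exact(stream, HEADER.size)
        if not header:
            return  # clean end of stream between messages
        if len(header) < HEADER.size:
            raise EOFError("stream ended inside a length prefix")
        (size,) = HEADER.unpack(header)
        payload = read_exact(stream, size)
        if len(payload) < size:
            raise EOFError("stream ended inside a message")
        yield payload


# Round trip through an in-memory buffer standing in for the connection.
buf = io.BytesIO()
for row in (b"1,9.50", b"2,3.25", b""):
    write_message(buf, row)
buf.seek(0)
received = list(read_messages(buf))
print(received)
```

With a real serializer, each payload would be the encoded bytes of one row message, and the client would decode it as soon as `read_messages` yields it.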
Personally, I have always considered XML bloated by design. If you ever go with a human-readable format, then at least choose JSON; you'll cut the tag overhead by about half, since there are no closing tags repeating each field name.
I would suggest you go for XML.
There are plenty of libraries available for parsing it.
Moreover, if the data format changes later, the parsing logic for XML won't change; only the business logic may need to change.
With CSV, however, the parsing logic itself might need to change.
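On the live-parsing concern from the question: XML does not have to be loaded whole before it can be read. As a rough sketch, Python's standard `xml.etree.ElementTree.XMLPullParser` can be fed chunks as they arrive and hands back completed elements. The element names (`rows`, `row`, `id`, `price`) and the helper `rows_from_chunks` are assumptions for illustration, not part of any format discussed above.

```python
import xml.etree.ElementTree as ET


def rows_from_chunks(chunks):
    """Yield one dict per completed <row> element while data is still arriving."""
    parser = ET.XMLPullParser(events=("end",))
    for chunk in chunks:
        parser.feed(chunk)  # chunks may split tags or text at any point
        for _event, elem in parser.read_events():
            if elem.tag == "row":
                yield {child.tag: child.text for child in elem}
                elem.clear()  # drop the row's children to limit memory use


# Simulated network chunks (hypothetical data), deliberately split mid-tag.
chunks = [
    "<rows><row><id>1</id><pri",
    "ce>9.50</price></row><row><id>2</id>",
    "<price>3.25</price></row></rows>",
]
parsed = list(rows_from_chunks(chunks))
print(parsed)
```

A new child element added to `<row>` later would simply show up as an extra key, which matches the point that the parsing logic stays the same. The cleared `<row>` elements still hang off the root here, so a long-running stream would need to detach them as well.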
CSV format will be smaller, since you only have to declare the headers on the first row; the rows of data below need only commas between the values, so very few extra characters are added to the stream size.
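One way to check the size argument against your own data is to serialize the same rows in each format and compare byte counts. The sketch below uses made-up rows and field names, and picks one possible layout per format (header-first CSV, one compact JSON object per line, and one `<row>` element per line); real payloads and layouts will shift the ratios.

```python
import csv
import io
import json
from xml.sax.saxutils import escape

# Hypothetical table: 1000 rows with three fields.
FIELDS = ["id", "name", "price"]
ROWS = [[i, f"item{i}", f"{i * 1.5:.2f}"] for i in range(1000)]


def as_csv(rows):
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(FIELDS)  # field names appear once, in the header
    writer.writerows(rows)
    return out.getvalue()


def as_json_lines(rows):
    # One compact JSON object per line; field names repeat on every row.
    return "".join(
        json.dumps(dict(zip(FIELDS, r)), separators=(",", ":")) + "\n"
        for r in rows
    )


def as_xml(rows):
    # Field names repeat twice per value (opening and closing tag).
    parts = ["<rows>\n"]
    for r in rows:
        cells = "".join(f"<{k}>{escape(str(v))}</{k}>" for k, v in zip(FIELDS, r))
        parts.append(f"<row>{cells}</row>\n")
    parts.append("</rows>\n")
    return "".join(parts)


sizes = {
    name: len(encode(ROWS).encode("utf-8"))
    for name, encode in [("csv", as_csv), ("json", as_json_lines), ("xml", as_xml)]
}
print(sizes)
```

If the stream is compressed in transit, the gap between the formats usually narrows, so it is worth measuring the compressed sizes too before deciding on size alone.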