Is it "bad practice" to be sensitive to linebreaks in XML documents?
I'm generating some XML documents and when it comes to the address part I have fragments that look like this:
<Address>15 Sample St
Example Bay
Some Country</Address>
The XSLT that I have for converting this to XHTML has some funky recursive template to convert newline characters within strings to <br/> tags.
This is all working fine; but is it considered "bad practice" to rely on linebreaks within XML documents? If so, is it recommended that I do this instead?
<Address><Line>15 Sample St</Line>
<Line>Example Bay</Line>
<Line>Some Country</Line></Address>
It seems like it'd be really awkward to wrap every place where my text may span multiple lines in tags like that.
I don't see what's wrong with <Line> tags. Apparently, the visualization of the data is important to you, important enough to keep it in your data (via line breaks in your first example). Fine. Then really keep it; don't rely on "magic" to keep it for you. Keep every bit of data that you'll need later on and can't deduce perfectly from the saved portion of the data, even if it's visualization data (line breaks and other formatting). Your user (the end user of another developer) took the time to format that data to his liking: either tell him (in the API doc or in text near the input) that you don't intend to keep it, or just keep it.
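As a rough illustration of keeping the line structure explicit at generation time, here is a minimal Python sketch that turns a multi-line string into an <Address> element with one <Line> child per line. The helper name `address_element` is hypothetical, and the sketch assumes that blank lines carry no meaning and can be dropped.

```python
import xml.etree.ElementTree as ET


def address_element(text: str) -> ET.Element:
    """Build <Address> with one <Line> child per non-empty input line.

    Hypothetical helper: the element names come from the question,
    everything else is an assumption made for illustration.
    """
    address = ET.Element("Address")
    for line in text.splitlines():
        line = line.strip()
        if line:  # assumption: blank lines are not significant
            ET.SubElement(address, "Line").text = line
    return address


raw = "15 Sample St\nExample Bay\nSome Country"
xml_text = ET.tostring(address_element(raw), encoding="unicode")
print(xml_text)
```

With the wrapping done in one helper like this, the generator does not need special handling at every place where text may span several lines.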
Yes, I think using a CDATA block would protect the whitespace. Although some parser APIs allow you to preserve whitespace.
What you really should be doing is converting your XML to a format that preserves white-space.
So rather than seeking to replace \n with <br />, you should wrap the whole block in a <pre>.
That way, your address is functionally preserved (whether you include line breaks or not) and the XSLT can choose whether to preserve white-space in the result.
I recommend you should either add the <br/> line breaks or maybe use a line-break entity.
If you need your linebreaks preserved, use a CDATA block, as tweakt said.
Otherwise beware. Most of the time, the linebreaks will be preserved by XML software, but sometimes they won't, and you really don't want to be relying on things which only work by coincidence.
What about using attributes to store the data, rather than text nodes:
I know the use of attributes vs. text nodes is an often debated subject, but I've stuck with attributes 95% of the time, and haven't had any troubles because of it.
A few people have said that CDATA blocks will allow you to retain line breaks. This is wrong. CDATA sections only cause markup to be processed as character data; they do not change line break processing.
is exactly the same as
The only difference is how different APIs report this.
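One way to check this claim is to parse the same text with and without a CDATA section and compare what the parser reports. The sketch below uses Python's standard `xml.etree.ElementTree` (backed by expat); the sample address strings are made up for illustration, and other parsers or APIs may report the content differently.

```python
import xml.etree.ElementTree as ET

# The same content, once as plain character data and once inside CDATA.
# A Windows-style line ending (\r\n) is used on purpose.
plain = ET.fromstring("<Address>15 Sample St\r\nExample Bay</Address>")
cdata = ET.fromstring("<Address><![CDATA[15 Sample St\r\nExample Bay]]></Address>")

print(repr(plain.text))
print(repr(cdata.text))

# Line-end handling is identical in both cases.
same_text = plain.text == cdata.text
print(same_text)

# What CDATA does change: markup inside it is treated as character data.
literal = ET.fromstring("<Address><![CDATA[a<br/>b]]></Address>")
print(repr(literal.text), len(literal))
```

In both parses the line ending is normalized in the same way, so wrapping the text in CDATA adds nothing as far as line breaks are concerned.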
I think the only real problem is that it makes the XML harder to read, e.g.
If pretty XML isn't a concern, I'd probably not worry about it, so long as it's working. If pretty XML is a concern, I'd convert the explicit newlines into <br /> tags or \n before embedding them in the XML.
It depends on how you're reading and writing the XML.
If the XML is being generated automatically (if newlines or explicit \n flags are being parsed into <br />), then there's nothing to worry about. Your input likely doesn't have any other XML in it, so it's just cleaner not to mess with XML at all.
If tags are being worked with manually, it's still cleaner to just have a line break, if you ask me.
The exception is if you're using DOM to get some structure out of the XML. In that case line breaks are obviously evil because they don't represent the hierarchy properly. It sounds like the hierarchy is irrelevant for your application, though, so line breaks sound sufficient.
If the XML just looks bad (especially when automatically generated), Tidy can help, although it works better with HTML than with XML.
This is probably a slightly deceptive example, since the address is a bit non-normalized in this case. It is a reasonable trade-off, however, since address fields are difficult to normalize.
If you make the line breaks carry important information, you're un-normalizing and making the post office interpret the meaning of the line break.
I would say that normally this is not a big problem, but in this case I think the Line tag is most correct, since it explicitly shows that you don't actually interpret what the lines may mean in different cultures. (Remember that most forms for entering an address have a zip code etc., and address lines 1 and 2.)
The awkwardness of having the Line tag comes with normal XML, and has been much debated at Coding Horror. http://www.codinghorror.com/blog/archives/001139.html
The XML spec has something to say regarding whitespace, and linefeeds and carriage returns in particular. So if you limit yourself to true linefeeds (x0A) you should be OK. However, many editing tools will reformat XML for "better presentation" and possibly get rid of the special syntax. A more robust and cleaner approach than the "<Line></Line>" idea would be to simply use namespaces and embed XHTML content, e.g.:
No need to reinvent the wheel when it comes to standard vocabularies.
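For illustration only, here is a minimal Python sketch of what generating such a document could look like: the address lines are separated by `br` elements placed in the XHTML namespace. The helper name `address_with_breaks` and the `xhtml` prefix are assumptions, not something taken from the answer.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Assumed prefix; any prefix bound to the XHTML namespace would do.
ET.register_namespace("xhtml", XHTML_NS)


def address_with_breaks(lines):
    """Build <Address> whose lines are separated by XHTML <br/> elements.

    Hypothetical helper written for this sketch.
    """
    address = ET.Element("Address")
    if not lines:
        return address
    address.text = lines[0]
    for line in lines[1:]:
        br = ET.SubElement(address, f"{{{XHTML_NS}}}br")
        br.tail = line  # text that follows the break
    return address


doc = address_with_breaks(["15 Sample St", "Example Bay", "Some Country"])
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```

Because the breaks are real elements in a well-known namespace, an XSLT stylesheet targeting XHTML could copy them through rather than splitting strings recursively.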
It's generally considered bad practice to rely on linebreaks, since it's a fragile way to differentiate data. While most XML processors will preserve any whitespace you put in your XML, it's not guaranteed.
The real problem is that most applications that output your XML into a readable format consider all whitespace in an XML document interchangeable, and might collapse those linebreaks into a single space. That's why your XSLT has to jump through such hoops to render the data properly. Using a "br" tag would vastly simplify the transform.
Another potential problem is that if you open up your XML document in an XML editor and pretty-print it, you're likely to lose those line breaks.
If you do keep using linebreaks, make sure to add an xml:space="preserve" attribute to "address". (You can do this in your DTD, if you're using one.)