Why do I need Markdown?
Why do I need Markdown with a front-end editor like WMD? What does Markdown do to the content that's sent from the WMD editor?
How does Markdown store the content in the backend? Is it stored as typed, like *bold*, or in some other format? Why can't I just HTML-encode it?
Sorry if I sound very naïve.
It's probably helpful to take a step back and ask some of the larger questions. The issue Markdown is trying to solve is that of rich editing in the browser. Consider this: at some point, for any piece of software to enable rich text, it has to describe the richness in some manner.
We could call that description of richness "markup" (by description of richness I mean things like "this bit of text is bold" or "this bit of text is a hyperlink"): it marks up the text with meta "richness".
Implementations of rich text can take on two approaches, either a.) hide the markup from the user or b.) let them have access to the markup.
For those who choose to hide it, the end result is very often WYSIWYG. The user is oblivious to what is happening behind the scenes. The editor takes care of the details. Think MS Word as an example. No one manipulates the Word markup format as a regular end user.
For implementations which choose to expose the markup, a markup language is needed to allow users to interact with it. Such markup languages would be things like HTML, with <tag>, or BB code, with [tag]. Markdown is one such language.
As opposed to the former types I mentioned, Markdown has tried to design itself so that the markup reuses common ASCII conventions people already use. For example, it's common for people to put asterisks around text to set it off, as in *important*, and this notation in Markdown is an indicator of italics.
In regards to storage, as Stephan pointed out, the system will most likely store the raw Markdown, because the user will most likely need to be able to edit it later, and the original Markdown can be recalled for that purpose.
In most of the systems I've built, I store the Markdown and then denormalize it into a second field that caches the HTML rendering. This way I don't have to do Markdown-to-HTML rendering every time a field is displayed. It takes a little more space, but I'd rather the user have a faster response than use less DB storage space.
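The store-and-cache approach described above can be sketched as follows. This is only an illustration under stated assumptions: the `Post` class, its field names, and the toy `render_markdown` function (which handles nothing but `*emphasis*`) are hypothetical stand-ins, not the author's actual schema or a real Markdown library.

```python
import html
import re
from dataclasses import dataclass, field


def render_markdown(source: str) -> str:
    """Toy renderer (assumption): escape HTML, then turn *text* into <em>text</em>.

    A real system would call a full Markdown library here instead.
    """
    escaped = html.escape(source)
    return re.sub(r"\*([^*\n]+)\*", r"<em>\1</em>", escaped)


@dataclass
class Post:
    """Keeps the raw Markdown for editing plus a cached HTML rendering."""

    body_markdown: str = ""
    body_html: str = field(default="", init=False)

    def __post_init__(self) -> None:
        # Render once when the post is created.
        self.body_html = render_markdown(self.body_markdown)

    def update(self, new_markdown: str) -> None:
        # Re-render at write time, so page views only ever read body_html.
        self.body_markdown = new_markdown
        self.body_html = render_markdown(new_markdown)


post = Post("This is *important*.")
print(post.body_html)      # cached HTML, served to readers
post.update("Now it is *urgent*.")
print(post.body_markdown)  # raw Markdown, handed back to the editor
print(post.body_html)
```

The trade-off is the one named above: two fields per post instead of one, in exchange for rendering on writes rather than on every read.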
Care should also be taken when accepting Markdown from the browser, as it can easily contain <script> tags which need to be filtered out. Most Markdown implementations will also recognize HTML intermingled with Markdown formatting, so to be safe, you need to make sure your inputs and caches are sanitized properly.
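One conservative way to deal with embedded tags is to escape all raw HTML before applying any Markdown rules. The sketch below assumes that policy and uses a hypothetical `render_untrusted` helper with a single toy emphasis rule. It is stricter than most Markdown implementations, which pass inline HTML through, so treat it as an illustration rather than a complete sanitizer.

```python
import html
import re


def render_untrusted(source: str) -> str:
    """Escape every raw HTML character first, then apply a toy *emphasis* rule."""
    # After escaping, "<script>" becomes inert text: "&lt;script&gt;".
    escaped = html.escape(source, quote=True)
    # Only markup generated here (the <em> tags) reaches the output as real HTML.
    return re.sub(r"\*([^*\n]+)\*", r"<em>\1</em>", escaped)


print(render_untrusted("*hi* <script>alert(1)</script>"))
# <em>hi</em> &lt;script&gt;alert(1)&lt;/script&gt;
```

Because the escaping happens before rendering, the same function can safely fill a cached HTML field as well.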
The reason for using an encoding system other than HTML is security
Markdown and other such wiki style encoding systems do not usually support scripting languages
HTML supports scripting languages in many ways (
The two main security issues are:
Malware criminals use scripts in user-generated content to attempt malware actions on the content reader's computer by scripting access to known security holes
Freeloaders use scripts to subvert the rest of the site by changing the content frame or styles, i.e. ads, menus, logos etc. This can be criminal behaviour, if not just annoying
By using an intermediate language such as Markdown you have total control over the rendered output
Filtering HTML is possible, but is also complex and risky
The other significant reason for an alternate encoding system is enforcement of style. Normal HTML has too many options. By limiting the available options, users can only use certain styles. This usually makes for cleaner-looking and more readable content (compare SO to eBay).
The main reason for using Markdown is the readability of the marked-up text. For instance, you can send it in a plain-text email and the reader will still understand the emphasis and bullets, the text will be divided into paragraphs, et cetera.
When you ask about storing data, it depends. If you enable Markdown in the WordPress blog engine, it stores data as the user has input it, in Markdown. In Stack Overflow, however, it seems like the data is stored as HTML. At least, the "Stack Overflow data dumps" contain HTML, not Markdown (I've seen people complaining that they have to convert it back).
If you use the WMD editor, you can show the user what the output will look like after being converted to HTML. Even though Markdown syntax is really simple, it is not hard to make mistakes. Hence, it is best to show users the output.
Another reason for using Markdown instead of a WYSIWYG control: a WYSIWYG control allows the user to use HTML in data you are displaying on your web page. So, you have to be the one who decides when there is simply incorrect HTML and when it is an evil XSS/CSRF/whatever injection. In Markdown, you simply convert *something* to <em>something</em>, remove any unknown HTML elements and you're done.
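The "remove any unknown HTML elements" step could look roughly like the allowlist filter below, built on Python's standard `html.parser`. The `ALLOWED_TAGS` set, the `AllowlistSanitizer` class, and the `sanitize` helper are hypothetical names chosen for this sketch, and the policy (drop every attribute, drop script and style bodies entirely) is an assumption. A production site would more likely rely on a maintained sanitizing library.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist: the only elements allowed to survive.
ALLOWED_TAGS = {"em", "strong", "p", "ul", "ol", "li", "code", "pre", "blockquote"}
# Elements whose text content is discarded along with the tags.
DROP_WITH_CONTENT = {"script", "style"}


class AllowlistSanitizer(HTMLParser):
    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0  # > 0 while inside a script/style element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self._skip_depth += 1
        elif tag in ALLOWED_TAGS and self._skip_depth == 0:
            # Re-emit the bare tag; all attributes (onclick, style, ...) are dropped.
            self.parts.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag in ALLOWED_TAGS and self._skip_depth == 0:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self._skip_depth == 0:
            # Text is kept, but re-escaped so it cannot form new tags.
            self.parts.append(escape(data))


def sanitize(rendered_html: str) -> str:
    parser = AllowlistSanitizer()
    parser.feed(rendered_html)
    parser.close()
    return "".join(parser.parts)


print(sanitize('<em>hi</em><script>alert(1)</script><b onclick="x()">yo</b>'))
# <em>hi</em>yo
```

Unknown elements such as the `<b onclick=...>` above lose their tags but keep their text, while script bodies disappear completely.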